Questions tagged [optimization]

Use this tag for any use of optimization within statistics.

In statistics, optimization is often used to select an estimator of a parameter by maximizing or minimizing some function of the data. One very common example is choosing the estimator that maximizes the joint density (or probability mass function) of the observed data; this is known as Maximum Likelihood Estimation (MLE).

Optimization is a large field and has many uses outside of estimation. In design of experiments we can choose a design by maximizing a certain determinant (D-optimality). In industrial statistics we can use a fitted response-surface model to optimize the economy or yield of a process. Wikipedia has an article on the subject: https://en.wikipedia.org/wiki/Mathematical_optimization
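As a small illustration of MLE as an optimization problem, here is a minimal Python sketch (the data are simulated purely for illustration) that fits a normal distribution by numerically minimizing the negative log-likelihood with SciPy:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    data = rng.normal(loc=5.0, scale=2.0, size=1000)  # simulated sample

    def neg_log_likelihood(params, x):
        mu, log_sigma = params        # optimize log(sigma) so sigma stays positive
        sigma = np.exp(log_sigma)
        return -np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                       - (x - mu)**2 / (2 * sigma**2))

    result = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(data,))
    print(result.x[0], np.exp(result.x[1]))  # estimates near 5.0 and 2.0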

2775 questions
29
votes
2 answers

Can gradient descent be applied to non-convex functions?

I'm just learning about optimization, and having trouble understanding the difference between convex and non-convex optimization. From my understanding, a convex function is one where "the line segment between any two points on the graph of the…
Karnivaurus
  • 7,019
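Not from the thread, but a minimal sketch of the phenomenon the question is about: plain gradient descent runs fine on a non-convex function, yet which stationary point it reaches depends on where it starts.

    import numpy as np

    def grad(x):                 # gradient of f(x) = x**4 - 3*x**2 + x
        return 4 * x**3 - 6 * x + 1

    def gradient_descent(x0, lr=0.01, steps=2000):
        x = x0
        for _ in range(steps):
            x -= lr * grad(x)
        return x

    print(gradient_descent(2.0))   # one local minimum, near x ≈ 1.13
    print(gradient_descent(-2.0))  # a different local minimum, near x ≈ -1.30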
25
votes
4 answers

Why are second-order derivatives useful in convex optimization?

I guess this is a basic question and it has to do with the direction of the gradient itself, but I'm looking for examples where 2nd order methods (e.g. BFGS) are more effective than simple gradient descent.
Bar
  • 2,862
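One classic illustration of when curvature information helps: on an objective with a curved, ill-conditioned valley (the Rosenbrock function), a quasi-Newton method such as BFGS converges in a few dozen iterations while fixed-step gradient descent crawls. A minimal sketch:

    import numpy as np
    from scipy.optimize import minimize, rosen, rosen_der

    x0 = np.array([-1.2, 1.0])

    bfgs = minimize(rosen, x0, jac=rosen_der, method="BFGS")
    print("BFGS iterations:", bfgs.nit, "solution:", bfgs.x)

    # naive fixed-step gradient descent for comparison (illustrative only)
    x, lr = x0.copy(), 1e-3
    for _ in range(5000):
        x -= lr * rosen_der(x)
    print("GD after 5000 steps:", x)  # typically still short of the optimum (1, 1)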
21
votes
2 answers

How to choose the right optimization algorithm?

I need to find the minimum of a function. Reading the docs at http://docs.scipy.org/doc/scipy/reference/optimize.html I see that there are several algorithms that do the same thing, i.e. find the minimum. How do I know which one I should…
siamii
  • 2,057
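One way to get a feel for the choice is simply to run several of the scipy.optimize.minimize methods on the same problem; a minimal sketch (which method is "right" depends on smoothness, gradient availability, and constraints):

    from scipy.optimize import minimize, rosen, rosen_der

    x0 = [1.3, 0.7, 0.8]
    # Rough rule of thumb: Nelder-Mead needs no gradient, BFGS uses the
    # gradient, and L-BFGS-B additionally supports simple bound constraints.
    for method in ("Nelder-Mead", "BFGS", "L-BFGS-B"):
        jac = rosen_der if method != "Nelder-Mead" else None
        res = minimize(rosen, x0, jac=jac, method=method)
        print(method, res.fun, res.nfev)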
16
votes
3 answers

Making big, smart(er) bets

I've been trying to code an algorithm to suggest bets in 1X2 (weighted) games. Basically, each game has a set of matches (home vs away teams): 1: home wins, X: draw, 2: away wins. For each match and symbol (1, X and 2), I will assign a percentage…
Alix Axel
  • 289
13
votes
1 answer

Optimization problem

A friend of mine sells $k$ models of blenders. Some of the blenders are very simple and cheap, others are very sophisticated and more expensive. His data consists, for each month, of the prices of each blender (which are fixed by him), and the…
Zen
  • 24,121
11
votes
1 answer

Which optimization algorithm to use for problems with many local optima and expensive goal function?

I've got an optimization problem with these properties: The goal function is not cheap to compute. It can be evaluated up to around 10^4 times in the optimization. There are a lot of local optima. High values are clustered around the maximum, so the…
Anna
  • 339
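Not an answer to the specific problem, but for multimodal objectives with a bounded evaluation budget a stochastic global optimizer is a common starting point; a minimal sketch with SciPy's differential evolution (the Rastrigin function below is a stand-in for the asker's expensive objective):

    import numpy as np
    from scipy.optimize import differential_evolution

    def objective(x):  # stand-in multimodal function (Rastrigin)
        return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

    bounds = [(-5.12, 5.12)] * 2
    # popsize and maxiter together bound the number of (expensive) evaluations
    result = differential_evolution(objective, bounds, maxiter=100,
                                    popsize=15, seed=0)
    print(result.x, result.fun)  # global minimum is at the origin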
10
votes
3 answers

Why are numerical solutions preferred to analytical solutions?

I'm just learning about optimization, and the difference between an analytical solution, and a numerical one. Suppose there is a cost function f(x), and we want to find the value of x which minimizes this. In an analytical solution, we would…
Karnivaurus
  • 7,019
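A worked miniature of the distinction: for f(x) = (x - 3)^2 + 1, the analytical route solves f'(x) = 2(x - 3) = 0 to get x* = 3 exactly in closed form, while the numerical route iterates toward the minimum without ever solving that equation by hand.

    from scipy.optimize import minimize_scalar

    res = minimize_scalar(lambda x: (x - 3) ** 2 + 1)
    print(res.x)  # ≈ 3.0, correct only up to the solver's tolerance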
8
votes
1 answer

Convergence of L-BFGS in non-convex settings

Is it true that, in general, L-BFGS may not converge in non-convex settings even if the learning rate is really small? For example, here L-BFGS diverges, but there are theoretical guarantees on its local convergence. How can this be explained?
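For context, a minimal sketch of the milder non-convex behaviour (this is not the divergence example the question links to): L-BFGS run from different starts on a non-convex function settles into different stationary points.

    import numpy as np
    from scipy.optimize import minimize

    def f(x):  # non-convex, with several local minima
        return np.sin(3 * x[0]) + 0.1 * x[0] ** 2

    for start in (-3.0, 0.0, 3.0):
        res = minimize(f, x0=[start], method="L-BFGS-B")
        print(start, "->", res.x[0])  # different starts, different minima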
7
votes
2 answers

Adadelta idea 1 vs RMSprop

In the Adadelta paper, the first proposed idea ("idea 1") seems to me exactly like RMSprop (here or here), although it is not called that and is not referenced. Am I correct?
Albert
  • 1,245
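For reference, RMSprop and Adadelta's idea 1 both keep an exponentially decaying average of squared gradients and divide the step by its root; a minimal numpy sketch of that shared update (hyperparameter names are illustrative):

    import numpy as np

    def rmsprop_step(x, grad, avg_sq_grad, lr=0.01, rho=0.9, eps=1e-8):
        # accumulate E[g^2] with an exponential moving average ...
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * grad**2
        # ... and scale the step by the root of that accumulator
        x = x - lr * grad / (np.sqrt(avg_sq_grad) + eps)
        return x, avg_sq_grad

The full Adadelta method (idea 2) goes one step further and replaces the fixed learning rate lr with a decaying average of past squared updates.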
6
votes
1 answer

In mathematical optimization, are sequential quadratic programming and sequential least squares programming the same thing?

The Sequential Quadratic Programming (SQP) that I am talking about is the one described here, and the Sequential Least SQuares Programming (SLSQP) is the one in the SciPy documentation.
Nicholas
  • 515
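Whatever the naming question, calling SciPy's SLSQP is straightforward; a minimal equality-constrained sketch:

    from scipy.optimize import minimize

    # minimize x^2 + y^2 subject to x + y = 1; the solution is (0.5, 0.5)
    res = minimize(lambda v: v[0] ** 2 + v[1] ** 2,
                   x0=[2.0, 0.0],
                   method="SLSQP",
                   constraints=[{"type": "eq",
                                 "fun": lambda v: v[0] + v[1] - 1}])
    print(res.x)  # ≈ [0.5, 0.5]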
5
votes
3 answers

Can constrained optimization techniques be applied to unconstrained problems?

I am looking at using an interior point method for optimizing a convex function. The convex function is basically the log-likelihood of a binary logistic regression model. Can I use this technique? In general, is there anything that prevents applying…
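In SciPy, at least, nothing stops you from handing an unconstrained problem to a constraint-capable solver; a minimal sketch minimizing a binary logistic regression negative log-likelihood with trust-constr and an empty constraint set (data simulated for illustration):

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X @ np.array([1.5, -2.0]) + rng.normal(size=200) > 0).astype(float)

    def nll(beta):  # negative log-likelihood of binary logistic regression
        z = X @ beta
        return np.sum(np.logaddexp(0.0, z) - y * z)

    res = minimize(nll, x0=np.zeros(2), method="trust-constr")
    print(res.x)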
4
votes
1 answer

Adaptive simulated annealing

I have been doing some reading about adaptive simulated annealing and as far as I know it is an algorithm that is really useful when it comes to finding the global maxima/minima of some functions, which is useful for calibration purposes. However,…
AZhu
  • 143
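Adaptive simulated annealing itself is not in SciPy, but a related generalized simulated annealing is available as dual_annealing; a minimal sketch on a multimodal test function:

    import numpy as np
    from scipy.optimize import dual_annealing

    def objective(x):  # Rastrigin: many local minima, global minimum at 0
        return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

    result = dual_annealing(objective, bounds=[(-5.12, 5.12)] * 2, seed=0)
    print(result.x, result.fun)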
4
votes
1 answer

Can the eigenvalues of the Hessian change sign during a Newton-Raphson optimization?

Can the eigenvalues of the Hessian change sign during a Newton-Raphson optimization? Because from simple examples with 1D functions it seems as if one always converges to the nearest optimum, meaning that one stays within the turning points of the…
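A one-dimensional miniature of the issue: the undamped Newton step x ← x − f'(x)/f''(x) follows the local curvature, so starting where f'' < 0 it happily converges to a maximum rather than a minimum.

    import numpy as np

    def fp(x):   # f(x) = sin(x), so f'(x) = cos(x)
        return np.cos(x)

    def fpp(x):  # f''(x) = -sin(x)
        return -np.sin(x)

    def newton(x, steps=20):
        for _ in range(steps):
            x = x - fp(x) / fpp(x)  # undamped Newton step on f'
        return x

    print(newton(1.2))   # f'' < 0 here: converges to the MAXIMUM at pi/2
    print(newton(-1.2))  # f'' > 0 here: converges to the minimum at -pi/2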
4
votes
1 answer

How to optimize a fractional function $\frac{\left|u^HXu\right|^2}{\left|u^HYu\right|^2}$ with $\left\|u\right\|_2 = 1$

How to optimize a fractional function $\frac{\left|u^H X u\right|^2}{\left|u^H Y u\right|^2}$ with $\left\|u\right\|_2 = 1$? Here, the matrix $X \in \mathbb{C}^{n\times n}$ is positive semidefinite, and $Y \in \mathbb{C}^{n\times n}$ is positive definite, i.e., $X \succeq 0, Y…
Dony
  • 131
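Since $X \succeq 0$ and $Y \succ 0$ make $u^H X u \ge 0$ and $u^H Y u > 0$, the objective is just the square of the generalized Rayleigh quotient $u^H X u / u^H Y u$, so its maximizer over $\left\|u\right\|_2 = 1$ is the top generalized eigenvector of the pair $(X, Y)$; a minimal numerical sketch with randomly generated matrices:

    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(0)
    A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    X = A @ A.conj().T                    # Hermitian positive semidefinite
    B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    Y = B @ B.conj().T + np.eye(4)        # Hermitian positive definite

    # eigh(X, Y) solves X v = lambda Y v; the eigenvector of the largest
    # lambda maximizes the ratio u^H X u / u^H Y u.
    w, V = eigh(X, Y)
    u = V[:, -1] / np.linalg.norm(V[:, -1])  # rescale so that ||u||_2 = 1
    print((u.conj() @ X @ u).real / (u.conj() @ Y @ u).real, w[-1])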
4
votes
3 answers

Proof that more is better in optimization

Today I overheard somebody say something along the lines of: "there is a proof that shows that in optimization, more options are better or at least neutral." I searched but could not find such a proof. Does it sound familiar to anyone? Sorry for the…
invictus
  • 523
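For what it's worth, the elementary version of that statement is just that enlarging the feasible set cannot hurt the optimum: if $A \subseteq B$ then $\max_{x \in A} f(x) \le \max_{x \in B} f(x)$, because any point feasible in $A$ is still feasible in $B$. It says nothing about the computational cost of searching the larger set.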