
Gradient descent can get stuck in a local optimum. Which techniques exist to reach the global optimum?

nbro
hina munir

1 Answer


In deep learning there are several methods to help a "stuck" gradient: decrease the learning rate, or use a cyclic learning rate, which cycles between a larger and a smaller value. A more radical method is to completely reinitialize the last layer, or the last two layers (before the loss), of the network. A sketch of both ideas follows.
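
Here is a minimal PyTorch sketch of both ideas; the toy model, data, and hyperparameters are illustrative assumptions, not a prescription:

```python
import torch
import torch.nn as nn

# Toy model and data, for illustration only.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # last layer (before the loss)
)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss_fn = nn.MSELoss()

# SGD with momentum (CyclicLR's default momentum cycling needs it).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Cyclic learning rate: oscillates between base_lr and max_lr.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=0.1, step_size_up=100
)

for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the learning-rate cycle each step

# The more radical option: reinitialize the last layer, then keep training.
model[-1].reset_parameters()
```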

In non-deep-learning ML, of those only decreasing the learning rate will work, but there is a plethora of numerical optimization methods that can help, such as second-order methods (variations of Gauss-Newton) or methods specific to the problem, which may include projection methods, alternating directions, conjugate gradients, etc. There are many methods that are better than gradient descent for non-deep-learning optimization.
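
For example, here is a minimal SciPy sketch on a toy non-convex function. It uses conjugate gradients and BFGS (a quasi-Newton relative of the second-order methods mentioned above, not Gauss-Newton itself) combined with random restarts, one simple way to escape local minima; the objective and starting points are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Toy non-convex objective with several local minima.
    return np.sin(3 * x[0]) + (x[0] - 0.5) ** 2

# Conjugate gradient from a single starting point; it may only
# find the local minimum nearest to x0.
res_cg = minimize(f, x0=np.array([2.0]), method="CG")

# BFGS with random restarts: run from several starting points
# and keep the best result found.
best = min(
    (minimize(f, x0=np.random.uniform(-3, 3, size=1), method="BFGS")
     for _ in range(10)),
    key=lambda r: r.fun,
)

print("CG:      ", res_cg.x, res_cg.fun)
print("restarts:", best.x, best.fun)
```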

mirror2image