
Gradient descent can get stuck in a local optimum. Which techniques exist to reach the global optimum?

nbro
hina munir

1 Answer


In deep learning there are several methods to help a "stuck" gradient: decrease the learning rate, or use a cyclic learning rate, which cycles between a larger and a smaller value. A more radical method is to completely reinitialize the last layer, or the last two layers (before the loss), of the network. A sketch of both ideas follows.
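
Here is a minimal PyTorch sketch of both ideas; the toy model, data, and hyperparameters are illustrative assumptions, not a prescription:

```python
import torch
import torch.nn as nn

# Toy model and data, for illustration only.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1),  # last layer (before the loss)
)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss_fn = nn.MSELoss()

# SGD with momentum (CyclicLR's default momentum cycling needs it).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Cyclic learning rate: oscillates between base_lr and max_lr.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=0.1, step_size_up=100
)

for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the learning-rate cycle each step

# The more radical option: reinitialize the last layer, then keep training.
model[-1].reset_parameters()
```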

In non-deep-learning ML, of those only decreasing the learning rate will work, but there is a plethora of numerical optimization methods that can help, such as second-order methods (variations of Gauss-Newton) or methods specific to the problem, which may include projection methods, alternating directions, conjugate gradients, etc. There are many methods that are better than gradient descent for non-deep-learning optimization.
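
For example, here is a minimal SciPy sketch on a toy non-convex function. It uses conjugate gradients and BFGS (a quasi-Newton relative of the second-order methods mentioned above, not Gauss-Newton itself) combined with random restarts, one simple way to escape local minima; the objective and starting points are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # Toy non-convex objective with several local minima.
    return np.sin(3 * x[0]) + (x[0] - 0.5) ** 2

# Conjugate gradient from a single starting point; it may only
# find the local minimum nearest to x0.
res_cg = minimize(f, x0=np.array([2.0]), method="CG")

# BFGS with random restarts: run from several starting points
# and keep the best result found.
best = min(
    (minimize(f, x0=np.random.uniform(-3, 3, size=1), method="BFGS")
     for _ in range(10)),
    key=lambda r: r.fun,
)

print("CG:      ", res_cg.x, res_cg.fun)
print("restarts:", best.x, best.fun)
```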

mirror2image