8

Can anybody help me? I heard that Gauss-Newton method compute an aproximation of the Hessian instead of the true Hessian, but, quasi-Newton method too, don't it? what is the differences between them?

any help would be appreciated :)

mario faixat
  • 81
  • 1
  • 2
  • 1
    Related: http://scicomp.stackexchange.com/a/10727. Also, http://www.sciencedirect.com/science/article/pii/S0926985101001069 (discussing the practical difference for a concrete problem). – Christian Clason Oct 15 '15 at 17:31

1 Answers1

17

Quasi-Newton methods construct an approximate Hessian for an arbitrary smooth objective function $f(x)$ using values of $\nabla f$ evaluated at the current and previous points. At each iteration of the method the quasi-Newton approximate Hessian is updated using the gradient evaluated at the latest iterate, $x^{(k)}$. These approximate Hessians aren't necessarily very good approximations to the actual Hessian of $f$, but they are sufficiently good that you can use them to obtain good search directions in the optimization algorithm. There are many quasi-Newton methods, of which the most popular is probably BFGS (Broyden-Fletcher-Goldfarb-Shanno.)

The Gauss-Newton method is an approximate Newtons method that works only with objective functions that can be expressed as a sum of squares

$ f(x)=\sum_{i=1}^{m} f_{i}(x)^{2} $

For objective functions of this particular form, if you let $J(x)$ (the Jacobian) be the matrix of first partial derivatives of $f_{i}(x)$ with respect to $x_{j}$, and let $F(x)$ be the vector of $f_{1}(x)$, $f_{2}(x)$, $\ldots$, $f_{m}(x)$, then

$ \nabla f(x)=J(x)^{T}F(x) $

and

$ \nabla^{2} f(x) \approx 2J(x)^{T}J(x). $

The Gauss-Newton method (and the Levenberg-Marquardt method) use this approximate Hessian and exact gradient in Newton's method.

The approximate Hessian in the Gauss-Newton method is not of the same type as the quasi-Newton approximate Hessians (BFGS, DFP, etc.)

Note that one can apply quasi-Newton methods like BFGS to nonlinear least squares problems but you can't apply the Gauss-Newton method to minimizing a general function $f(x)$.

Brian Borchers
  • 18,719
  • 1
  • 39
  • 70