
Based on my basic understanding of the BFGS method, the algorithm iterates until the norm of the gradient is less than or equal to a set tolerance, called `gtol` in SciPy's implementation.

However, when I use this method and check the output, I see the following:

Iterations: 2
Function evaluations: 76
Gradient evaluations: 13

together with this termination message:

"Desired error not necessarily achieved due to precision loss."

This confused me. Shouldn't the algorithm need to iterate many times to reach convergence? How could it converge after only two iterations? And if it converges this quickly, why do we get to specify a maximum number of iterations at all?
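For reference, my call looks roughly like the following minimal sketch (the quadratic objective here is only a placeholder, not my actual function, and `x0` is an arbitrary starting point):

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # placeholder objective; the real function is different
    return np.sum((x - 1.0) ** 2)

x0 = np.zeros(5)

# gtol: gradient-norm tolerance; maxiter: cap on outer BFGS iterations
res = minimize(f, x0, method="BFGS",
               options={"gtol": 1e-6, "maxiter": 200, "disp": True})

print(res.nit, res.nfev, res.njev)  # iterations, function evals, gradient evals
print(res.message)
```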

Habib
  • If the algorithm was started very close to a local minimum, then it could very easily have converged in two iterations. Have you checked whether the norm of the gradient satisfies the tolerance at the point returned by the routine? (See the gradient-check sketch after this comment thread.) – Brian Borchers Feb 04 '21 at 15:51
  • I am not sure how to check this in Python, but I am assuming that yes, the tolerance criterion was satisfied. However, I cannot comprehend how the algorithm evaluated the objective function 76 times in only two iterations. Shouldn't the two counts be roughly equal, since the main loop of BFGS iterates from k=0 until the tolerance is satisfied? – Habib Feb 04 '21 at 16:04
  • I would also like to add that I received this message when the optimization was terminated: "Desired error not necessarily achieved due to precision loss." – Habib Feb 04 '21 at 16:07
  • Actually, it's quite likely that you're not at a minimum: the error message indicates a clear failure. Rather, there could be an error in your implementation of the gradient, or you could be dealing with an objective function that doesn't satisfy the smoothness requirements of the method. You should check your gradient routine for errors. If you could provide more information about your objective function, we might be able to offer more help. – Brian Borchers Feb 04 '21 at 16:35
  • That's the original issue. Based on my objective function, I am not sure whether I can use BFGS or any other gradient-based algorithm, so I was trying to see if BFGS might work. My objective function is the following for a specified n (implemented in the NumPy sketch after this comment thread): $$\sum_{i=1}^n \sqrt{(X_{i+1}-X_i)^2+(Y_{i+1}-Y_i)^2+(Z_{i+1}-Z_i)^2}$$ – Habib Feb 04 '21 at 16:58
  • That's nonsmooth. I would suggest that you create a new question including this information and ask for suggestions on how to solve the optimization problem. – Brian Borchers Feb 04 '21 at 18:39
  • I have two questions regarding this matter: 1) How can I prove that this is a non-smooth function? I need a rigorous argument for this. 2) Can gradient-based algorithms be used to minimize non-smooth functions? – Habib Feb 04 '21 at 19:18
  • Non-smoothness should be trivial, set $Y_i=Y$ for all $i$ and set $Z_i=Z$ for all $i$. Also let $X_i = X$ for all $i\neq n+1$. Then the function you are interested in reduces to $\sqrt{(X_{n+1}-X)^2} = | X_{n+1}-X |$, which is continuous but not continuously differentiable, hence, nonsmooth. – Abdullah Ali Sivas Feb 05 '21 at 21:21
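Following up on the first comment above, a self-contained way to verify whether the returned point actually satisfies the tolerance (reusing the placeholder objective from the sketch in the question):

```python
import numpy as np
from scipy.optimize import minimize

def f(x):
    # same placeholder objective as in the question's sketch
    return np.sum((x - 1.0) ** 2)

res = minimize(f, np.zeros(5), method="BFGS", options={"gtol": 1e-6})

# by default, SciPy's BFGS tests the infinity norm of the gradient against gtol
grad_norm = np.linalg.norm(res.jac, ord=np.inf)
print("inf-norm of gradient at returned point:", grad_norm)
print("within gtol?", grad_norm <= 1e-6)
```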
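For concreteness, here is a minimal NumPy sketch of the objective from the comments. The array layout (rows of `P` are the points $(X_i, Y_i, Z_i)$) is my own assumption, and the example illustrates the kink identified in the last comment:

```python
import numpy as np

def path_length(P):
    # P is assumed to have shape (n+1, 3); row i holds (X_i, Y_i, Z_i)
    diffs = np.diff(P, axis=0)  # segment vectors P[i+1] - P[i]
    return np.sum(np.sqrt(np.sum(diffs**2, axis=1)))

# the kink: whenever two consecutive points coincide, the square root is
# evaluated at zero, where it is not differentiable
P = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0],   # coincident with the previous point
              [1.0, 0.0, 0.0]])
print(path_length(P))  # 1.0
```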

0 Answers