Speed of convergence for minimizing Rosenbrock's function

Question

I am minimizing $f(x_1,x_2) = 100(x_2-x_1^2)^2 + (1-x_1)^2$, where I try many algorithms to compare with each other. All of the algorithms find the optimal solution $(1,1)$ quickly, so I am not bothered with asymptotics right now.

Let $x = (x_1,x_2)^\top$ and $x^*$ define the optimal solution. We know that $\epsilon= \|x_k - x^*\|$ is the error of iteration $k$ compared to the optimal solution, and in convergence we are usually interested in the relation between $\epsilon_k, \epsilon_{k+1}$.

My question is very simple. Which one is the best option here, to plot $\dfrac{\epsilon_{k+1}}{\epsilon_{k}}$ for each algorithm, or to plot $\epsilon_{k+1} - \epsilon_{k}$? The first one is usually studied in asymptotic convergence, so I'm not sure whether it really makes sense for my simple Rosenbrock minimization.

Why not both? And I would also use several (hundreds) runs with each different starting point and then put the average or median values fur each algorithm. — JakobS, Jan 04 '20 at 18:48

Nikos Kazazakis · Answer 1 · 2020-01-07T08:30:01.540

Considering that all your plots will be for the same problem you don't need to normalise by using relative error, so you can use either representation as long as you use it consistently for everything.

The thing I would consider foremost is the readability of the charts, so I would suggest to plot both and see which representation is better scaled to actually be easy to observe in a plot.

Other than that, the difference between the two ways you are considering is basically whether to plot convergence, or the rate of convergence. Personally, I find convergence (i.e., the difference of values) to be more intuitive because I can tell the rate from the slope of the graph.

Speed of convergence for minimizing Rosenbrock's function

1 Answers1