Pedrinho covered the Online part of your question very well, so I'll answer the other two.
Strictly speaking, when we pose an optimisation problem, solving it means finding its global solution. If the problem is continuous, finding that solution also means satisfying the conditions for optimality. If the problem is discrete, it means proving that there is no other discrete solution better than the one we found, which we typically do with branch and bound.
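To make "conditions for optimality" concrete: in the smooth unconstrained case, for example, the first-order necessary condition is simply that the gradient of the objective vanishes at the candidate point,

$$\nabla f(x^*) = 0,$$

with constrained problems requiring the analogous KKT conditions; promoting such a point to a global solution then needs an extra argument, such as convexity of the problem.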
This type of classical solving is often not possible in practice, either because the problem is too large or because the equations are black-box.
In these cases, we would use stochastic optimisation methods, i.e., methods that seek to find some feasible solution to our problem without caring to prove much else beyond that. Examples include simulated annealing, evolutionary algorithms, particle swarm optimisation, and so on. It's a bit hard to make this definition exact, because many methods blend characteristics: depending on how we set up simulated annealing, for instance, we could also aim to satisfy the optimality conditions, as we do in local optimisation. In any case, stochastic methods will never seek to prove that a solution is globally optimal.
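To give a flavour of what such a method looks like in code, here is a minimal simulated annealing sketch for a box-constrained minimisation problem; the objective, bounds, cooling schedule, and step size are all placeholders I've chosen for illustration, not taken from any particular library.

```python
import math
import random

def simulated_annealing(f, lower, upper, n_iter=10_000, t0=1.0, seed=None):
    """Minimise f over the box [lower, upper] with a basic simulated annealing loop."""
    rng = random.Random(seed)                     # a fixed seed makes the run reproducible
    dim = len(lower)
    x = [rng.uniform(lower[i], upper[i]) for i in range(dim)]   # random starting point
    fx = f(x)
    best_x, best_fx = x[:], fx
    for k in range(1, n_iter + 1):
        t = t0 / k                                # simple cooling schedule
        # propose a random perturbation of the current iterate, clipped to the box
        y = [min(upper[i], max(lower[i], x[i] + rng.gauss(0.0, 0.1 * (upper[i] - lower[i]))))
             for i in range(dim)]
        fy = f(y)
        # always accept improvements; accept worse points with probability exp(-increase / t)
        if fy < fx or rng.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < best_fx:
                best_x, best_fx = x[:], fx
    return best_x, best_fx

# A nonconvex test function with many local minima (Rastrigin), minimised over [-5.12, 5.12]^2.
rastrigin = lambda x: sum(xi * xi - 10 * math.cos(2 * math.pi * xi) + 10 for xi in x)
print(simulated_annealing(rastrigin, lower=[-5.12, -5.12], upper=[5.12, 5.12], seed=42))
```

The accept/reject rule is what makes it "stochastic": worse points are occasionally accepted, which is how the method escapes local minima, but nothing in the loop ever certifies that the best point found is globally optimal.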
The main characteristic of stochastic methods is that there is an element of randomness in every run, e.g., randomly generated starting values for the variables or random perturbations of the iterates, so we can get a different result every time. Note that even this is not strictly necessary: a particle swarm method could, for instance, use a fixed random seed, which would make its results reproducible.
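Continuing with the sketch above (the hypothetical `simulated_annealing` helper and the `rastrigin` test function), fixing the seed is all it takes to make two runs coincide:

```python
# Two runs with the same seed follow exactly the same sequence of random numbers...
run_a = simulated_annealing(rastrigin, lower=[-5.12, -5.12], upper=[5.12, 5.12], seed=7)
run_b = simulated_annealing(rastrigin, lower=[-5.12, -5.12], upper=[5.12, 5.12], seed=7)
assert run_a == run_b
# ...whereas leaving the seed unset gives a (potentially) different answer on every run.
print(simulated_annealing(rastrigin, lower=[-5.12, -5.12], upper=[5.12, 5.12]))
```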
Furthermore, I have also seen people use this term to describe something totally different, namely creating and solving models that encode real-world stochasticity (e.g., financial modelling, forecasting, and so on). I haven't seen a formal definition that distinguishes between the two, but I personally use "stochastic optimisation methods" to refer to the actual algorithmic part, so that it can't be mistaken for the modelling bit.
Along similar lines, I have seen the term Robust Optimisation used in two different ways. The first is Wikipedia's definition, which says that we seek a certain level of robustness against uncertainty, e.g., configuring a power plant so that there is a 95% probability of meeting demand every day. This is very similar to stochastic modelling, the main difference being that here we have a very specific objective, which is not necessarily the case in other models that simply include stochasticity.
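Written out, that power-plant example is essentially a chance-constrained programme; the symbols below are purely illustrative:

$$\min_{x} \; c(x) \quad \text{s.t.} \quad \Pr\big[\, g(x, \xi) \ge d(\xi) \,\big] \ge 0.95,$$

where $x$ is the plant configuration, $\xi$ the uncertain data, $c$ a (hypothetical) operating cost, $g$ the power we can deliver, and $d$ the demand.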
The second definition is entirely deterministic: we seek to minimise fluctuations in a system that can be modelled deterministically. For example, I once worked on optimising aircraft wings so that we achieved maximum lift whilst ensuring they could never oscillate beyond specification due to aeroelastic effects. There was nothing stochastic there, as the aeroelasticity equations are deterministic and the turbulence models are very precise. In mechanical engineering we call that robust optimisation, since we are not simply optimising for performance: we are ensuring that the system will be robust (i.e., not break) during its operation.
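For contrast, that wing problem is closer to a worst-case formulation (again with purely illustrative symbols):

$$\max_{x} \; \mathrm{lift}(x) \quad \text{s.t.} \quad \max_{u \in U} \; \mathrm{amplitude}(x, u) \le a_{\max},$$

where $U$ is a deterministic set of operating conditions the wing must tolerate: robustness comes from the worst case over $U$, not from a probability.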