4

When generating forecasts (e.g., for product-customer time series data), should we choose an average-based or a median-based forecast? I recently read a very nice article by Nicholas Vandeput on LinkedIn in which he linked the type of forecast to the criterion used to select the best fit.

Optimization on RMSE yields an average-based number... whereas on MAE yields a median-based forecast

Forecast KPI: RMSE, MAE, MAPE & Bias

Advantages of using median forecast: robust to outliers

Disadvantages of using median forecast: bad for intermittent time series data, medians can be biased for non-normal data, median forecasts are not additive

Q: If that is the case, should we ever use median-based forecasts?

Q: Alternatively, can we correct the data for outliers through outlier correction or "de-promotionalization" and then generate an average-based forecast?

3 Answers

3

I'm pretty sure I answered this question before. The answer depends on how you define a better forecast. If you define it as the one that minimizes expected loss (forecast error), then the average is better for minimizing the squared error and the median is better for minimizing the absolute error, both in the expected sense.

Suppose your loss function is $f(y-\hat y)$. Then you find the forecast $\hat y$ that minimizes the expected loss: $$\min_{\hat y} E[f(y-\hat y)]$$

It can be shown that for $f(x)=x^2$ the best forecast is $\hat y=E[y]$, and for $f(x)=|x|$ it is $\hat y=\operatorname{median}(y)$.
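
As a rough numerical check of this result, here is a minimal sketch that replaces the expectation with an average over a small sample and searches a grid of candidate forecasts. The demand values, the grid, and the helper name `best_point_forecast` are all made up for illustration.

```python
import statistics


def best_point_forecast(sample, loss, candidates):
    """Return the candidate forecast with the smallest total loss over the sample."""
    return min(candidates, key=lambda y_hat: sum(loss(y - y_hat) for y in sample))


# Hypothetical right-skewed demand history with one large value.
demand = [1, 2, 2, 3, 4, 5, 25]

# Candidate forecasts 0.0, 0.1, ..., 30.0.
grid = [i / 10 for i in range(301)]

# f(x) = x^2: the minimizer should coincide with the sample mean.
squared_best = best_point_forecast(demand, lambda e: e * e, grid)

# f(x) = |x|: the minimizer should coincide with the sample median.
absolute_best = best_point_forecast(demand, abs, grid)

print("squared loss :", squared_best, "mean   :", statistics.mean(demand))
print("absolute loss:", absolute_best, "median :", statistics.median(demand))
```

The single large observation pulls the squared-loss forecast up to the mean, while the absolute-loss forecast stays at the median.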

Aksakal
  • 61,310
2

Honestly, your question has only one good answer, which is: it depends :D.

However, I will try to give you another idea.

You could use the median of means, https://arxiv.org/pdf/1711.10306.pdf (I don't know if it is the best article on the topic; again, it is just to give the idea).

The main idea behind the median of means is what you say: the median is robust, and the mean is good when there are no outliers. So splitting the N observations into K groups, taking the mean of each group, and then taking the median of these K means should not be a bad idea.

There are also theoretical guarantees that this is not a stupid idea. However, you need to do the work of adapting your favorite method to the MOM (median of means) methodology, since such an adaptation may not exist yet.
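
To make the idea concrete, here is a minimal sketch of a plain median-of-means estimator for a list of numbers. It is not taken from the linked paper; the function name, the random shuffling before grouping, and the example data are assumptions for illustration only.

```python
import random
import statistics


def median_of_means(values, n_groups, seed=0):
    """Split values into n_groups blocks, average each block, return the median of the block means."""
    if not 1 <= n_groups <= len(values):
        raise ValueError("n_groups must be between 1 and len(values)")
    shuffled = list(values)
    # Shuffle so that the grouping does not depend on the original ordering.
    random.Random(seed).shuffle(shuffled)
    # Block i takes every n_groups-th element starting at position i.
    blocks = [shuffled[i::n_groups] for i in range(n_groups)]
    return statistics.median(statistics.mean(block) for block in blocks)


# Hypothetical history: 19 ordinary observations and one outlier.
history = [10] * 19 + [1000]

mom_estimate = median_of_means(history, n_groups=5)
plain_mean = statistics.mean(history)

print("median of means:", mom_estimate)
print("plain mean     :", plain_mean)
```

With K = 1 this reduces to the plain mean, and with K = N it reduces to the median, so K controls the trade-off between the two.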

PauZen
  • 134
2

I would recommend that you first think about what decision your forecast is going to support. I work in retail forecasting, so most of my examples come from this domain:

  • There are business decisions that depend on expectation forecasts. For instance, if you plan a promotion on a limited budget, then you likely want to select the products to be promoted based on expected bottom line contribution.

  • Conversely, there are business decisions that depend on quantile forecasts. The main use case here is setting target stock levels to include safety stocks. (The same idea holds for service levels for service provision rather than sales.)

    For instance, if you can model your situation as a newsvendor situation, then your purchasing price $c$ and sales price $p$ immediately tell you which quantile forecast you need. If $p=2c$, then a median forecast is what you need. (And yes, if your demand is sufficiently intermittent, that median forecast will be a flat zero, and your optimal decision is to not stock anything, Kolassa, 2016.)

  • In more complicated situations, you may need full density forecasts, e.g., if your decision does not line up with your forecasting time buckets. In a retail context, this comes up if you forecast on daily granularity but can only order product twice a week. Per above, your target stock level corresponds to a quantile forecast of cumulative demands over multiple days, so you need to convolve your predictive densities.

  • (While we are at it, I have never seen a business decision that would be optimized by a MAPE-optimal forecast.)
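
The newsvendor and convolution points above can be sketched in a few lines. This assumes the simplest newsvendor setting (no salvage value, no stockout penalty), so that the target quantile is $(p-c)/p$, and it assumes daily demands are independent. The daily demand distribution, the three-day ordering horizon, and the function names are hypothetical.

```python
def critical_ratio(price, cost):
    """Newsvendor target quantile (p - c) / p, assuming no salvage value or stockout penalty."""
    return (price - cost) / price


def quantile_from_pmf(pmf, level):
    """Smallest demand d with P(D <= d) >= level, where pmf[d] = P(D = d)."""
    cumulative = 0.0
    for d, prob in enumerate(pmf):
        cumulative += prob
        if cumulative >= level - 1e-12:  # small tolerance for rounding error
            return d
    return len(pmf) - 1


def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent demands."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out


# Hypothetical intermittent daily demand: P(0) = 0.6, P(1) = 0.3, P(2) = 0.1.
daily = [0.6, 0.3, 0.1]

# p = 2c gives a target quantile of 0.5, i.e. the median.
level = critical_ratio(price=2.0, cost=1.0)

# Predictive distribution of cumulative demand over three days.
three_day = convolve(convolve(daily, daily), daily)

daily_target = quantile_from_pmf(daily, level)
three_day_target = quantile_from_pmf(three_day, level)

print("target quantile      :", level)
print("daily median         :", daily_target)
print("three-day sum median :", three_day_target)
```

In this example the daily median is zero, yet the median of the three-day cumulative demand is one unit, so summing daily quantile forecasts would understate the target stock level.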

In summary, I would argue that it makes sense to first understand which functional of the future distribution we want to elicit ("forecast"). With Aksakal, that amounts to first deciding what makes a forecast "better" than another one. Only then can we choose an error measure that rewards us for a "better" forecast (Kolassa, 2020).

Stephan Kolassa
  • 123,354