One of the drawbacks of MAPE is that actuals can be equal to 0. This results in a division by 0 and thus an undefined MAPE. Does it make sense to add 1 to both the numerator and the denominator to avoid falling into this trap? I am aware that adding 1 above and below does not cancel out mathematically, but I remember seeing this manipulation somewhere.

Does it make sense? If not, what is a way to take an undefined MAPE into account without discarding the data point? Should we instead increment the actuals and the forecasts by 1? Something else?

lalaland
  • https://stats.stackexchange.com/questions/86708/how-to-calculate-relative-error-when-the-true-value-is-zero/201864#201864 gives a thorough account of how to define measures of relative error. You could also perform an analysis like the one I give for logarithms at https://stats.stackexchange.com/a/30749/919. – whuber Sep 09 '22 at 15:25

1 Answer

Well, you can do this. I would very much recommend you take a look at the thread "What are the shortcomings of the Mean Absolute Percentage Error (MAPE)?". The MAPE elicits forecasts that can be quite far away from the conditional mean (which people usually subconsciously want as a point forecast). A "modified MAPE" along the lines you propose would not be undefined for zero actuals, and it would elicit a different functional of the future distribution than the (-1)-median the MAPE elicits, but still quite probably not the mean.
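To make the comparison concrete, here is one way to write the two measures. The notation is introduced here for illustration and is not from the question: $n$ nonnegative actuals $y_t$ with forecasts $\hat{y}_t$, and the name $\text{modMAPE}$. The second formula assumes the "add 1 to the numerator and the denominator" reading, which is also what the R snippet in this answer computes.

$$
\text{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{|y_t-\hat{y}_t|}{|y_t|},
\qquad
\text{modMAPE} = \frac{1}{n}\sum_{t=1}^{n}\frac{|y_t-\hat{y}_t|+1}{y_t+1}
$$

The first expression is undefined as soon as a single $y_t=0$; the second is defined for every $y_t\geq 0$.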

For instance, a quick simulation indicates that the optimal forecast (Kolassa, 2020) under the "modified MAPE" for $y\sim\text{Pois}(1)$ is $\hat{y}=0$, which is definitely not equal to the expectation of $y$, which is $1$:

[Figure: modified MAPE as a function of the forecast, for Pois(1) data]

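    # Simulate Poisson actuals and evaluate the "modified MAPE" over a grid of forecasts
    lambda <- 1
    yy <- rpois(1e5, lambda)
    Forecast <- seq(0, max(yy), by = 0.01)
    # (|error| + 1) / (actual + 1), averaged over all simulated actuals
    modMAPE <- sapply(Forecast, function(xx) mean((abs(yy - xx) + 1) / (yy + 1)))
    plot(Forecast, modMAPE, type = "l", las = 1)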

If we play around a bit with the lambda parameter in this script, it looks like the "modified MAPE" is always minimized in expectation by some integer $\hat{y}^\text{opt}<\lfloor\lambda\rfloor$. If this is the point forecast you want to elicit, go ahead - but to be honest, I have never come across a business problem that would be optimally addressed by such a strange functional of the future distribution. Conditional means or quantiles make much more sense.
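Playing around with lambda can be automated along the following lines. This is a minimal sketch rather than part of the original script: the helper name `opt_forecast`, the seed, the smaller sample size and the particular lambda values are all assumptions chosen for illustration.

```r
# Sketch: locate the forecast that minimises the "modified MAPE"
# for several Poisson means. All names and settings are illustrative.
set.seed(1)  # assumed seed, only so that reruns give the same draws

opt_forecast <- function(lambda, n = 2e4) {
  yy <- rpois(n, lambda)                    # simulated actuals
  forecast <- seq(0, max(yy), by = 0.01)    # candidate point forecasts
  mod_mape <- sapply(forecast, function(xx) {
    mean((abs(yy - xx) + 1) / (yy + 1))
  })
  forecast[which.min(mod_mape)]             # minimiser on the grid
}

lambdas <- c(1, 2, 5, 10)
result <- data.frame(lambda  = lambdas,
                     optimum = sapply(lambdas, opt_forecast))
print(result)
```

Comparing the `optimum` column with `lambda` (the mean) and with `floor(lambda)` shows how far the elicited forecast sits from the expectation for each setting.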

An alternative for dealing with undefined MAPEs is to calculate the MAE, divided by the mean of the series, which can be interpreted as a weighted MAPE (Kolassa & Schütz, 2007), but is of course a scaled MAE, so it is minimized by the conditional median of the future distribution - which may or may not be what you want.
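As a rough illustration of this alternative, the scaled MAE can be computed as below. The function name `wmape` and the toy vectors are made up for this sketch; the only requirement is that the mean of the actuals is nonzero, which is much weaker than requiring every actual to be nonzero.

```r
# Hypothetical helper: MAE divided by the mean of the actuals ("weighted MAPE").
wmape <- function(actual, forecast) {
  stopifnot(length(actual) == length(forecast), mean(actual) != 0)
  mean(abs(actual - forecast)) / mean(actual)
}

actual   <- c(0, 2, 1, 0, 3)   # made-up series containing zero actuals
forecast <- c(1, 1, 1, 1, 1)   # made-up flat forecast
wmape(actual, forecast)        # defined even though the plain MAPE is not
```

The "weighted MAPE" reading comes from rewriting the ratio as $\sum_t |y_t-\hat{y}_t| / \sum_t y_t = \sum_t w_t\,|y_t-\hat{y}_t|/y_t$ with weights $w_t = y_t/\sum_s y_s$, so periods with zero actuals simply receive zero weight in the percentage-error view while their absolute errors still count in the numerator.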

Stephan Kolassa
  • Thanks for your interesting answer. I have come across many of your answers and found them really interesting. However, I am struggling to understand every point you are making. When you say conditional mean, do you mean the $E(X|Y)$ sort of thing that we try to find in standard regression? Also, do you have any resources that could help me better understand your points? For instance, a quick technical intro to the fundamentals. Thanks again – lalaland Sep 09 '22 at 17:42
  • Yes, by the "conditional mean" I mean the expectation of your time series (or of whatever you are forecasting), conditional on the history (and any regressors or other information), essentially $E(Y_t|Y_1, \dots, Y_{t-1})$. In terms of resources, you might find my 2020 paper helpful; it's not long. Feel free to ping me for it on ResearchGate. The threads on optimal forecasts for lognormal and gamma data illustrate the point. – Stephan Kolassa Sep 09 '22 at 17:48
  • Thanks a lot Mr Kolassa – lalaland Sep 09 '22 at 17:56