When I am comparing RMSFE between a log model and a level model of the same dataset, how should I proceed?

Question

I have created an AR(2) and an AR(5) model to forecast my data in Stata. The AR(2) model is fitted to the levels, while the AR(5) model is fitted to the logs. I computed the RMSFE to compare the forecasting accuracy of the two models, but because the AR(5) model is on the log scale, its RMSFE is much lower than the one for the AR(2) model. To get summary statistics on the same scale for both models, I tried generating forecast errors for the level model by subtracting the log of the forecasted value from the log of the actual value and computing the RMSFE from those errors. However, I don't believe this is the correct method.

I would appreciate any help or insight.

score 1 · Answer 1 · answered Dec 12 '23 at 07:14

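It is much easier to keep the AR(2) forecast, which is already on the original scale, and back-transform the AR(5) forecast from the log scale to the original scale. However, simply taking the exponential yields a biased forecast: with symmetric errors on the log scale, the exponentiated point forecast is the median of the forecast distribution on the original scale, not its mean. Here is the bias correction, where you have to use the formula for $\lambda=0$.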
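As a rough illustration of that workflow, here is a minimal Python sketch (not Stata, and not necessarily the exact formula behind the link). It assumes the commonly used $\lambda=0$ adjustment $\exp(w)\,(1+\sigma_h^2/2)$, where $w$ is the log-scale point forecast and $\sigma_h^2$ is the forecast variance on the log scale. The function names and all numbers are hypothetical and only there to show the mechanics.

```python
import math


def back_transform_log_forecast(log_forecast, forecast_var):
    """Bias-adjusted back-transform of a log-scale point forecast (lambda = 0).

    forecast_var is the forecast variance on the log scale (an assumption:
    you need to obtain it from your fitted model).
    """
    return math.exp(log_forecast) * (1.0 + forecast_var / 2.0)


def rmsfe(actuals, forecasts):
    """Root mean squared forecast error; both inputs must be on the same scale."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical hold-out data, for illustration only.
actual_levels = [102.0, 105.0, 103.0, 108.0]
ar2_level_forecasts = [100.5, 104.0, 104.5, 106.0]        # already in levels
ar5_log_forecasts = [4.62, 4.65, 4.64, 4.67]              # on the log scale
ar5_log_forecast_vars = [0.0004, 0.0004, 0.0004, 0.0004]  # assumed variances

# Bring the log-model forecasts back to the original scale first ...
ar5_level_forecasts = [
    back_transform_log_forecast(w, v)
    for w, v in zip(ar5_log_forecasts, ar5_log_forecast_vars)
]

# ... then compare both models against the same actuals, in levels.
print("RMSFE AR(2), levels:", rmsfe(actual_levels, ar2_level_forecasts))
print("RMSFE AR(5), back-transformed:", rmsfe(actual_levels, ar5_level_forecasts))
```

The point of the design is that both RMSFEs are computed from errors in the original units, so they are directly comparable. If the log-scale errors are assumed to be exactly normal, $\exp(w+\sigma_h^2/2)$ is the exact mean; the factor $1+\sigma_h^2/2$ used above is its first-order approximation.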

That said, AR(5) is quite a complex model, and I have my doubts whether such a high AR order is warranted; see "Why does default auto.arima stop at (5,2,5)?". I would very much recommend that you don't try to roll your own ARIMA model, but rather rely on proven automatic ARIMA model selection tools, which will also automatically do a Box-Cox transformation (of which your log transform is a special case) and do the bias correction for the forecasts; see "Selecting ARIMA orders by ACF/PACF vs. by information criteria". Note that "Resources/books for project on forecasting models" contains pointers to literature; the FPP2 and FPP3 online books explicitly work with R forecasting packages and are very good.

