
I've been reading a few posts from distinguished members of this community about R^2 and time series forecasting:

1. What is the problem with using R-squared in time series models?

2. R-squared to compare forecasting techniques

However, I am wondering if the problems with using R^2 for time series forecasting could be mitigated by splitting a series into training and testing sets and then computing R^2 between the forecasted values and the true values. Would the problem with this method be that, as you forecast further and further out, the correlation (of which R^2 is one measure) between the forecasted and true values decreases, so it's not a great measure of the percentage of variation explained? Or is there a better reason why this is not an advisable metric for time series performance?

1 Answer


Rather than criticize R^2, which has been done well elsewhere, the real question is how to measure the adequacy of a model given N observations in total, a forecast horizon of W periods, and some number K of withheld forecast origins (the prediction interval). For example, if we have N=120 historical values and are concerned with an accuracy measure for predicting the next 12 values (W=12), we could launch a forecast from period 60 and use the actual observations for periods 61-72 to get a measure. We could then do the same from periods 72, 84, 96 and 108, yielding K=5 estimates of the 12-period-out forecast error from 5 origins. For each of the 5, we would develop the best model (model type and its associated parameters) based upon the history available at that origin, yielding a measure of its OUT-OF-SAMPLE performance such as MAPE or WMAPE.
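The rolling-origin procedure above can be sketched as follows. This is a minimal illustration on synthetic data; the naive "repeat the last value" forecaster is a hypothetical stand-in for whatever "best model" you would actually refit at each origin.

```python
# Rolling-origin out-of-sample evaluation: N=120 observations,
# horizon W=12, launched from K=5 origins (60, 72, 84, 96, 108).
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))

rng = np.random.default_rng(0)
y = 100 + np.cumsum(rng.normal(0, 1, 120))   # synthetic history, N=120

horizon = 12
origins = [60, 72, 84, 96, 108]              # K=5 forecast origins
scores = []
for t in origins:
    history = y[:t]                          # data available at this origin
    # In practice, re-identify and refit the best model on `history` here;
    # a naive last-value forecast keeps the sketch self-contained.
    forecast = np.repeat(history[-1], horizon)
    actual = y[t:t + horizon]                # withheld periods t+1 .. t+12
    scores.append(mape(actual, forecast))

print(np.round(scores, 2))                   # one 12-step-out MAPE per origin
print(round(float(np.mean(scores)), 2))      # average out-of-sample MAPE
```

Averaging the K per-origin scores gives a single out-of-sample figure for the chosen horizon, which is what you would compare across candidate model types.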

It is clear that the measure of model performance will depend on the number of future values to be forecasted. If one selected a forecast length of 6 rather than 12, a different model type might be selected as the "best". It is important to set the forecast horizon correctly.

IrishStat
  • Thank you for your answer. I am aware that MAPE and other forecasting metrics are appropriate for measuring forecast accuracy. However, I am trying to find a reason why out-of-sample R^2 is inappropriate, since the correlation between the true and forecasted values seems a reasonable (though perhaps not the best) metric for how accurate the forecasts are. – Anonymous Emu Oct 19 '19 at 03:21
  • The rsq approach essentially uses the residuals between the actual and the forecast WITHOUT taking into account the relative impact of the errors vis-à-vis the actual. The MAPE/SMAPE approach penalizes large errors relative to the actual and is therefore comparable across data sets, whereas rsq is not. – IrishStat Oct 19 '19 at 14:28
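The distinction in that last comment can be made concrete with a tiny (hypothetical) example: a forecast that always doubles the true value correlates perfectly with the actuals, so R^2 sees no problem at all, while a relative-error measure like MAPE flags it immediately.

```python
# A forecast with perfect correlation to the actuals (rsq ~ 1)
# but a 100% relative error at every point.
import numpy as np

actual = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
forecast = 2.0 * actual        # always double the true value

r = np.corrcoef(actual, forecast)[0, 1]
rsq = r ** 2                   # ~1.0: correlation ignores scale and bias
mape = 100.0 * np.mean(np.abs((actual - forecast) / actual))  # 100.0

print(rsq)
print(mape)
```

This is why out-of-sample correlation between forecasts and actuals can look excellent even when every individual forecast is badly wrong relative to the actual.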