If the Root Mean Squared Error is larger than the standard deviation of my dependent variable, does that mean that my model is inaccurate? What would be a good threshold for the ratio of RMSE to SD for a model to be considered fairly accurate?

1 Answer

  1. I struggle to think of an example of a predictive situation where an "accurate" model yields an RMSE that is larger than the SD. So it does seem like having RMSE > SD does indicate that your model is problematic: at the very least, simply forecasting the overall in-sample mean should presumably yield a better RMSE. "Presumably", because:

    If your data-generating process changes between training and evaluation on new data, and "SD" refers to the in-sample SD, then you might actually have a meaningful model, although the RMSE, as assessed on the evaluation data, may be larger than the (in-sample) SD. For example, you may be predicting a time series with increasing variance; the first sketch after this list illustrates exactly this situation.

  2. There is no simple rule to get from an error measure (whether a MAPE, or an RMSE scaled by the SD, as in your proposal) to a statement about whether a forecast is "accurate" or not; see here: Is there any standard / criteria of good forecast measured by SMAPE and MASE? Indeed, if you are predicting IID data, you by definition cannot do better than the in-sample mean: your RMSE will be equal to the SD (up to whether you divide by $n$, $n-1$ or $n-k$), and this is the limit of your predictability; the second sketch below demonstrates this equality. And per the above, if you are predicting noise with increasing variance, your best achievable RMSE will likely be larger than the (in-sample) SD.
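
To make the caveat under point 1 concrete, here is a minimal simulation sketch; the trend slope, noise-growth rate and seed are arbitrary assumptions of mine. The "model" forecasts the true conditional mean, which is the best possible point forecast, yet its out-of-sample RMSE exceeds the in-sample SD because the noise variance keeps growing:

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed

n_train, n_eval = 500, 500
t = np.arange(n_train + n_eval)
trend = 0.05 * t                    # true conditional mean (assumed known)
sigma = 1.0 + 0.02 * t              # noise SD grows over time
y = trend + rng.normal(0.0, sigma)  # observations

y_train, y_eval = y[:n_train], y[n_train:]

# The best possible point forecast: the true conditional mean on the eval period.
pred_eval = trend[n_train:]

sd_train = y_train.std(ddof=1)                           # in-sample SD
rmse_eval = np.sqrt(np.mean((y_eval - pred_eval) ** 2))  # out-of-sample RMSE

print(f"in-sample SD:       {sd_train:.2f}")
print(f"out-of-sample RMSE: {rmse_eval:.2f}")
```

The out-of-sample RMSE comes out well above the in-sample SD, even though the model is exactly right about the conditional mean.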
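
And a sketch of the IID case under point 2, again with arbitrary distribution parameters: forecasting the in-sample mean gives an RMSE that is identical to the SD when both divide by $n$, and differs only through the divisor otherwise.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed
y = rng.normal(10.0, 2.0, size=1000)  # IID observations

pred = y.mean()  # best constant forecast for IID data: the in-sample mean
rmse = np.sqrt(np.mean((y - pred) ** 2))

print(f"RMSE of mean forecast: {rmse:.4f}")
print(f"SD (divide by n):      {y.std(ddof=0):.4f}")  # identical to the RMSE
print(f"SD (divide by n-1):    {y.std(ddof=1):.4f}")  # differs only in the divisor
```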

Stephan Kolassa