1

The root-mean-squared error (RMSE) should be close to zero. Although the optimal RMSE is known to be 0, we can say that RMSE = 10 is low for a data set with a high average (say, 300–1000 on average), whereas in a data set with a low average (such as 5 or 7), 2 might be the value that counts as low. How can an acceptable level of RMSE be determined in this case? For example, is comparing the standard deviation (SD) with the RMSE the right approach? Could an approach such as the following be proposed?

• SD/RMSE < 0.5 → acceptable?
• SD/RMSE > 0.5 → not acceptable?

I am curious about your opinion on this topic.

  • "...in some datasets" in which datasets? Can you be more specific? – utobi Dec 04 '22 at 16:28
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Dec 04 '22 at 16:28
  • Why "should" the RMSE be close to zero? Do you mean that lower RMSE are preferable, or do you believe there is some specific low RMSE that can be reached? Also, standard deviation of what, divided by RMSE of what? Perhaps this thread is helpful. – Stephan Kolassa Dec 04 '22 at 16:33
  • The answer to this question is going to vary case by case. Here are some general comments: (1) in a prediction setting RMSE is often useful for comparing two or more methods. In this way, the raw RMSE value doesn't really matter. (2) We typically do NOT recommend any sort of dichotomization as a "general rule of thumb". In particular cases, a binary rule may be acceptable, but it is probably inadvisable. See this article for an argument against dichotomization. – knrumsey Dec 04 '22 at 16:57

1 Answer

1

This is a place where an $R^2$-style metric could be useful.

$$ R^2=1-\dfrac{ \sum(y_i-\hat y_i)^2 }{\sum (y_i-\bar y)^2 }=1-\dfrac{ (RMSE)^2 }{ \text{var}(y) } $$

There is a sense in which this measures the size of your model's errors relative to the errors of a model that always predicts $\bar y$, regardless of the feature values. That strikes me as a good baseline: in the absence of knowing much, a reasonable guess for the conditional expected value is the pooled/marginal expected value.

You measure the size of your model errors by their variance (take the square root for RMSE), and the size of your baseline errors by taking the variance of $y$.
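As a sketch, here is how this works numerically on simulated data (the data and the error scale of 10 are made up for illustration): the RMSE alone looks "large" or "small" only relative to the spread of $y$, while the $R^2$-style ratio standardizes it.

```python
import numpy as np

# Hypothetical data: a high-average outcome with model errors of RMSE ~ 10.
rng = np.random.default_rng(0)
y = rng.normal(loc=300, scale=50, size=200)       # true values
y_hat = y + rng.normal(scale=10, size=200)        # predictions with ~10 RMSE

# RMSE of the model
rmse = np.sqrt(np.mean((y - y_hat) ** 2))

# R^2-style metric: 1 - RMSE^2 / var(y).
# np.var uses the population variance (ddof=0), matching the formula above.
r2 = 1 - rmse**2 / np.var(y)

print(f"RMSE = {rmse:.2f}, R^2 = {r2:.3f}")
```

Here an RMSE around 10 yields an $R^2$ near 0.96, because the errors are small relative to a standard deviation of about 50; the same RMSE against data with a standard deviation of 5 would give a strongly negative $R^2$.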

A natural follow-up is to ask how large this statistic must be for the model to qualify as good. As usual, there are no hard rules, except that $1$ indicates perfection. Still, this could be a useful standardization.

At the same time, your work has natural units, and you should have some sense of how your errors look in the context of that work (e.g., being off by a few dollars might be a big deal for predicting grocery bills but less of a big deal for predicting house prices).

Dave