2

As I understand it, in the specific context of linear regression, the R output "residual standard error" is an estimate of $\sigma$, the standard deviation of the distribution of the errors. Its square is estimated by the MSE, which in general is the mean of the squared errors; in regression, however, the denominator is the residual degrees of freedom $n-p$, where $p$ is the number of parameters including the intercept.

$$MSE=\frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{n-p}$$
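The formula above can be checked numerically. A minimal sketch in Python rather than R (the small dataset is made up purely for illustration):

```python
import numpy as np

# Hypothetical toy data (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit y = b0 + b1*x by ordinary least squares
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

n, p = len(y), X.shape[1]                   # p counts the intercept
sse = np.sum((y - y_hat) ** 2)

mse = sse / (n - p)         # MSE with the residual-degrees-of-freedom denominator
sigma_hat = np.sqrt(mse)    # what R's summary.lm labels "residual standard error"

print(mse, sigma_hat)
```

Running `summary(lm(y ~ x))` on the same data in R should report the value of `sigma_hat` as the residual standard error.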

Why isn't residual standard error just called RMSE, when $\hat{\sigma}^2$ is the MSE? Or is $\hat{\sigma}^2$ often called the residual variance (making the terms for $\sigma^2$ and $\sigma$ line up)?

Dave
  • 62,186
fmtcs
  • 525
  • I think this is widely regarded as a poor decision by the R developers. I’ve definitely seen discussion about it on here but can’t think of where. – Dave Mar 09 '23 at 19:38
  • @Dave how would you call $\hat{\sigma}$ and $\hat{\sigma}^2$? – fmtcs Mar 09 '23 at 19:47
  • Unbiased error variance estimate and the square root of the unbiased error variance estimate. (I might use “unbiased standard deviation” as slang, even though Jensen’s inequality shows such a term to be wrong (square root of an unbiased estimator is biased).) – Dave Mar 09 '23 at 19:50
  • @Dave do you think calling $\hat{\sigma}^2$ the MSE is a mistake as well? – fmtcs Mar 09 '23 at 19:53
  • 1
    Referring to “the” MSE is probably a mistake, since there are reasonable arguments for multiple calculations (an $n$ denominator and an $n-p$ denominator both make sense). I would want to define explicitly what I mean if there is any ambiguity about it. However, the R decision to call $\hat\sigma$ the standard error makes no sense to me, because standard errors are associated with parameters being estimated. To which parameter does $\hat\sigma$ correspond? (I don’t have an answer, which is why I have yet to see why R calls $\hat\sigma$ a residual standard error.) – Dave Mar 09 '23 at 19:57

1 Answer

1

Referring to “the” MSE is probably a mistake, since there are reasonable arguments for multiple calculations (an $n$ denominator and an $n-p$ denominator both make sense). I would want to define explicitly what I mean if there is any ambiguity about it.
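To make the two candidate calculations concrete, here is a minimal sketch (the residual values and $p$ are made up for illustration):

```python
import numpy as np

# Hypothetical residuals from a fit with p = 2 parameters (intercept + slope);
# note they sum to zero, as OLS residuals from a model with an intercept must.
resid = np.array([0.5, -0.3, 0.1, -0.4, 0.1])
n, p = len(resid), 2
sse = np.sum(resid ** 2)

mse_ml = sse / n              # maximum-likelihood-style denominator
mse_unbiased = sse / (n - p)  # residual-degrees-of-freedom denominator

print(mse_ml, mse_unbiased)
```

Both are legitimate "mean squared error" calculations, which is why defining the denominator explicitly matters.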

However, the R decision to call $\hat\sigma$ the standard error makes no sense to me, because standard errors are associated with parameters being estimated. To which parameter does $\hat\sigma$ correspond? (I don’t have an answer, which is why I have yet to see why R calls $\hat\sigma$ a residual standard error.)

I do not really have a clean name for $\hat\sigma=\sqrt{ \frac{ \sum\left( y_i-\hat y_i \right)^2 }{ n-p} }$. While the expression inside the square root is unbiased for error variance (assuming fairly typical assumptions like the Gauss-Markov conditions), Jensen’s inequality means that $\hat\sigma$ is biased for the error standard deviation, so “unbiased error standard deviation” is not correct.
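The Jensen's-inequality bias is easy to see by simulation. A sketch under assumed settings (normal errors, true $\sigma = 1$, tiny samples of $n = 5$; none of these numbers come from the thread):

```python
import numpy as np

# Monte Carlo sketch: the square root of an unbiased variance estimate
# systematically underestimates sigma (Jensen's inequality).
rng = np.random.default_rng(0)
sigma = 1.0           # assumed true error standard deviation
n, reps = 5, 100_000  # small n makes the bias visible

samples = rng.normal(0.0, sigma, size=(reps, n))
s2 = samples.var(axis=1, ddof=1)  # unbiased variance estimates
s = np.sqrt(s2)                   # "sigma hat" for each replicate

print(s2.mean())  # close to sigma**2 = 1 (unbiased)
print(s.mean())   # noticeably below sigma = 1 (biased low)
```

For $n = 5$ the expected value of $s$ under normality is about $0.94\sigma$, so the downward bias is far from negligible at small sample sizes.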

EDIT

The R developers appear to know that residual standard "error" is a misnomer and lament that it has crept into the documentation of so many functions that correcting it would be difficult.

Dave
  • 62,186
  • 1
    I recall seeing others on here lamenting this naming decision but cannot think offhand of where to find such posts. I welcome others to post links in the comments. – Dave Mar 18 '23 at 13:16