OLS: do we test the residuals for normality because then the error terms can be assumed normal, too? Is there proof for this?

Question

There are lots of resources out there that mix up residuals with errors, using the terms interchangeably, or saying "residual errors", or not acknowledging the existence of errors at all. (One example here.) In this post on Cross Validated, one comment under the accepted answer says:

After all, normality tests are performed on residuals to gauge whether the assumption of normally distributed errors is reasonable; normality of errors will lead to normality of residuals.

My questions are:

Is this so, and why do we assume this?
Since, as I understand it, the point of the errors is that they are random and unknown (the "noise"), how can we assume anything about them?

@user2974951, what is the link between the (known) residuals and the unknown errors? — Reader 123, Dec 14 '23 at 11:01
Some people argue that normality testing is essentially useless; at least keep in mind nothing is truly normally distributed, and methods based on the normality assumption will work well for many other distributions (though not all). https://stats.stackexchange.com/questions/2492/is-normality-testing-essentially-useless https://stats.stackexchange.com/questions/579728/why-is-my-data-not-normally-distributed-while-i-have-an-almost-perfect-qq-plot-a/579745#579745 https://stats.stackexchange.com/questions/538561/relevance-of-assumption-of-normality-ways-to-check-and-reading-recommendations — Christian Hennig, Dec 14 '23 at 11:24
@user2974951 Not quite. If the errors are actually normal, we don't need to perform asymptotic inference - the distribution of $\hat{\beta}$ is exactly normal at any sample size, and the F tests are exact. On the other hand, your errors could follow nearly any distribution (Bernoulli, exponential, ...), but with a large enough $n$ the inference is approximately correct because of the CLT. — AdamO, Dec 14 '23 at 13:20
The OLS can do many things - inference on a parameter, prediction, forecasting, .... Each of these has different considerations for how the error term is distributed, and its impact on estimation and inference. What's your application? — AdamO, Dec 14 '23 at 13:24

Christian Hennig · Answer 1 · 2023-12-14T13:11:21.837

The comment you cite is very imprecise. If the errors are normal and the model is correctly specified and estimated by least squares, the residuals will be normally distributed conditionally on the regressors (see the comment by @BigBendRegion).
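
To make the link between residuals and errors concrete: writing $H = X(X^\top X)^{-1}X^\top$ for the hat matrix, the residual vector is $e = (I - H)\varepsilon$, so normal errors with variance $\sigma^2$ give $e \mid X \sim N(0, \sigma^2 (I - H))$. The symbols $H$, $X$, $e$ and $\varepsilon$ are notation introduced here, not taken from the thread. Below is a minimal sketch that checks this identity numerically for simple linear regression, assuming a small fixed design and made-up parameter values; every name and number in it is illustrative.

```python
import random

random.seed(0)

# Illustrative setup (all values are assumptions, not from the thread):
# simple regression y = b0 + b1 * x + eps with a fixed design.
n = 8
x = [float(i) for i in range(n)]
b0, b1, sigma = 1.0, 2.0, 1.5

# In a simulation we can see the errors; with real data we never can.
eps = [random.gauss(0.0, sigma) for _ in range(n)]
y = [b0 + b1 * xi + ei for xi, ei in zip(x, eps)]

# Closed-form OLS fit.
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0_hat = ybar - b1_hat * xbar
resid = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(x, y)]


def hat(i, j):
    """Entry (i, j) of the hat matrix H for simple linear regression."""
    return 1.0 / n + (x[i] - xbar) * (x[j] - xbar) / sxx


# Residuals rebuilt from the errors alone: e = (I - H) eps.
resid_from_eps = [
    eps[i] - sum(hat(i, j) * eps[j] for j in range(n)) for i in range(n)
]
max_gap = max(abs(a - b) for a, b in zip(resid, resid_from_eps))

# Var(e_i | X) = sigma^2 * (1 - h_ii): not constant across observations.
leverages = [hat(i, i) for i in range(n)]
resid_sd = [sigma * (1.0 - h_ii) ** 0.5 for h_ii in leverages]

print(f"max |resid - (I - H) eps| = {max_gap:.2e}")
print("theoretical residual SDs:", [round(s, 3) for s in resid_sd])
```

The residuals are a fixed linear transformation of the errors, which is why normal errors imply conditionally normal residuals. The printed standard deviations differ across observations because the leverages $h_{ii}$ differ, even though the errors themselves are homoscedastic.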

Normality testing of residuals is problematic anyway, as nothing in reality is exactly normal and exact normality is not required. With large samples in particular, normality tests will (correctly) reject normality, yet the regression may still be fine (though it may not be). For a discussion, see "Is normality testing essentially useless?" and "Relevance of assumption of normality, ways to check, and recommendations".
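
The large-sample point can be illustrated with a small simulation. The sketch below is only one possible setup and rests on several assumptions of my own: uniform (clearly non-normal) errors, a hand-computed Jarque-Bera statistic standing in for whichever normality test one prefers, a normal quantile of 1.96 in place of the exact t quantile, and arbitrary choices of sample size and number of replications.

```python
import random

random.seed(1)

CHI2_2DF_95 = 5.991  # 5% critical value of chi-square with 2 df


def one_replication(n, slope=2.0):
    """Fit y = 1 + slope * x + eps with uniform (non-normal) errors."""
    x = [i / n for i in range(n)]
    eps = [random.uniform(-1.0, 1.0) for _ in range(n)]
    y = [1.0 + slope * xi + ei for xi, ei in zip(x, eps)]

    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # Jarque-Bera statistic on the residuals (they have mean zero).
    m2 = sum(r ** 2 for r in resid) / n
    m3 = sum(r ** 3 for r in resid) / n
    m4 = sum(r ** 4 for r in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

    # Nominal 95% interval for the slope, normal quantile (n is large).
    se = (sum(r ** 2 for r in resid) / (n - 2) / sxx) ** 0.5
    return jb > CHI2_2DF_95, abs(b1 - slope) <= 1.96 * se


reps, n = 400, 500
results = [one_replication(n) for _ in range(reps)]
reject_rate = sum(r for r, _ in results) / reps
coverage = sum(c for _, c in results) / reps

print(f"normality rejected in {reject_rate:.0%} of samples")
print(f"slope interval coverage: {coverage:.1%}")
```

Under these assumptions the normality test should reject in nearly every sample, while the confidence interval for the slope should still cover the true value at close to its nominal rate. This is only one benign case; heavy tails or outliers can behave very differently.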

"How can we assume anything about the errors?" Model assumptions are, in the first place, tools for thinking that enable us, for example, to make quantitative statements about uncertainty. We generally need to make some model assumptions that cannot be directly verified in order to do statistical analyses. The models are idealised situations, and we choose our analyses so that they are guaranteed to work well in the idealised model world. This does not necessarily guarantee us anything for reality; however, it is a starting point for investigating it. Because model assumptions are often connected to expected visible patterns in the data (such as normal residuals), we can use the observations to judge to what extent certain model assumptions are compatible with the data, without being able to verify them. Note, however, that "model assumption X is compatible with the data" is not quite the same as "method Y based on model assumption X will work well for these data", and usually we are interested in the latter rather than the former.

You mean "residuals will be conditionally normally distributed." The fact that they have nonconstant variance, despite homoscedastic errors, implies that they are marginally non-normal. They are instead marginally distributed as a mixture of normals. — BigBendRegion, Dec 14 '23 at 12:15
(+1) Part of the problem is linguistic, over what "assumption" means. In logic and pure mathematics, the failure of an assumption invalidates an argument. In applied mathematics, which here means statistics, just about every model is an approximation of some kind, and so-called assumptions specify ideal conditions which at one end allow theorems to be deduced but at the other end are never strictly true of real data. Even in contexts where "something is normally distributed" is an ideal condition, it is usually the least important such ideal condition. — Nick Cox, Dec 14 '23 at 12:29
