How should homoscedasticity and heteroscedasticity be understood in the context of regression models?
Is there a way to check these properties in R?
1 Answer
In R when you fit a regression or glm (though GLMs are themselves typically heteroskedastic), you can check the model's variance assumption by plotting the model fit.
That is, when you fit the model you normally store it in a variable, on which you can then call summary to get the usual regression table for the coefficients. If you call plot on the same variable, you get a set of diagnostic plots.
For example, consider:
carmdl <- lm(dist ~ speed, data = cars)  # fit stopping distance on speed
plot(carmdl)                             # draw the default diagnostic plots
The third of the default plots that it produces is the scale-location plot:

[The y-axis in this plot is the square root of the absolute standardized residual. Other common choices for the y-axis in such a plot are the absolute residual and the log of the squared residual.]
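If you only want that one plot, the which argument of plot selects it, and the same display can be rebuilt by hand from the standardized residuals. This is a minimal sketch, not the only way to do it; the names fit_vals and spread are just illustrative.

```r
# Refit the model from the example so this block runs on its own
carmdl <- lm(dist ~ speed, data = cars)

# Draw only the scale-location plot (the third default diagnostic)
plot(carmdl, which = 3)

# Rebuild a similar display by hand
fit_vals <- fitted(carmdl)
spread <- sqrt(abs(rstandard(carmdl)))  # sqrt of |standardized residual|
plot(fit_vals, spread,
     xlab = "Fitted values",
     ylab = "sqrt(|standardized residuals|)")
lines(lowess(fit_vals, spread), col = "red")  # smooth trend through the points
```

Swapping the definition of spread for abs(residuals(carmdl)) or log(residuals(carmdl)^2) gives the other two y-axis choices mentioned above.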
That's a basic visual diagnostic of the spread of standardized (for model-variance) residuals against fitted values, which is suitable for seeing if there's variability related to the mean (not already accounted for by the model). If the assumption of homoskedasticity is true, we should see roughly constant spread. In this case the indication of increase with fitted values is fairly mild.
A common form of heteroskedasticity to look for would be where there's an increase in spread against fitted values. That would show as an increasing trend in the plot above. It can also be formally tested by the Breusch-Pagan test (though formal hypothesis tests of model assumptions aren't necessarily the best choice).
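As a rough sketch of what such a test does, here is the studentized (Koenker) form of the Breusch-Pagan statistic computed by hand in base R: regress the squared residuals on the predictor and refer n times the R-squared of that auxiliary regression to a chi-squared distribution. The variable names (res_sq, aux, bp_stat and so on) are only for illustration. The bptest function in the lmtest package provides a packaged version if you have that package installed.

```r
# Refit the model from the example so this block runs on its own
carmdl <- lm(dist ~ speed, data = cars)

# Auxiliary regression: squared residuals on the model's predictor
res_sq <- residuals(carmdl)^2
aux <- lm(res_sq ~ speed, data = cars)

# Studentized (Koenker) statistic: n * R^2 from the auxiliary regression
n <- nrow(cars)
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1   # number of predictors in the auxiliary fit
bp_pval <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)

print(c(statistic = bp_stat, df = bp_df, p.value = bp_pval))
```

A small p-value points toward variance that changes with the predictor, but as noted, it is usually worth reading it alongside the plot rather than instead of it.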
There are other forms of heteroskedasticity that are possible, but that's the most common one to check for. For example, if changing spread against a particular predictor was expected, that would suggest plotting the residual spread measure above against that predictor.
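A sketch of that last suggestion, using speed as the predictor of interest: the same spread measure is plotted against the predictor rather than the fitted values. With only one predictor in the model this is just a rescaled version of the scale-location plot, so the approach is more informative when the model has several predictors.

```r
# Refit the model from the example so this block runs on its own
carmdl <- lm(dist ~ speed, data = cars)

# Same spread measure as the scale-location plot
spread <- sqrt(abs(rstandard(carmdl)))

# Plot spread against a chosen predictor instead of the fitted values
plot(cars$speed, spread,
     xlab = "speed",
     ylab = "sqrt(|standardized residuals|)")
lines(lowess(cars$speed, spread), col = "red")  # smooth trend through the points
```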
arch.test package in R, which implements Engle's ARCH (autoregressive conditional heteroskedasticity) test. – Aksakal Apr 13 '14 at 16:06