6

I always think about the error term in a linear regression model as a random variable, with some distribution and a variance. So if the error terms come from this random variable, why do we say that they have a constant variance?

kanbhold
  • 865

1 Answer

3

The error term ($\epsilon_i$) is indeed a random variable. The normality assumption holds if it follows a normal distribution, $\epsilon_i \sim N(\mu, \sigma^2)$, where $\mu = 0$ in the usual regression setup. You are right when you say:

I always think about the error term in a linear regression model as a random variable, with some distribution and a variance

The assumption of constant variance (aka homoscedasticity) holds if the dispersion of the residuals is homogeneous along the range of values of $X$ or $Y$. When the spread instead changes along that range, the assumption is violated.
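Stated a bit more formally (a sketch in standard notation that the question does not spell out; $\beta_0$, $\beta_1$ and $\sigma^2$ are assumed symbols for a simple one-predictor model):

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \operatorname{Var}(\epsilon_i \mid x_i) = \sigma^2 \quad \text{for all } i.$$

Each $\epsilon_i$ is its own random variable; homoscedasticity says that all of them share the same variance $\sigma^2$. Under heteroscedasticity the variance would instead be some $\sigma_i^2$ that differs across observations.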

So if the error terms come from this random variable, why do we say that they have a constant variance?

One error observation alone does not have variance. The variance is estimated from groups (subsets) of error observations, for example the errors observed at similar values of $X$. For a better understanding, look at this picture, borrowed from @caracal's answer here.

[image: figure borrowed from @caracal's answer]

It also helps to look at some plots that illustrate the opposite of homoscedasticity (non-constant variance, i.e. heteroscedasticity).
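To make the idea of comparing variances across groups of error observations concrete, here is a minimal simulation sketch. It is not part of the original answer: the sample size, the standard deviations (`2.0` and `0.5 * x`), the three bins and the helper `variance_by_bin` are all illustrative assumptions. It draws one set of errors with the same spread everywhere and one whose spread grows with $X$, then computes the sample variance within bins of $X$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n = 20_000
x = rng.uniform(1.0, 10.0, size=n)

# Homoscedastic errors: the same standard deviation (2.0) at every x.
eps_homo = rng.normal(0.0, 2.0, size=n)
# Heteroscedastic errors: the standard deviation grows with x (0.5 * x).
eps_hetero = rng.normal(0.0, 0.5 * x)


def variance_by_bin(x, eps, n_bins=3):
    """Sample variance of eps within equal-width bins of x (hypothetical helper)."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Using only the interior edges yields bin indices 0 .. n_bins - 1.
    idx = np.digitize(x, edges[1:-1])
    return np.array([eps[idx == k].var(ddof=1) for k in range(n_bins)])


var_homo = variance_by_bin(x, eps_homo)
var_hetero = variance_by_bin(x, eps_hetero)
print("homoscedastic:  ", var_homo.round(2))
print("heteroscedastic:", var_hetero.round(2))
```

In the first case the per-bin variances should all be close to each other; in the second they should increase from the lowest bin of $X$ to the highest, which is the funnel shape usually seen in residual plots.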

Andre Silva
  • 3,080
  • This answer appears to miss the point: how can the $\epsilon_i$ be considered "a" random variable when they are heteroscedastic? – whuber Feb 16 '14 at 22:16
  • 2
    "One error observation alone does not have variance." True, but it can be thought of as having been drawn from a distribution that does have a variance. – gung - Reinstate Monica Feb 17 '14 at 14:59