
All I know is that we assume zero conditional mean (and hence zero mean) and conditional homoscedasticity (and hence homoscedasticity).

When trying to prove that $E[(\hat{\beta_1} - \beta_1)\bar{u}] = 0$, where $\beta_1$ is the slope in the linear regression model, $\hat{\beta_1}$ is its estimate and $\bar{u}$ is the average of the errors in the linear regression model (not the residuals!), I encountered:

$$E[(\hat{\beta_1} - \beta_1)\bar{u}|x]$$

$$\vdots$$

$$ = \frac{1}{n}\sum_{i=1}^{n} \frac{(x_i - \bar{x})}{SST_x} \color{red}{\left[\sum_{j=1}^{n} E[u_i u_j \mid x]\right]}$$

$$ = \frac{1}{n}\sum_{i=1}^{n} \frac{(x_i - \bar{x})}{SST_x} \color{red}{\sigma^2}$$

$$\vdots$$

$$ = 0 $$

$$\to E[(\hat{\beta_1} - \beta_1)\bar{u}] = 0$$

QED


What is the justification for that part? I tried:

For $i \ne j$, we have $E[u_i u_j \mid x] = \operatorname{Cov}[u_i, u_j \mid x] + E[u_i \mid x]\,E[u_j \mid x] \stackrel{(*)}{=} 0 + (0)(0) = 0$

For $i = j$, we have $E[u_i u_j \mid x] = E[u_i^2 \mid x] = \operatorname{Var}[u_i \mid x] = \sigma^2$

Is $(*)$ right?

If so, what is the justification?

If not, how does one show that $E[u_i u_j | x] = 0$?
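Both cases can be sanity-checked numerically. The Monte Carlo sketch below is an illustration, not a proof; the normal error distribution and $\sigma = 2$ are assumptions chosen purely for the simulation:

```python
import numpy as np

# Monte Carlo check of the two cases above (illustration only, not a proof).
# Assumption for the simulation: errors are i.i.d. normal with sigma = 2.
rng = np.random.default_rng(0)
sigma = 2.0
n_reps = 200_000

# Draw u_i and u_j independently, as random sampling implies.
u = rng.normal(0.0, sigma, size=(n_reps, 2))

cross_moment = np.mean(u[:, 0] * u[:, 1])  # estimates E[u_i u_j] for i != j
second_moment = np.mean(u[:, 0] ** 2)      # estimates E[u_i^2] = sigma^2

print(cross_moment)   # near 0
print(second_moment)  # near sigma^2 = 4
```

With independent draws the cross moment hovers around zero while the second moment recovers $\sigma^2$, matching the two cases in the attempted proof.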


From Wooldridge:


[image: excerpt of Wooldridge's assumptions; screenshot not available]


This is from $(ii)$ of this exercise:


[image: the exercise from Wooldridge; screenshot not available]

BCLC
  • The errors are assumed to be uncorrelated. – dsaxton Mar 19 '16 at 15:13
  • @dsaxton How do you know? It doesn't seem to be part of the assumptions of SLR – BCLC Mar 19 '16 at 15:14
  • Scroll down to assumptions: https://en.wikipedia.org/wiki/Linear_regression. If you didn't assume the errors were uncorrelated then how else would you conclude this? You can easily imagine a model satisfying all the other conditions where the errors are correlated. – dsaxton Mar 19 '16 at 15:19
  • @dsaxton I know it's uncorrelated, mean independent or independent. I wanted to know which specifically it was. 'Independence of errors. This assumes that the errors of the response variables are uncorrelated with each other.' Should I find it strange that the book doesn't include that? – BCLC Mar 19 '16 at 15:21
  • @dsaxton Wait so is it independent? Or just uncorrelated? – BCLC Mar 19 '16 at 15:21
  • Sometimes independence is stated as an assumption, but lack of correlation should always be. You can find it here as well: https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem. If you already knew this I'm not sure why you'd ask for help in showing $\text{E}(u_i u_j) = 0$. – dsaxton Mar 19 '16 at 15:29
  • @dsaxton Oh sorry I mean I knew it was uncorrelated or something else based on what I read before, but this particular book doesn't seem to have anything. Might there be a way of deducing uncorrelatedness from another assumption here? – BCLC Mar 19 '16 at 17:39
  • As I commented before, uncorrelatedness doesn't follow from the other assumptions, otherwise it wouldn't need to be stated as an additional assumption as it is. – dsaxton Mar 19 '16 at 23:10
  • @dsaxton So Wooldridge's assumptions are incomplete? – BCLC Mar 20 '16 at 00:05
  • @BCLC independence follows from the random sampling assumption, see answer below – Carlos Cinelli Nov 22 '17 at 23:35

1 Answer


The key thing here is that Wooldridge makes the assumption of random sampling.

Notice that since we have a random sample, $(x_i, y_i) \perp (x_j, y_j)$ for $i \neq j$, which means the individual components of the pairs are also independent, in particular $x_i \perp x_j$ (to see this, note the joint density factors as $p(x_i, y_i, x_j, y_j) = p(x_i, y_i)\,p(x_j, y_j)$ and marginalize over the $y$'s).

This further implies $y_i \perp y_j |x_i, x_j$, since:

$$ p(y_i, y_j|x_i, x_j) = \frac{p(x_i, y_i, x_j, y_j)}{p(x_i, x_j)} = \frac{p(x_i, y_i)p (x_j, y_j)}{p(x_i)p(x_j)} = p(y_i|x_i)p(y_j|x_j) $$

But conditional on $x_i$, $y_i$ is nothing more than the disturbance $u_i$ plus a constant. Hence $u_i \perp u_j \mid x$ under random sampling, and so $E[u_i u_j \mid x] = E[u_i \mid x]\,E[u_j \mid x] = 0$.
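The conclusion can also be checked by simulation. The sketch below draws many random samples, computes the OLS slope in each, and averages $(\hat{\beta}_1 - \beta_1)\bar{u}$; all data-generating choices (uniform $x$, normal errors, $\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 1.5$) are illustrative assumptions:

```python
import numpy as np

# Monte Carlo sketch of E[(beta1_hat - beta1) * u_bar] = 0 under random
# sampling. All data-generating values below are illustrative assumptions.
rng = np.random.default_rng(1)
beta0, beta1, sigma = 1.0, 2.0, 1.5
n, n_reps = 30, 50_000

x = rng.uniform(0.0, 10.0, size=(n_reps, n))  # random samples of regressors
u = rng.normal(0.0, sigma, size=(n_reps, n))  # i.i.d. errors: u_i independent of u_j
y = beta0 + beta1 * x + u

xc = x - x.mean(axis=1, keepdims=True)        # x_i - x_bar, per sample
sst_x = (xc ** 2).sum(axis=1)                 # SST_x, per sample
beta1_hat = (xc * y).sum(axis=1) / sst_x      # OLS slope, one per sample

products = (beta1_hat - beta1) * u.mean(axis=1)
print(products.mean())  # near 0
```

Because the errors are drawn independently across observations, the average of $(\hat{\beta}_1 - \beta_1)\bar{u}$ collapses toward zero as the number of replications grows.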

  • Oh, I think I get it: we just write $u_m = y_m - \beta_0 - \beta_1 x_m$ and then apply all that you said to show the penultimate equation you have? Also, are introductory econometrics students expected to get this? – BCLC Nov 25 '17 at 05:31
  • 1
    That’s one way to think about it. Regarding introductory students, not really, my experience is that people usually memorize the error terms are independent/uncorrelated with random sampling. – Carlos Cinelli Nov 25 '17 at 05:54
  • Actually, you have to be a little more precise about what "random sample" actually means. In finite populations, elements of simple random samples are not independent. So you really do have to assume independence, unless "random sampling" has previously been defined as "independent sampling." – BigBendRegion Oct 18 '20 at 17:02
  • @CarlosCinelli, could you clarify the notation in the last paragraph? Is the x without a subscript a vector, i.e., x = (x_i, x_j)? And if other covariates were included, say, z_i, would it be appropriate to write E(u_i, u_j | x_i, x_j, z_i, z_j) = 0? – hendogg87 Jun 11 '21 at 08:27