
I'm trying to solve a conundrum but can't figure out where I'm making the mistake.

Let me define the linear model as $$ y = X \beta + \epsilon, $$

where $X$ is $N \times k$. Now we define our OLS estimates as $$ \begin{align} \hat{\beta}^{OLS} &= ( X^{T} X)^{-1}X^{T}y \\ \hat{y}^{OLS} &= X\hat{\beta}^{OLS} = X( X^{T} X)^{-1}X^{T}y \\ \hat{\epsilon}^{OLS} &= y - \hat{y}^{OLS} = (I - X( X^{T} X)^{-1}X^{T})y \end{align} $$ Define $P$ as the projection matrix $$ P = X( X^{T} X)^{-1}X^{T} $$ and $M$ as the annihilator matrix $$ M = I - P. $$
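As a quick numerical sanity check of these definitions, here is a minimal NumPy sketch; the dimensions $N = 50$ and $k = 3$, the seed, and the simulated coefficients are arbitrary choices for illustration, not anything from the model above.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 50, 3                        # arbitrary illustrative dimensions
sigma = 1.0                         # error standard deviation

X = rng.standard_normal((N, k))     # design matrix
beta = rng.standard_normal(k)       # arbitrary "true" coefficients
eps = sigma * rng.standard_normal(N)
y = X @ beta + eps

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS estimate
P = X @ np.linalg.solve(X.T @ X, X.T)          # projection matrix
M = np.eye(N) - P                              # annihilator matrix

# Both matrices are symmetric and idempotent
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
assert np.allclose(M, M.T) and np.allclose(M @ M, M)
```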

I can then write $$ \hat{\epsilon}^{OLS} = M y = M (X \beta + \epsilon ) = M \epsilon, $$ since $M X = 0$.
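Continuing the sketch above, this identity is easy to confirm numerically, because $M$ annihilates the columns of $X$:

```python
assert np.allclose(M @ X, 0)        # MX = 0, hence My = M(X beta + eps) = M eps

resid = y - X @ beta_hat            # the OLS residuals
assert np.allclose(resid, M @ y)
assert np.allclose(resid, M @ eps)
```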

Assume the error terms are normally distributed:

$$ \epsilon \sim\mathcal{N} ( 0, \sigma^{2}I ) $$

This would give us $$ \epsilon^T \left( Var(\epsilon) \right)^{-1} \epsilon = \frac{1}{\sigma^2} \epsilon^T \epsilon \sim \chi^2 (N) $$
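A Monte Carlo check of this fact, continuing the sketch (the replication count is an arbitrary choice): the sample mean and variance of $\epsilon^T \epsilon / \sigma^2$ should be close to $N$ and $2N$, the first two moments of a $\chi^2(N)$ variable.

```python
reps = 20_000
E = sigma * rng.standard_normal((reps, N))   # each row is one error draw
q_full = (E ** 2).sum(axis=1) / sigma**2     # eps' eps / sigma^2 per draw
print(q_full.mean(), N)                      # ~ N
print(q_full.var(), 2 * N)                   # ~ 2N
```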

We should also get $$ \hat{\epsilon}^{OLS} \sim \mathcal{N} ( 0, \sigma^{2} M^TM ) = \mathcal{N} ( 0, \sigma^{2} M ), $$ since $M$ is symmetric and idempotent, so that $M^T M = M$.
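This can be checked empirically too: across the simulated replications above, the sample covariance of the residual vectors should approach $\sigma^2 M$.

```python
R = E @ M.T                                  # row i holds the residuals M @ E[i]
emp_cov = np.cov(R, rowvar=False)            # empirical covariance of residuals
print(np.abs(emp_cov - sigma**2 * M).max())  # small: Var(resid) ~ sigma^2 M
```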

Now comes the confusion. We know that

$$ \frac{1}{\sigma^2} \left( \hat{\epsilon}^{OLS} \right)^T\left( \hat{\epsilon}^{OLS} \right) \sim \chi^2 (N-k) $$

We can see this without going into too much detail as

$$ \begin{align} \frac{1}{\sigma^2} \left( \hat{\epsilon}^{OLS} \right)^T\left( \hat{\epsilon}^{OLS} \right) &= \frac{1}{\sigma^2} \left( \epsilon^T M^T M \epsilon \right) \\ &= \frac{1}{\sigma^2} \left( \epsilon^T M \epsilon \right) \\ &\sim \chi^2(N-k) \end{align} $$ To get to the last step, you can show that $M$ has rank $N-k$.
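Both the rank claim and the resulting degrees of freedom show up directly in the simulation sketched above:

```python
print(np.linalg.matrix_rank(M), N - k)       # rank(M) = N - k

# ||M eps||^2 = eps' M'M eps = eps' M eps, so its mean is ~ N - k, not N
q_resid = (R ** 2).sum(axis=1) / sigma**2
print(q_resid.mean(), N - k)
```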

More detail, if required, can be found in the thread "Why is RSS distributed chi square times n-p?".

However, if something is normally distributed with mean $0$, then

$$ \begin{align} \left( \hat{\epsilon}^{OLS} \right)^T \left( Var(\hat{\epsilon}^{OLS}) \right)^{-1}\left( \hat{\epsilon}^{OLS} \right) \sim \chi^2 (N) \end{align} $$

Investigating this, I get the following: $$ \begin{align} \left( \hat{\epsilon}^{OLS} \right)^T \left( Var(\hat{\epsilon}^{OLS}) \right)^{-1}\left( \hat{\epsilon}^{OLS} \right) &= \epsilon^{T}M^T\left(\sigma^2 M\right)^{-1}M\epsilon \\ &= \frac{1}{\sigma^2} \epsilon^{T}M^TM^{-1}M\epsilon \\ &= \frac{1}{\sigma^2} \epsilon^{T}M\epsilon \\ &\sim \chi^2(N-k) \end{align} $$
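For what it is worth, a numerical probe of this last display (again continuing the sketch): since $\sigma^2 M$ has rank $N - k < N$, it is singular, so the inverse written above does not literally exist. Substituting the Moore-Penrose pseudo-inverse, one common convention for singular covariance matrices, reproduces the $\chi^2(N-k)$ behaviour rather than $\chi^2(N)$.

```python
V = sigma**2 * M                             # Var of the residual vector
s = np.linalg.svd(V, compute_uv=False)
print(s[-k:])                                # ~ 0: V is singular, no inverse

# With the pseudo-inverse in place of the nonexistent inverse, the
# quadratic form r' V^+ r again has mean ~ N - k, not N.
V_pinv = np.linalg.pinv(V)
q = ((R @ V_pinv) * R).sum(axis=1)
print(q.mean(), N - k)
```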

So now I'm not sure where I'm making the mistake. I get this contradiction in my notes. Any reference material would be very helpful.

  • The conclusion is that $\hat{\epsilon}^{OLS}$ is not i.i.d. Normal with mean zero (the "independent" is important here, although you left it out of the line beginning with However.) You have shown this via a proof-by-contradiction! In a similar, but simpler, way, we can see this easily by noticing that the residuals have to sum to zero (if there's a constant term in the regression), therefore they can't be independent. – jbowman Jul 07 '23 at 19:50
