Thank you so much for all the input and for pointing out my mistakes. I made some corrections and tried to add more details, but I ended up answering my own question based on other people's posts. Please let me know if there are any remaining mistakes.
Why is a t-distribution used for hypothesis testing of a linear regression coefficient?
Proof that the coefficients in an OLS model follow a t-distribution with (n-k) degrees of freedom
I was trying to derive the variance of the OLS estimator, ${Var(\hat{\beta}) = \sigma^2(X^TX)^{-1}}$, but I was not sure what this ${\sigma^2}$ is or how to estimate it.
Given the model ${y = X\beta + u}$, we minimize the residual sum of squares ${\Sigma(y_i - x_i^T\beta)^2}$. Setting the gradient with respect to ${\beta}$ to zero gives the normal equations ${X^TX\hat{\beta} = X^Ty}$, so the unique solution for ${\hat{\beta}}$ is:
\begin{equation} {\hat{\beta} = (X^TX)^{-1}X^Ty} \end{equation}
Further, we have ${\hat{y} = X\hat{\beta} = X(X^TX)^{-1}X^Ty}$
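As a quick numerical sanity check (a minimal sketch with simulated data; all variable names here are mine), the closed-form ${\hat{\beta} = (X^TX)^{-1}X^Ty}$ agrees with a standard least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3                        # n observations, p coefficients (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)   # u ~ N(0, 1)

# Closed-form OLS: beta_hat = (X^T X)^{-1} X^T y (solve, don't invert)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer from numpy's least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(beta_hat, beta_lstsq))  # True
```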
I was reading about the derivation of the variance of ${\hat{\beta}}$:
\begin{equation} {y = X \beta + u} \end{equation}
\begin{equation} {\hat{\beta} = (X^TX)^{-1}X^T(X \beta + u) = \beta + (X^TX)^{-1}X^Tu} \end{equation}
\begin{equation} Var(\hat{\beta}) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)^T] \end{equation}
\begin{equation} Var(\hat{\beta}) = (X^TX)^{-1}X^TE[uu^T]X(X^TX)^{-1} \end{equation}
Assuming the errors are homoskedastic and uncorrelated, ${E[uu^T] = \sigma^2 I}$, so \begin{equation} Var(\hat{\beta}) = \sigma^2 (X^TX)^{-1}X^TX(X^TX)^{-1} = {\sigma}^2 (X^TX)^{-1} \end{equation}
Thus, if ${u}$ is Gaussian, \begin{equation} \frac{\hat{\beta_i} - \beta_i}{\sigma \sqrt{[(X^TX)^{-1}]_{ii}}} \sim N(0,1) \end{equation}
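To convince myself of ${Var(\hat{\beta}) = \sigma^2(X^TX)^{-1}}$, here is a small Monte Carlo sketch (simulated data, ${X}$ held fixed across replications, names are mine): the empirical covariance of ${\hat{\beta}}$ over many error draws should be close to the theoretical matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 50, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])

# Refit beta_hat on many fresh error draws, keeping X fixed
draws = np.array([
    np.linalg.solve(X.T @ X, X.T @ (X @ beta + rng.normal(scale=sigma, size=n)))
    for _ in range(20_000)
])

print(np.cov(draws.T))                      # empirical Var(beta_hat)
print(sigma**2 * np.linalg.inv(X.T @ X))    # theoretical sigma^2 (X^T X)^{-1}
```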
In practice ${\sigma^2}$ is unknown; ${s^2 = \frac{RSS}{n-p}}$ is an unbiased estimator for it, with ${p}$ the number of estimated coefficients (this is where I had confused ${s}$ and ${\sigma}$), where
\begin{equation} RSS = \sum(y_i - \hat{y_i})^2 = (n-p)s^2 \end{equation}
Scaled by ${\sigma^2}$, the RSS follows a chi-squared distribution with ${n-p}$ degrees of freedom:
\begin{equation} \frac{(n-p)s^2}{\sigma^2} \sim \chi_{n-p}^2 \end{equation}
\begin{equation} \frac{s}{\sigma} \sim \sqrt{\frac{\chi_{n-p}^2}{(n-p)}} \end{equation}
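The chi-squared claim can also be checked by simulation (again only a sketch under the same assumptions, Gaussian errors and fixed ${X}$): the simulated values of ${(n-p)s^2/\sigma^2}$ should be indistinguishable from ${\chi_{n-p}^2}$ draws.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p, sigma = 50, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])
H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix, y_hat = H y

# Simulate (n - p) s^2 / sigma^2 = RSS / sigma^2 many times
stat = []
for _ in range(20_000):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    rss = np.sum((y - H @ y) ** 2)          # RSS = (n - p) s^2
    stat.append(rss / sigma**2)

# Compare against chi^2 with n - p degrees of freedom
print(stats.kstest(stat, stats.chi2(df=n - p).cdf))  # large p-value expected
```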
A t-distribution with ${n-p}$ degrees of freedom is formed by dividing a ${N(0,1)}$ variable by an independent ${\sqrt{\chi_{n-p}^2/(n-p)}}$ variable. Dividing the standardized statistic above by ${s/\sigma}$ cancels the unknown ${\sigma}$:
\begin{equation} \frac{\hat{\beta_i} - \beta_i}{s \sqrt{[(X^TX)^{-1}]_{ii}}} \sim t_{n-p} \end{equation}
This is why a t-distribution is used for hypothesis testing a regression coefficient.
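Finally, the hand-built t statistic can be compared with what a standard library reports (a sketch assuming statsmodels is available; the simulated data and names are mine):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ beta_hat) ** 2) / (n - p)   # s^2 = RSS / (n - p)
se = np.sqrt(s2 * np.diag(XtX_inv))              # s * sqrt([(X^T X)^{-1}]_ii)
t_by_hand = beta_hat / se                        # t statistics for H0: beta_i = 0

fit = sm.OLS(y, X).fit()
print(np.allclose(t_by_hand, fit.tvalues))       # True: matches the library
```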