
I understand that, given a standard normal variable $Z$ and a $\chi^2$ random variable $V$ with $\upsilon$ degrees of freedom, \begin{align*} T := \frac{Z}{\sqrt{V/\upsilon}} \end{align*} follows a $t$-distribution with $\upsilon$ degrees of freedom (assuming $Z$ and $V$ are independent). From this one may derive (source) the ubiquitous $t$-statistic \begin{align}\tag{1} \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1} \end{align} from \begin{align*} Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim \mathcal{N}(0,1) \quad \text{and} \quad V = \frac{(n-1)S^2}{\sigma^2}\sim \chi^2_{n-1}. \end{align*}
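For what it's worth, a quick simulation confirms (1) empirically. This is just an illustrative numpy/scipy sketch; the choices of $n$, $\mu$, and $\sigma$ are arbitrary:

```python
# Sketch: check empirically that (x̄ - μ0)/(S/√n) follows t_{n-1}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, mu, sigma = 10, 100_000, 5.0, 2.0

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                    # sample standard deviation S
t_stat = (xbar - mu) / (s / np.sqrt(n))      # equation (1) with μ0 = μ (null true)

# Empirical quantiles should match those of t_{n-1}.
for q in (0.025, 0.5, 0.975):
    print(q, np.quantile(t_stat, q), stats.t.ppf(q, df=n - 1))
```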

However, when learning about hypothesis testing for regression population coefficients, I learned that given $\hat{\beta}$, an estimator of an arbitrary parameter $\beta$, one also uses the $t$-statistic \begin{align}\tag{2} t_{\hat{\beta}} = \frac{\hat{\beta} - \beta_0}{s.e.(\hat{\beta})}, \end{align} where $s.e.(\hat{\beta})$ is the standard error, i.e. an estimate of the standard deviation of the sampling distribution of $\hat{\beta}$.

Whilst this looks similar to equation (1), I don't know why it should necessarily follow a $t$-distribution. I don't believe one would derive this fact in the same manner as above: if our estimator $\hat{\beta}$ is not the sample mean, then the random variable \begin{align*} \tilde{Z} = \frac{\hat{\beta} - \beta_0}{s.e.(\hat{\beta})} \end{align*} will not necessarily be standard normal (the central limit theorem can only be invoked for the sample mean).

Therefore my question is, does someone have a source or a proof to show that equation (2) follows a $t$-distribution given that $\hat{\beta}$ is an arbitrary parameter and not necessarily the sample mean? Everything online seems to focus on the specific case of the sample mean.

  • We have posts here that prove this for ordinary least squares regression. The proofs are all simple, deriving from an assumed Normal distribution of errors. The sample mean is a special case (it's OLS with only an intercept) but serves as a model for the general case. In other kinds of regression these statistics tend not to have Student t distributions, but they often are asymptotically Normal. – whuber Jun 28 '22 at 12:47
  • @hamster230: My guess is that the point you might be missing is that the t-statistic of a coefficient in an OLS regression has a t-distribution under the assumption that the null hypothesis is TRUE. So this means that the true coefficient is $\beta_0$. In that case, one can show (as whuber pointed out, it's probably in many threads) that the numerator is normal, the squared standard error is proportional to a chi-squared variable, they are independent, etc. If the null hypothesis is not true, then the distribution would be non-central t. – mlofton Jun 28 '22 at 13:32

1 Answer


"$\hat{\beta}$ is an arbitrary parameter and not necessarily the sample mean"

The behaviour is the same as for the sample mean.

  • The sample mean is a linear sum of the observations $y_i$

    $$\bar{y} = \sum_{i=1}^n a_{i} y_i \qquad \text{with} \, a_i = \frac{1}{n}$$

  • Ordinary least squares regression likewise estimates the parameters as a linear sum of the observations $y_i$

    $$\hat{\beta}_j = \sum_{i=1}^n a_{ji}y_i \qquad \text{with} \, a_{ji} = \left[(X^TX)^{-1}X^T\right]_{ji}$$

So when these observations $y_i$ are normally distributed, then $\hat{\beta}_j$ will also be normally distributed, just like $\bar{y}$ is considered to be normally distributed. (And if the $y_i$ are not normally distributed, we may often still treat the linear sum as approximately normally distributed.)
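To make this concrete, here is a minimal simulation sketch (illustrative numpy/scipy code; the design matrix, sample size, and coefficients are arbitrary assumptions) showing that the OLS estimate is a linear map of $y$ and that, under normal errors and a true null hypothesis, the resulting $t$-statistic matches $t_{n-p}$:

```python
# Sketch: OLS coefficient t-statistic follows t_{n-p} under normal errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 20, 50_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
p = X.shape[1]
beta_true = np.array([1.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                                      # the a_{ji} from the bullets above

t_stats = np.empty(reps)
for r in range(reps):
    y = X @ beta_true + rng.normal(size=n)             # normal errors, σ = 1
    beta_hat = A @ y                                    # the estimate is linear in y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - p)                        # unbiased residual variance
    se = np.sqrt(s2 * XtX_inv[1, 1])                    # standard error of the slope
    t_stats[r] = (beta_hat[1] - beta_true[1]) / se      # null hypothesis is true

# Empirical quantiles should match those of t_{n-p}.
for q in (0.025, 0.5, 0.975):
    print(q, np.quantile(t_stats, q), stats.t.ppf(q, df=n - p))
```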


A geometrical viewpoint of OLS is that you are splitting the $n$-dimensional space of observations into two orthogonal sub-spaces, as illustrated in the image below (from the question Why are the residuals in $\mathbb{R}^{n-p}$?).

The error distribution of the observations $y_i$ can then also be considered as split into two parts: one part for the estimate(s) and one part for the residuals.

[Image: illustration for a small sample size]
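A small numerical sketch of this split (illustrative numpy code; the design matrix is an arbitrary assumption), using the hat matrix $H = X(X^TX)^{-1}X^T$ to project $y$ onto the two orthogonal sub-spaces:

```python
# Sketch: fitted values and residuals live in orthogonal sub-spaces.
import numpy as np

rng = np.random.default_rng(2)
n = 8
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T       # projection onto the column space of X
fitted = H @ y                             # part for the estimates
resid = (np.eye(n) - H) @ y                # part for the residuals

print(np.allclose(fitted + resid, y))      # the two parts reassemble y
print(np.isclose(fitted @ resid, 0.0))     # the sub-spaces are orthogonal
print(round(np.trace(H)))                  # dimension of the fitted sub-space (= p)
```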

In the case of the estimate of the mean, this sub-space will be a single diagonal line.
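A sketch of that special case (again illustrative numpy code): with $X$ a single column of ones, the projection lands on the diagonal line spanned by $(1, \dots, 1)$ and every fitted value equals $\bar{y}$:

```python
# Sketch: the mean is OLS with only an intercept; H projects onto the diagonal.
import numpy as np

rng = np.random.default_rng(3)
n = 8
y = rng.normal(size=n)

ones = np.ones((n, 1))
H = ones @ np.linalg.inv(ones.T @ ones) @ ones.T   # equals the n×n matrix of 1/n
print(np.allclose(H @ y, y.mean()))                # every fitted value is ȳ
print(np.allclose(H, np.full((n, n), 1.0 / n)))
```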

[Image: geometrical sketch]