
I understand that, given a standard normal variable $Z$ and a $\chi^2$ random variable $V$ with $\upsilon$ degrees of freedom, \begin{align*} T := \frac{Z}{\sqrt{V/\upsilon}} \end{align*} follows a $t$-distribution with $\upsilon$ degrees of freedom (assuming $Z$ and $V$ are independent). From this one may derive (source) the ubiquitous $t$-statistic \begin{align}\tag{1} \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t_{n-1} \end{align} from \begin{align*} Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim \mathcal{N}(0,1) \quad \text{and} \quad V = \frac{(n-1)S^2}{\sigma^2}\sim \chi^2_{n-1}. \end{align*}
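For what it's worth, a quick simulation confirms (1) empirically. This is just an illustrative numpy/scipy sketch; the choices of $n$, $\mu$, and $\sigma$ are arbitrary:

```python
# Sketch: check empirically that (x̄ - μ0)/(S/√n) follows t_{n-1}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps, mu, sigma = 10, 100_000, 5.0, 2.0

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                    # sample standard deviation S
t_stat = (xbar - mu) / (s / np.sqrt(n))      # equation (1) with μ0 = μ (null true)

# Empirical quantiles should match those of t_{n-1}.
for q in (0.025, 0.5, 0.975):
    print(q, np.quantile(t_stat, q), stats.t.ppf(q, df=n - 1))
```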

However, when learning about hypothesis testing for regression population coefficients, I learned that given $\hat{\beta}$, an estimator of an arbitrary parameter $\beta$, one also uses the $t$-statistic \begin{align}\tag{2} t_{\hat{\beta}} = \frac{\hat{\beta} - \beta_0}{s.e.(\hat{\beta})}, \end{align} where $s.e.(\hat{\beta})$ is the standard error, i.e. an estimate of the standard deviation of the sampling distribution of $\hat{\beta}$.

Whilst this looks similar to equation (1), I don't know why it should necessarily follow a $t$-distribution. I don't believe one would derive this fact in the same manner as above: if our estimator $\hat{\beta}$ is not the sample mean, then the random variable \begin{align*} \tilde{Z} = \frac{\hat{\beta} - \beta_0}{s.e.(\hat{\beta})} \end{align*} will not necessarily be standard normal (the central limit theorem can only be invoked for the sample mean).

Therefore my question is, does someone have a source or a proof to show that equation (2) follows a $t$-distribution given that $\hat{\beta}$ is an arbitrary parameter and not necessarily the sample mean? Everything online seems to focus on the specific case of the sample mean.

  • We have posts here that prove this for ordinary least squares regression. The proofs are all simple, deriving from an assumed Normal distribution of errors. The sample mean is a special case (it's OLS with only an intercept) but serves as a model for the general case. In other kinds of regression these statistics tend not to have Student t distributions, but they often are asymptotically Normal. – whuber Jun 28 '22 at 12:47
  • @hamster230: My guess is that the point you might be missing is that the t-statistic of a coefficient in an OLS regression has a t-distribution under the assumption that the null hypothesis is TRUE. So this means that the true coefficient is $\beta_0$. In that case, one can show (as whuber pointed out, it's probably in many threads) that the numerator is normal, the squared standard error is proportional to a chi-squared variable, they are independent, etc. If the null hypothesis is not true, then the distribution would be non-central t. – mlofton Jun 28 '22 at 13:32

1 Answer


"$\hat{\beta}$ is an arbitrary parameter and not necessarily the sample mean"

The behaviour is the same as for the sample mean.

  • The sample mean is a linear sum of the observations $y_i$

    $$\bar{y} = \sum_{i=1}^n a_{i} y_i \qquad \text{with} \, a_i = \frac{1}{n}$$

  • Ordinary least squares regression likewise estimates the parameters as a linear sum of the observations $y_i$

    $$\hat{\beta}_j = \sum_{i=1}^n a_{ji}y_i \qquad \text{with} \, a_{ji} = \left[(X^TX)^{-1}X^T\right]_{ji}$$

So when these observations $y_i$ are normally distributed, then $\hat{\beta}_j$ will also be normally distributed, just like $\bar{y}$ is considered to be normally distributed. (And if the $y_i$ are not normally distributed, we may often still treat the linear sum as approximately normally distributed.)
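To make this concrete, here is a minimal simulation sketch (illustrative numpy/scipy code; the design matrix, sample size, and coefficients are arbitrary assumptions) showing that the OLS estimate is a linear map of $y$ and that, under normal errors and a true null hypothesis, the resulting $t$-statistic matches $t_{n-p}$:

```python
# Sketch: OLS coefficient t-statistic follows t_{n-p} under normal errors.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 20, 50_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
p = X.shape[1]
beta_true = np.array([1.0, 0.5])
XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                                      # the a_{ji} from the bullets above

t_stats = np.empty(reps)
for r in range(reps):
    y = X @ beta_true + rng.normal(size=n)             # normal errors, σ = 1
    beta_hat = A @ y                                    # the estimate is linear in y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - p)                        # unbiased residual variance
    se = np.sqrt(s2 * XtX_inv[1, 1])                    # standard error of the slope
    t_stats[r] = (beta_hat[1] - beta_true[1]) / se      # null hypothesis is true

# Empirical quantiles should match those of t_{n-p}.
for q in (0.025, 0.5, 0.975):
    print(q, np.quantile(t_stats, q), stats.t.ppf(q, df=n - p))
```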


A geometrical viewpoint of OLS is that you are splitting the $n$-dimensional space of observations into two orthogonal sub-spaces, as illustrated in the image below (from the question Why are the residuals in $\mathbb{R}^{n-p}$?).

The error distribution of the observations $y_i$ can then also be considered as split into two parts: one part for the estimate(s) and one part for the residuals.

[Image: illustration for a small sample size]
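A small numerical sketch of this split (illustrative numpy code; the design matrix is an arbitrary assumption), using the hat matrix $H = X(X^TX)^{-1}X^T$ to project $y$ onto the two orthogonal sub-spaces:

```python
# Sketch: fitted values and residuals live in orthogonal sub-spaces.
import numpy as np

rng = np.random.default_rng(2)
n = 8
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T       # projection onto the column space of X
fitted = H @ y                             # part for the estimates
resid = (np.eye(n) - H) @ y                # part for the residuals

print(np.allclose(fitted + resid, y))      # the two parts reassemble y
print(np.isclose(fitted @ resid, 0.0))     # the sub-spaces are orthogonal
print(round(np.trace(H)))                  # dimension of the fitted sub-space (= p)
```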

In the case of the estimate of the mean, this sub-space will be a single diagonal line.
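A sketch of that special case (again illustrative numpy code): with $X$ a single column of ones, the projection lands on the diagonal line spanned by $(1, \dots, 1)$ and every fitted value equals $\bar{y}$:

```python
# Sketch: the mean is OLS with only an intercept; H projects onto the diagonal.
import numpy as np

rng = np.random.default_rng(3)
n = 8
y = rng.normal(size=n)

ones = np.ones((n, 1))
H = ones @ np.linalg.inv(ones.T @ ones) @ ones.T   # equals the n×n matrix of 1/n
print(np.allclose(H @ y, y.mean()))                # every fitted value is ȳ
print(np.allclose(H, np.full((n, n), 1.0 / n)))
```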

[Image: geometrical sketch]