I recently had to solve this problem, and hopefully my response can shed some light. (See also here.)
The thing that finally made it click for me was to split the problem into two parts: linear algebra vs. probability.
$
\newcommand{\mat}[1]{\boldsymbol{#1}}
\newcommand{\norm}[1]{\lVert{#1}\rVert}
\newcommand{\rank}[1]{\operatorname{rank}\left[#1\right]}
$
Linear Algebra
To begin, consider the purely deterministic least squares problem for a given $\mat{y} \in \mathbb{R}^n$ and $\mat{X} \in \mathbb{R}^{n\times{p}}$, whose solution
$$
\norm{\mat{e}}^2=\min_{\mat{\beta}}\norm{\mat{y}-\mat{X}\mat{\beta}}^2
$$
corresponds to the residual vector $\mat{e}=\mat{y}-\mat{X}\hat{\mat{\beta}}$ evaluated at the minimizer $\hat{\mat{\beta}}$.
Partitioning the inputs as
$$
\mat{y} = \begin{bmatrix}\mat{y}_{(i)}\\y_i\end{bmatrix} ,
\mat{X} = \begin{bmatrix}\mat{X}_{(i)}\\\mat{x}_i^T\end{bmatrix}
$$
and solving the related leave-one-out problem
$$
\norm{\mat{e}_{(i)}}^2=\min_{\mat{\beta}}\norm{\mat{y}_{(i)}-\mat{X}_{(i)}\mat{\beta}}^2
$$
gives a residual vector $\mat{e}_{(i)}$, corresponding to the original problem with point $i$ held out.
The key linear algebraic result is that the two answers are related by
$$\boxed{
\norm{\mat{e}}^2 = \norm{\mat{e}_{(i)}}^2 + \frac{e_i^2}{1-h_i}
}$$
where $h_i$ is the leverage of the held-out point $i$ (the $i$th diagonal entry of the hat matrix $\mat{H}$ introduced below).
(My derivation was not pretty, and is omitted here for the sake of brevity.)
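For concreteness, here is a minimal numerical check of the identity in numpy; the design matrix, response, and held-out index are all arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, i = 20, 3, 7                          # sample size, predictors, held-out row (arbitrary)
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix
e = y - H @ y                               # full-fit residuals
h_i = H[i, i]                               # leverage of point i

mask = np.arange(n) != i
beta_i, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
e_i = y[mask] - X[mask] @ beta_i            # leave-one-out residuals

print(e @ e, e_i @ e_i + e[i]**2 / (1 - h_i))   # the two sides agree to rounding error
```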
Before turning to the stochastic component, a final linear algebra note is that the residual can be expressed as
$
\mat{e}=\left(\mat{I}-\mat{H}\right)\mat{y}=\mat{G}\mat{y}
$,
where the hat matrix $\mat{H}$ is purely a function of $\mat{X}$, and $$\rank{\mat{H}}+\rank{\mat{G}}=p+m=n$$
This means that for a given $\mat{X}$, the residual $\mat{e}$ has only $m=n-p$ independent components (the residual degrees of freedom): as $\mat{y}$ varies over $\mathbb{R}^n$, $\mat{e}$ varies over the null space of $\mat{X}^T$, i.e. the orthogonal complement of the column space of $\mat{X}$. Similarly, $\mat{e}_{(i)}$ has $m-1$ independently varying components as $\mat{y}_{(i)}$ varies over $\mathbb{R}^{n-1}$, and we can vary $y_i$ over $\mathbb{R}$ independently without changing $\mat{e}_{(i)}$ at all.
(Formally, $\mat{H}=\mat{Q}_1\mat{Q}_1^T$ and $\mat{G}=\mat{Q}_2\mat{Q}_2^T$, where $\mat{Q}=\begin{bmatrix}\mat{Q}_1 & \mat{Q}_2\end{bmatrix}$ is the orthogonal factor in the full QR decomposition of $\mat{X}$ and $\mat{Q}_1$ spans the column space of $\mat{X}$.)
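As a sketch of that construction (assuming $\mat{X}$ has full column rank), $\mat{H}$ and $\mat{G}$ can be built from the full QR factorization in numpy and the rank bookkeeping confirmed directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

Q, _ = np.linalg.qr(X, mode='complete')     # full QR: Q is n x n orthogonal
Q1, Q2 = Q[:, :p], Q[:, p:]                 # bases for col(X) and its orthogonal complement
H = Q1 @ Q1.T                               # hat matrix
G = Q2 @ Q2.T                               # I - H

beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(G @ y, y - X @ beta))     # e = G y
print(np.linalg.matrix_rank(H), np.linalg.matrix_rank(G))   # p and m = n - p
```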
Probability
Now if our right-hand-side vector is actually
$$
\mat{y} = \mat{X}\mat{\tilde{\beta}} + \mat{\epsilon} ,
\mat{\epsilon} \sim \mathcal{N}_{\mat{0}, \mat{I}\sigma^2}
$$
then as $\mat{\epsilon}$ varies randomly over $\mathbb{R}^n$, the particular data $\mat{y}$ and residuals $\mat{e}$, $\mat{e}_{(i)}$ will also vary, with corresponding Gaussian probability densities.
In this case we have
$$
\begin{aligned}
\hat{\sigma}^2 m &\equiv \norm{\mat{e}}^2 &&\sim \sigma^2\chi^2_{m} \\
\hat{\sigma}^2_{(i)}\left(m-1\right) &\equiv \norm{\mat{e}_{(i)}}^2 &&\sim \sigma^2\chi^2_{m-1} \\
\rho_i^2 &\equiv \frac{e_i^2}{1-h_i} &&\sim \sigma^2\chi^2_1
\end{aligned}
$$
where $\hat{\sigma}^2_{(i)}$ and $\rho_i^2$ are independent, and the common scale factor $\sigma^2$ cancels out of the ratios below.
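These distributional claims are easy to check by Monte Carlo. The sketch below (numpy/scipy, with a purely illustrative random design and an arbitrary held-out index) runs a Kolmogorov–Smirnov test against each claimed law, and also looks at the sample correlation of $\rho_i^2$ and $\norm{\mat{e}_{(i)}}^2$ as a rough consistency check on the independence claim:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, i, sigma, reps = 25, 4, 0, 2.0, 20000      # all illustrative choices
m = n - p
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
H = X @ np.linalg.solve(X.T @ X, X.T)            # hat matrix
h_i = H[i, i]

Y = X @ beta[:, None] + sigma * rng.normal(size=(n, reps))   # one column per replication
E = Y - H @ Y                                    # full-fit residuals, all replications at once
sse = (E**2).sum(axis=0)                         # ||e||^2
rho2 = E[i]**2 / (1 - h_i)                       # rho_i^2
sse_i = sse - rho2                               # ||e_(i)||^2, via the boxed identity above

print(stats.kstest(sse / sigma**2, stats.chi2(m).cdf).pvalue)        # chi^2_m
print(stats.kstest(sse_i / sigma**2, stats.chi2(m - 1).cdf).pvalue)  # chi^2_{m-1}
print(stats.kstest(rho2 / sigma**2, stats.chi2(1).cdf).pvalue)       # chi^2_1
print(np.corrcoef(rho2, sse_i)[0, 1])            # near zero, consistent with independence
```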
Then the squared internally studentized residual corresponding to $e_i$ is
$$
r_i^2 \equiv \frac{\rho_i^2}{\hat{\sigma}^2} =
\frac{\rho_i^2}{\norm{\mat{e}}^2/m} \implies
\boxed{
\frac{r_i^2}{m} = \frac{\rho_i^2}{\norm{\mat{e}_{(i)}}^2 + \rho_i^2}
} \sim \operatorname{Beta}_{\frac{1}{2}, \frac{m-1}{2}}
$$
where the last step follows from the fact that if $U \sim \chi^2_a$ and $V \sim \chi^2_b$ are independent, then $U/\left(U+V\right) \sim \operatorname{Beta}_{\frac{a}{2}, \frac{b}{2}}$ (here with $U = \rho_i^2/\sigma^2$ and $V = \norm{\mat{e}_{(i)}}^2/\sigma^2$).
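A quick Monte Carlo sketch of the boxed Beta result, under the same illustrative setup as above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p, i, reps = 25, 4, 0, 20000
m = n - p
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
H = X @ np.linalg.solve(X.T @ X, X.T)

Y = X @ beta[:, None] + rng.normal(size=(n, reps))
E = Y - H @ Y
r2 = (E[i]**2 / (1 - H[i, i])) / ((E**2).sum(axis=0) / m)   # squared internally studentized residual

print(stats.kstest(r2 / m, stats.beta(0.5, (m - 1) / 2).cdf).pvalue)   # Beta(1/2, (m-1)/2)
```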
Similarly, for completeness, the corresponding externally studentized residual is given by
$$
t_i^2 \equiv \frac{\rho_i^2}{\hat{\sigma}^2_{(i)}} =
\frac{\rho_i^2/\left(1\right)}{\norm{\mat{e}_{(i)}}^2/\left(m-1\right)}
\sim \operatorname{F}_{1, m-1}
$$
which follows because the ratio of two independent chi-squared variables, each divided by its degrees of freedom, has an F distribution.
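And a final sketch checking the F result for $t_i^2$, again under the same illustrative setup (here $\norm{\mat{e}_{(i)}}^2$ is obtained from the boxed identity rather than by refitting):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p, i, reps = 25, 4, 0, 20000
m = n - p
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
H = X @ np.linalg.solve(X.T @ X, X.T)

Y = X @ beta[:, None] + rng.normal(size=(n, reps))
E = Y - H @ Y
rho2 = E[i]**2 / (1 - H[i, i])
sse_i = (E**2).sum(axis=0) - rho2                # ||e_(i)||^2 via the boxed identity
t2 = rho2 / (sse_i / (m - 1))                    # squared externally studentized residual

print(stats.kstest(t2, stats.f(1, m - 1).cdf).pvalue)   # F_{1, m-1}
```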