
Let $Y=X\beta+\epsilon$ be a linear model with all the usual assumptions ($X$ is fixed, $\epsilon\sim\mathcal N(0,\sigma^2I)$). Let $Y_2=X_2\beta+\epsilon_2$ for a "test set" $(X_2,Y_2)$, where $\epsilon_2\sim\mathcal N(0,\sigma^2I)$ is independent of $\epsilon$. Then let $\hat\beta=(X^TX)^{-1}X^TY$, so the prediction errors on the test set are $$X_2\hat\beta-Y_2=X_2(X^TX)^{-1}X^T\epsilon-\epsilon_2$$ which are distributed as $\mathcal N(0,\sigma^2(X_2(X^TX)^{-1}X_2^T+I))$.
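For concreteness, here is a rough simulation sketch of the setup above, mainly to sanity-check the covariance formula numerically. Everything in it is an assumption for illustration: the function name `simulate_prediction_errors`, the dimensions, $\sigma$, the random design matrices, and reading the "mean squared prediction error" as the average of the squared prediction errors over the test rows. It also compares the simulated mean of that quantity with $\operatorname{tr}(V)/n_2$, where $V$ is the covariance above and $n_2$ is the number of test rows, which follows from $E\|r\|^2=\operatorname{tr}(V)$ for $r\sim\mathcal N(0,V)$.

```python
import numpy as np

def simulate_prediction_errors(n=50, n2=5, p=3, sigma=2.0, n_sims=20000, seed=0):
    """Simulate X2 @ beta_hat - Y2 many times for fixed (hypothetical) X and X2."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))    # training design, held fixed across simulations
    X2 = rng.normal(size=(n2, p))  # test design, held fixed across simulations
    beta = rng.normal(size=p)      # arbitrary true coefficients

    # Covariance claimed in the question: sigma^2 (X2 (X^T X)^{-1} X2^T + I)
    V = sigma**2 * (X2 @ np.linalg.solve(X.T @ X, X2.T) + np.eye(n2))

    # One row per simulated data set; eps and eps2 are drawn independently
    eps = rng.normal(scale=sigma, size=(n_sims, n))
    eps2 = rng.normal(scale=sigma, size=(n_sims, n2))
    Y = X @ beta + eps
    Y2 = X2 @ beta + eps2

    # OLS fit for every simulated Y at once: solve (X^T X) B = X^T Y^T
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y.T)  # shape (p, n_sims)
    errors = (X2 @ beta_hat).T - Y2                 # shape (n_sims, n2)
    return errors, V

errors, V = simulate_prediction_errors()
emp_cov = np.cov(errors, rowvar=False)  # empirical covariance of the prediction errors
mspe = (errors**2).mean(axis=1)         # one mean squared prediction error per simulation

print("max |empirical cov - V|:", np.abs(emp_cov - V).max())
print("mean of simulated MSPE :", mspe.mean())
print("trace(V) / n2          :", np.trace(V) / V.shape[0])
```

The array `mspe` holds draws from the distribution being asked about, so a histogram of it gives a picture of the target even without a closed form.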

What's the distribution of the mean squared prediction error?

Akababa
  • 161
