In LOOCV, the model generally has to be fit $n$ times, where $n$ is the number of observations. The test error is then estimated as the average of the $n$ individual held-out errors.
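Concretely, writing $\hat{y}_{(i)}$ for the prediction at $x_i$ from the fit that leaves out the $i$th observation, the LOOCV estimate is

$CV_{(n)} = \frac{1}{n}\sum_{i=1}^n \left(y_i - \hat{y}_{(i)}\right)^2$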
Hastie and Tibshirani mention in Section 5.1.2 of their book that, for least squares linear or polynomial regression, the cost of LOOCV reduces to that of a single model fit, with the estimate given by
$CV_{(n)} = \frac{1}{n}\sum_{i=1}^n \left(\frac{y_i - \hat{y}_i}{1 - h_i}\right)^2$
where $\hat{y}_i$ is the $i$th fitted value from the single least squares fit on the full data, and $h_i$ is the leverage statistic, i.e. the $i$th diagonal element of the hat matrix $H = X(X^TX)^{-1}X^T$.
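A quick numerical check makes the claim concrete. The following is a minimal sketch (the simulated data and quadratic model are my own choices, not from the book) comparing the shortcut formula against brute-force LOOCV:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.uniform(-2, 2, n)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, n)

# Design matrix for a quadratic polynomial fit.
X = np.column_stack([np.ones(n), x, x**2])

# Single full-data fit: hat matrix H = X (X^T X)^{-1} X^T.
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ y          # fitted values from the one full fit
h = np.diag(H)         # leverage statistics h_i

# Shortcut formula: CV_(n) from a single model fit.
cv_shortcut = np.mean(((y - y_hat) / (1 - h)) ** 2)

# Brute-force LOOCV: refit n times, leaving one point out each time.
errs = []
for i in range(n):
    mask = np.arange(n) != i
    beta = np.linalg.lstsq(X[mask], y[mask], rcond=None)[0]
    errs.append((y[i] - X[i] @ beta) ** 2)
cv_brute = np.mean(errs)

print(cv_shortcut, cv_brute)  # the two estimates agree up to rounding
```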
- Could you please explain how this result can be proved?
- Even once we have this estimate of the test error, how is it used to choose a model? (See the sketch after this list for the kind of procedure I imagine.)
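To make the second question concrete, here is a minimal sketch of what I assume is meant (my own illustration, not from the book): compute $CV_{(n)}$ for each candidate model, here polynomial degrees 1 through 5, and pick the one that minimizes it:

```python
import numpy as np

def loocv_poly(x, y, degree):
    """LOOCV estimate for a degree-`degree` polynomial fit, via the shortcut formula."""
    X = np.vander(x, degree + 1)                 # columns x^degree, ..., x, 1
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    y_hat = H @ y
    h = np.diag(H)
    return np.mean(((y - y_hat) / (1 - h)) ** 2)

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, 50)

scores = {d: loocv_poly(x, y, d) for d in range(1, 6)}
best = min(scores, key=scores.get)
print(scores)
print("chosen degree:", best)  # the degree with the smallest CV_(n)
```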