When reading The Elements of Statistical Learning, Section 5.5, "Automatic Selection of the Smoothing Parameters", equation (5.26) gives a LOOCV formula for choosing the smoothing parameter $\lambda$.
The setting is fitting a smoothing spline to data $(y_i, x_i),\ i = 1, 2, \dots, N$. To select the smoothing parameter, we use the LOOCV criterion.
\begin{equation} \text{CV}(\hat{f}_{\lambda}) = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{f}_{\lambda}^{(-i)}(x_i)\right)^2 = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{y_i - \hat{f}_{\lambda}(x_i)}{1-S_{\lambda}(i,i)}\right)^2 \end{equation}
where $\hat{f}_{\lambda}^{(-i)}$ denotes the fit obtained with the observation $(y_i, x_i)$ left out, and $S_{\lambda}(i,i)$ is the $i$th diagonal element of the smoother matrix $\mathbf{S}_{\lambda}$ computed from the full data, with $\mathbf{S}_{\lambda} = (\mathbf{I} + \lambda \mathbf{K})^{-1} = \mathbf{N}(\mathbf{N}^T\mathbf{N} + \lambda \Omega)^{-1} \mathbf{N}^T$.
(The notation follows Chapter 5 of The Elements of Statistical Learning.)
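To make the claim concrete, here is a small numerical check of equation (5.26) for a generic penalized least-squares smoother of the form $\mathbf{S}_{\lambda} = \mathbf{N}(\mathbf{N}^T\mathbf{N} + \lambda \Omega)^{-1} \mathbf{N}^T$. The basis, penalty, and data below are toy stand-ins (a polynomial basis with an identity penalty rather than the book's natural-spline basis and $\Omega_N$), chosen only to illustrate that brute-force leave-one-out CV and the shortcut formula agree:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: the identity only needs a fixed basis matrix N and penalty Omega,
# so a polynomial basis with an identity penalty stands in for the
# natural-spline basis of ESL Chapter 5 (an assumption for illustration).
n, p, lam = 30, 6, 0.5
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

N = np.vander(x, p, increasing=True)         # basis matrix (rows N_i)
Omega = np.eye(p)                            # stand-in penalty matrix
A = N.T @ N + lam * Omega
S = N @ np.linalg.solve(A, N.T)              # smoother matrix S_lambda
f_hat = S @ y

# Brute-force LOOCV: refit without observation i, then predict at x_i.
loo = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    Ni, yi = N[keep], y[keep]
    beta_i = np.linalg.solve(Ni.T @ Ni + lam * Omega, Ni.T @ yi)
    loo[i] = N[i] @ beta_i

cv_brute = np.mean((y - loo) ** 2)
cv_short = np.mean(((y - f_hat) / (1 - np.diag(S))) ** 2)
print(cv_brute, cv_short)   # the two agree to numerical precision
```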
I wonder how to prove the second equality, so I searched online and found the post cited below helpful. But when I try to adapt its argument to the smoothing-spline smoother matrix, I run into difficulties and cannot carry the proof through.
I can only prove that
\begin{equation} (\mathbf{N}_{(t)}^T\mathbf{N}_{(t)} + \lambda \Omega)^{-1} \mathbf{N}_t^T = (\mathbf{N}^T\mathbf{N} + \lambda \Omega)^{-1} \mathbf{N}_t^T \left(\frac{1}{1-S_{\lambda}(t,t)}\right) \end{equation}
where the subscript $_{(t)}$ means the $t$th row is removed, while $_t$ denotes the $t$th row.
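For what it's worth, this partial identity checks out numerically in the same toy setup. The snippet below reuses N, Omega, lam, A, and S from the earlier sketch, with t an arbitrary index:

```python
# Sanity check of the partial identity, reusing names from the snippet above.
t = 3
keep = np.arange(n) != t
Nt = N[t]                                    # t-th row of N (a 1-D vector)
A_t = N[keep].T @ N[keep] + lam * Omega      # uses all rows except the t-th

lhs = np.linalg.solve(A_t, Nt)
rhs = np.linalg.solve(A, Nt) / (1 - S[t, t])
print(np.allclose(lhs, rhs))                 # expected: True
```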
Could anyone help me finish the proof, or suggest another way to prove the formula? I'd be grateful.
Helpful post:
Clarinetist, "Proof of LOOCV formula", Cross Validated, version 2015-08-04, https://stats.stackexchange.com/q/164223 (author profile: https://stats.stackexchange.com/users/46427/clarinetist).