
I'm reading https://arxiv.org/pdf/1509.09169.pdf on ridge regression. On page 8, under Example 1.3, it says:

From the figure it is obvious that for any $\lambda >0$ the ‘ridge fit’ $\widehat{Y}(\lambda)=X(X^{\top}X+\lambda I_p)^{-1}X^{\top}Y$ is not orthogonal to the observation $Y$. In other words, the ‘ridge residuals’ $Y - \widehat{Y}(\lambda)=(I_n-X(X^{\top}X+\lambda I_p)^{-1}X^{\top})Y$ are not orthogonal to the fit $\widehat{Y}(\lambda)$.

However, my linear algebra is quite rusty. Does "orthogonal" here mean that $\langle \widehat{Y}(\lambda), Y - \widehat{Y}(\lambda)\rangle=0$, or are they referring to something else?

chl
statman

1 Answer


That's right. In the case of ordinary least squares (OLS), $Y - \hat{Y} = (I-X(X'X)^{-1}X')Y$ and the fact that

$$(I-X(X'X)^{-1}X')X = 0$$

implies that $Y - \hat{Y}$ is orthogonal to any linear combination of the columns of $X$ (and hence to $\hat{Y} = X\hat{\beta}$). However, it is no longer true that

$$(I-X(X'X+ \lambda I)^{-1}X')X = 0$$

so orthogonality to $\hat{Y}(\lambda)$ cannot be asserted of ridge residuals.
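
As a sketch of the step left implicit here (the symbol $\hat{\beta}(\lambda) = (X'X+\lambda I)^{-1}X'Y$ is introduced for illustration and is not used in the text above), the product can be worked out directly:

$$(I-X(X'X+\lambda I)^{-1}X')X = X(X'X+\lambda I)^{-1}\left[(X'X+\lambda I) - X'X\right] = \lambda X(X'X+\lambda I)^{-1}$$

which vanishes only when $\lambda = 0$ (or $X = 0$). Since $X'Y = (X'X+\lambda I)\hat{\beta}(\lambda)$, we have $X'(Y - X\hat{\beta}(\lambda)) = \lambda\hat{\beta}(\lambda)$, and therefore

$$\langle \hat{Y}(\lambda), Y-\hat{Y}(\lambda)\rangle = \hat{\beta}(\lambda)'X'(Y - X\hat{\beta}(\lambda)) = \lambda\,\hat{\beta}(\lambda)'\hat{\beta}(\lambda)$$

So, under these definitions, the inner product is strictly positive whenever $\lambda > 0$ and $\hat{\beta}(\lambda) \neq 0$, and it reduces to the OLS value of zero at $\lambda = 0$.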

F. Tusell
  • Okay so it is enough to show that $Y-\widehat{Y}(\lambda)$ is not orthogonal to $X$ since $\widehat{Y}(\lambda)$ is a linear combination of $X$? – statman Oct 29 '20 at 18:02
  • Right. $\hat{Y}(\lambda)$ is indeed in the linear space spanned by the columns of $X$. – F. Tusell Oct 29 '20 at 18:52
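
For anyone who wants to check this numerically, here is a minimal sketch in plain Python. It assumes a design matrix with a single column and no intercept, so that the matrix inverse reduces to a scalar division; the data values and the function names (`dot`, `ridge_fit_1d`, `fit_residual_inner_product`) are made up for illustration and do not come from the paper.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def ridge_fit_1d(x, y, lam):
    """Ridge fit for a one-column design x (illustrative assumption):
    Y_hat(lam) = x * (x'y / (x'x + lam)). lam = 0 gives the OLS fit."""
    beta = dot(x, y) / (dot(x, x) + lam)
    return [beta * xi for xi in x]


def fit_residual_inner_product(x, y, lam):
    """Return <Y_hat(lam), Y - Y_hat(lam)>."""
    fit = ridge_fit_1d(x, y, lam)
    resid = [yi - fi for yi, fi in zip(y, fit)]
    return dot(fit, resid)


# Made-up data, purely for illustration.
x = [1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0]

ols_ip = fit_residual_inner_product(x, y, 0.0)    # lambda = 0: OLS
ridge_ip = fit_residual_inner_product(x, y, 1.0)  # lambda = 1: ridge

print("OLS   <fit, residual> =", ols_ip)
print("ridge <fit, residual> =", ridge_ip)
```

With these numbers the OLS inner product is zero up to floating-point rounding, while the ridge inner product is clearly positive, which is the non-orthogonality discussed in the comments above.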