I have been reading about ridge regression and have come across two expressions for $\hat\beta$ in textbooks. Am I correct in believing that $(X^TX+\lambda I)^{-1} X^TY$ is the same as $RSS + \sum_{j=1}^{p} \beta_j^2$? Also, if they are equivalent, what is the best way to picture this?
- Have a look here: https://stats.stackexchange.com/questions/69205/how-to-derive-the-ridge-regression-solution, especially here: https://stats.stackexchange.com/a/266986/224077 – Peter Dec 18 '22 at 11:30
- I believe $\hat{\beta}$ is $(X^TX+\lambda I)^{-1}X^T Y$ (you forgot to transpose the last $X$). – Peter Dec 18 '22 at 11:31
- I wouldn't believe any equation in which varying $\lambda$ can alter the left-hand side but $\lambda$ does not appear on the right-hand side. – whuber Dec 18 '22 at 21:22
1 Answer
The first expression is $\hat\beta$ itself; the second is the objective function that $\hat\beta$ minimises (so the second is not $\hat\beta$, although it is needed to define it). In fact it is not exactly the objective function, because the factor $\lambda$ is missing before the sum: the ridge objective is $RSS + \lambda\sum_{j=1}^{p}\beta_j^2$, so what you wrote is the objective for $\lambda=1$.
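One way to picture the relationship is to check numerically that the closed-form $\hat\beta$ sits at the bottom of the penalised objective. The following is only a minimal sketch on simulated data, assuming no intercept and no standardisation of the columns of $X$; the function names `ridge_closed_form` and `ridge_objective`, the simulated data, and the value of `lam` are all illustrative choices, not anything from a textbook.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y for beta."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def ridge_objective(beta, X, y, lam):
    """RSS + lam * sum(beta_j^2), the function that the ridge estimate minimises."""
    resid = y - X @ beta
    return resid @ resid + lam * (beta @ beta)

# Simulated data (hypothetical; any X and y with matching shapes would do).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
lam = 2.0

beta_hat = ridge_closed_form(X, y, lam)

# Gradient of the objective at beta_hat; it should vanish at the minimiser.
gradient = -2 * X.T @ (y - X @ beta_hat) + 2 * lam * beta_hat

# Objective at beta_hat versus at small random perturbations of it.
best = ridge_objective(beta_hat, X, y, lam)
perturbed = [
    ridge_objective(beta_hat + 0.01 * rng.normal(size=3), X, y, lam)
    for _ in range(100)
]

print("beta_hat:", beta_hat)
print("gradient is numerically zero:", np.allclose(gradient, 0.0, atol=1e-8))
print("every perturbation is worse:", all(v > best for v in perturbed))
```

Changing `lam` changes `beta_hat`, which is why $\lambda$ has to appear in the objective as well as in the closed form.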
Christian Hennig