
The lasso coefficients are the ones that minimize $\mathrm{RSS} + \lambda \sum_{j=1}^{p} |\beta_j|$, whereas the ridge regression coefficients are those that minimize $\mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2$. I don't see from these expressions alone why the lasso shrinks some coefficients to exactly zero, while ridge regression shrinks all the coefficients towards zero but does not set any of them exactly equal to zero. Can someone explain or provide some intuition for this, please? Thank you.
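
To make the behaviour I am asking about concrete, here is a minimal sketch of a one-coefficient special case. It assumes the RSS reduces to $(z - \beta)^2$ for a single coefficient, where $z$ is the unpenalized least-squares estimate (as I understand it, this is roughly what happens with an orthonormal design, but that is my assumption, not something given above). The function names `lasso_1d` and `ridge_1d` are hypothetical and only for illustration.

```python
def lasso_1d(z, lam):
    """Minimizer of (z - b)**2 + lam * abs(b) over b.

    Setting the derivative to zero on each side of b = 0 gives
    b = z - lam/2 (for b > 0) or b = z + lam/2 (for b < 0).
    If |z| <= lam/2, neither branch is valid and the minimum is at b = 0.
    """
    shrunk = abs(z) - lam / 2.0
    if shrunk <= 0.0:
        return 0.0
    return shrunk if z > 0 else -shrunk


def ridge_1d(z, lam):
    """Minimizer of (z - b)**2 + lam * b**2 over b.

    The derivative -2*(z - b) + 2*lam*b vanishes at b = z / (1 + lam),
    which is zero only when z itself is zero.
    """
    return z / (1.0 + lam)


if __name__ == "__main__":
    lam = 1.0  # illustrative penalty strength
    for z in (3.0, 0.4, -0.2):
        print(z, lasso_1d(z, lam), ridge_1d(z, lam))
```

With `lam = 1.0`, the two small inputs come out as exactly zero under the lasso penalty but only scaled down under the ridge penalty. What I am missing is the intuition for why this carries over to the general $p$-coefficient case.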
