In a GAM, the parameter estimates are obtained by maximizing a penalized likelihood via penalized iteratively re-weighted least squares (P-IRLS). The penalized log-likelihood is defined as:

$$\ell_p(\beta) = \ell(\beta) - \frac{1}{2}\sum_j \lambda_j \,\beta^{\mathsf{T}} S_j \beta$$

where $\ell(\beta)$ is the model log-likelihood, the $S_j$ are the penalty matrices associated with the smooth terms, and the $\lambda_j$ are smoothing parameters.
The structure is quite close to ridge regression with an L2 penalty, except for the extra $S$ matrix in the penalty term. From the book I learned that $S$ is zero for non-smooth terms. That is to say, if I specify a model without any smooth terms, the likelihood should carry no penalization at all. However, when I ran a comparative experiment, I found that the GAM results are quite close to the ridge regression results, but not completely equivalent.
Did I miss some detail of the GAM algorithm by which the non-smooth (parametric) terms are also penalized?
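One possible source of the "close but not equivalent" observation is that `glmnet` divides its loss by the sample size and standardizes predictors by default, so its `lambda` is not on the same scale as a raw ridge penalty. The sketch below (plain base R, no `mgcv`; the names `b_ols`, `b_ridge`, and the value of `lambda` are my own choices) contrasts the unpenalized fit that `gam()` reduces to when $S = 0$ with a closed-form ridge solution:

```r
set.seed(1)
n <- 200; p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(1, -2, 0.5) + rnorm(n))

# With no smooth terms the GAM penalty vanishes (S = 0), so gam()
# should reproduce plain OLS; lm() stands in for that unpenalized fit.
b_ols <- coef(lm(y ~ X))[-1]

# Closed-form ridge on centred data, intercept left unpenalized:
#   beta_ridge = (X'X + lambda * I)^{-1} X'y
Xc <- scale(X, center = TRUE, scale = FALSE)
yc <- y - mean(y)
lambda <- 2
b_ridge <- drop(solve(crossprod(Xc) + lambda * diag(p), crossprod(Xc, yc)))

# Ridge shrinks the coefficient vector relative to OLS
c(ols_norm = sum(b_ols^2), ridge_norm = sum(b_ridge^2))
```

If the GAM fit differs from OLS here, something other than the smooth-term penalty is at work; if it differs from `glmnet` only slightly, the scaling/standardization conventions are the usual suspects.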

Comments:

- Could you expand on how to do this in `gam`, e.g. how to specify a `paraPen` list that would be similar to a lambda coefficient in `glmnet`? If I understand correctly, to penalize the parametric terms we pass an identity matrix to `paraPen` like you mentioned (with nrow and ncol equal to the length of the coefficient vector?), but how would the equivalent of lambda be specified with `paraPen`? Or is that learned during fitting along with the penalties for the smooth terms? – Darren Oct 13 '22 at 01:09
- `paraPen`. – Gavin Simpson Oct 22 '22 at 10:26
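Picking up the comment's question, here is a minimal sketch of `paraPen` usage as I understand the `mgcv` documentation (the data, the `sp` value, and the object names are my own). Per penalized term, `paraPen` takes a list of penalty matrices, optionally with a fixed `sp`; when `sp` is omitted, the smoothing parameter is estimated during fitting, which would answer the "learned during fitting" part:

```r
library(mgcv)  # assumes mgcv is installed

set.seed(2)
n <- 100; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% runif(p) + rnorm(n))

# Identity penalty matrix on the coefficients of the parametric matrix
# term X, i.e. a ridge penalty lambda * b'b on those p coefficients.
# sp fixes the smoothing parameter (the analogue of glmnet's lambda);
# dropping it lets gam() estimate it along with any smooth-term penalties.
fit_fixed <- gam(y ~ X, paraPen = list(X = list(diag(p), sp = 2)))
fit_est   <- gam(y ~ X, paraPen = list(X = list(diag(p))))
fit_none  <- gam(y ~ X)  # no penalty at all

# The penalized fit should have a smaller coefficient norm
sum(coef(fit_fixed)[-1]^2) <= sum(coef(fit_none)[-1]^2)
```

Note the intercept is not part of the penalized block here, which mirrors the usual ridge convention of leaving the intercept unpenalized.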