
I am using hierarchical GAMs to examine the effect of 21 weather covariates on annual bird counts. The hierarchical part accounts for nested observations with s(site, bs="re") and for temporal autocorrelation with s(year, by=site, m=2). None of my covariates have significant smooths in the global model, so I have fit them all as parametric terms. My model looks as follows:

m = gam(nests ~ x1 + x2 + x3 + .... + x21 + s(site, bs="re") + s(year, by=site, m=2), data=data, family=poisson, method="REML")

My question is: how do I perform selection on GAMs if all covariates of interest are parametric terms? I can't treat the parametric terms as low-degree smooths and use select=TRUE (as recommended here) due to concurvity issues, and I'm apprehensive about using the paraPen argument to penalize the parametric terms because it makes the approximate p-values unreliable.

Any help is greatly appreciated!

Ryan
  • If you're doing that kind of penalization, standard p-values in general don't really mean what we think they mean; this is a whole active area of research now under the broad heading of post-selection inference. Go with paraPen (and don't worry about p-values; just look at effect sizes), go fully Bayesian (but that's hard too), or simply don't bother with selection and just fit the model and evaluate the estimated effects. If the shrinkage is working, you should be looking at the effect sizes anyway, and p-values are less important (unnecessary?) then. – Gavin Simpson Apr 13 '23 at 22:40
  • Thanks for your response. What is the argument specification to include paraPen in my model, considering that I have 21 covariates? And when you say only look at the effect sizes, you mean inspect effect sizes to search for those that are not close to - and do not cross - zero? – Ryan Apr 16 '23 at 17:13
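On the paraPen specification asked about above, here is a minimal sketch on simulated data, not a definitive recipe. It assumes the 21 covariates are collected into one standardised matrix, here called X (a name introduced for illustration), and that a single ridge penalty diag(21) shared by all 21 coefficients is acceptable. The simulated data, dat, m_pen and the true effects are all made up, and the s(year, by=site, m=2) terms from the question are left out to keep the example short; they would be added to the formula in the usual way.

```r
library(mgcv)

set.seed(1)
n <- 300   # toy number of observations (assumption)
p <- 21    # number of weather covariates, as in the question

# Simulated stand-ins for the real data
site <- factor(sample(1:10, n, replace = TRUE))
X <- scale(matrix(rnorm(n * p), n, p))   # standardise so one penalty treats all columns alike
colnames(X) <- paste0("x", 1:p)
beta <- c(0.4, -0.3, rep(0, p - 2))      # only x1 and x2 have an effect in this toy data
site_eff <- rnorm(10, sd = 0.3)
eta <- 1 + drop(X %*% beta) + site_eff[as.integer(site)]

dat <- data.frame(nests = rpois(n, exp(eta)), site = site)
dat$X <- X                               # matrix column holding all 21 covariates

# The name in paraPen must match the parametric term in the formula ("X");
# diag(p) is a ridge penalty on its p coefficients with one smoothing parameter
m_pen <- gam(nests ~ X + s(site, bs = "re"),
             paraPen = list(X = list(diag(p))),
             data = dat, family = poisson, method = "REML")

# Shrunken coefficient estimates for the 21 covariates (log scale)
b <- coef(m_pen)[grep("^X", names(coef(m_pen)))]
round(b, 3)
```

Because the columns of X are standardised, the estimates are on a common scale and can be compared with each other directly as effect sizes per standard deviation of each covariate.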
