How to properly test for an interaction effect using a generalised additive model

Question

I am running an analysis on the effect of canopy cover (OverheadCover) and the number of carcasses placed on the same location (CarcassNumber) on the proportion of carrion eaten by birds (ProportionBirdsScavenging). These data points were collected at different national parks, so I aim to include Area as a random factor. Long story short, I want to test for an interaction effect of OverheadCover and CarcassNumber. Test data and analysis below.

library(mgcv)
data_both <- data.frame(ProportionBirdsScavenging = c(0.406192519926425, 0.871428571428571, 0.452995391705069, 0.484821428571429, 0.795866569978245, 0.985714285714286, 0.208571428571429, 0.573982970671712, 0.694285714285714, 0.930204081632653, 0.0483709273182957, 0.0142857142857143, 0.661904761904762, 0.985714285714286, 0.0142857142857143, 0.0142857142857143),
                                   pointWeight = c(233, 17, 341, 128, 394, 46, 5, 302, 10, 35, 57, 39, 12, 229, 28, 116),
                                   OverheadCover = c(0.671, 0.04, 0.46, 0.65, 0.02, 0, 0.8975, 0.585, 0.6795, 0.0418, 0.5995, 0.6545, 0.02, 0, 0.92, 0.585),
                                   CarcassNumber = as.factor(c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2)),
                                   Area = c("Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst", "Hamert", "KempenBroek", "KempenBroek", "KempenBroek", "Markiezaat", "Markiezaat", "Meinweg", "Valkenhorst"))

gam_interaction <- mgcv::gam(ProportionBirdsScavenging ~ OverheadCover * CarcassNumber + s(Area, bs="re"), family=betar(link="logit"), data = data_both, weights = pointWeight)
summary(gam_interaction)

# Family: Beta regression(26.515) 
# Link function: logit 
# 
# Formula:
#   ProportionBirdsScavenging ~ OverheadCover * CarcassNumber + s(Area, 
#                                                                 bs = "re")
# 
# Parametric coefficients:
#   Estimate Std. Error z value Pr(>|z|)    
# (Intercept)                   1.20570    0.15236   7.913 2.51e-15 ***
#   OverheadCover                -1.91892    0.12480 -15.376  < 2e-16 ***
#   CarcassNumber2                1.76033    0.05319  33.093  < 2e-16 ***
#   OverheadCover:CarcassNumber2 -8.30140    0.12432 -66.774  < 2e-16 ***
#   ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Approximate significance of smooth terms:
#   edf Ref.df Chi.sq p-value    
# s(Area) 3.792      4  452.9  <2e-16 ***
#   ---
#   Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# R-sq.(adj) =  0.889   Deviance explained = 94.9%
# -REML = -2630.1  Scale est. = 1         n = 16

I've read that GAMs, because they work additively, do not test very well for interaction effects. I also found that you can add interaction-like terms to your model with the by argument, but this only changes the model and doesn't test for an interaction. The tensor product te() should work, but CarcassNumber has insufficient unique values. Can somebody advise me on how I should properly test for an interaction effect while using gam? Is the way I did it above (with the * sign) scientifically correct?

score 3 · Accepted Answer · answered Jan 31 '20 at 18:44

At the moment, all you are doing is fitting a GLMM where the conditional distribution of the response is $Y_i \sim \mathrm{Beta}(\mu_i, \Omega)$. You're just using the mgcv machinery and the equivalence of splines and simple random effects as an expedient way to fit the model you want. Note that you have no smooth effects of covariates (beyond the random effects) so this isn't an additive model.

As such, I would look at the Wald-like test in the model summary as one assessment of the interaction effect. The help for summary.gam says the following about parametric terms:

By default the p-values for parametric model terms are also based on Wald tests using the Bayesian covariance matrix for the coefficients. This is appropriate when there are "re" terms present, and is otherwise rather similar to the results using the frequentist covariance matrix (freq=TRUE), since the parametric terms themselves are usually unpenalized.

I would also fit the simpler model without the parametric interaction, keeping all other terms as in the model you show, and compare the two models with AIC(). This comparison tries to account for the selection of the smoothness parameter for the random effect term. Here it isn't really a smoothness parameter; it controls the degree of shrinkage toward the population effect. Because it was selected during fitting, and the summary() p values are computed as if it were fixed, those p values are somewhat anti-conservative.

Also note that you should probably fit with method = 'ML' if you really want to take the p values very seriously.
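The comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated rather than the 16 observations from the question, and the names sim, prop, cover, carcass, area, m_main and m_int are hypothetical. Both models are fitted with method = "ML" because they differ in their fixed effects.

```r
library(mgcv)

# Simulated stand-in data (an assumption, not the data from the question)
set.seed(1)
n <- 200
sim <- data.frame(
  cover   = runif(n),
  carcass = factor(sample(1:2, n, replace = TRUE)),
  area    = factor(sample(paste0("area", 1:8), n, replace = TRUE))
)
area_eff <- rnorm(nlevels(sim$area), sd = 0.5)  # random intercept per area
eta <- 1 - 2 * sim$cover + 0.5 * (sim$carcass == "2") -
  1.5 * sim$cover * (sim$carcass == "2") + area_eff[as.integer(sim$area)]
mu  <- plogis(eta)
phi <- 20                                       # Beta precision parameter
sim$prop <- rbeta(n, mu * phi, (1 - mu) * phi)

# Model without and with the parametric interaction, both fitted by ML
m_main <- gam(prop ~ cover + carcass + s(area, bs = "re"),
              family = betar(link = "logit"), data = sim, method = "ML")
m_int  <- gam(prop ~ cover * carcass + s(area, bs = "re"),
              family = betar(link = "logit"), data = sim, method = "ML")

# Wald-like test of the interaction coefficient from the summary
wald <- summary(m_int)$p.table["cover:carcass2", ]
print(wald)

# AIC comparison of the two models
aic_tab <- AIC(m_main, m_int)
print(aic_tab)
```

A lower AIC for m_int would favour keeping the interaction; the Wald-like p value and the AIC comparison are two views of the same question and should be read together.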

The above all assumes you want to fit linear, parametric effects of the two covariates and their interaction. If you want to estimate potentially non-linear effects of those covariates and their interaction then you can do this quite easily in a GAM with penalised splines as fitted by mgcv, using the tensor product smooth.

The idea that GAMs can't handle interactions because they are additive models is outdated and does not reflect the developments in GAM theory and software over the last few decades, if it ever did apply generally.

An additive model as envisaged by mgcv and similar software is:

$$Y_i \sim \mathcal{D}(\mu_i, \Omega)$$

where the conditional distribution $\mathcal{D}$ of the response given the data $X$ and some additional parameters $\Omega$ (including a scale parameter $\phi$) is a member of the exponential family of distributions or one of a growing number of distributions that aren't normally in the exponential family, like the Beta. What makes a GAM a GAM is that the linear predictor $\eta$ contains one or more smooth functions and one or more parametric terms:

$$g(\mu_i) = \boldsymbol{A}_i\boldsymbol{\theta} + \sum_{j = 1}^{J} f_j(x_{li}, \dots)$$

where the first part on the right ($\boldsymbol{A}_i\boldsymbol{\theta}$) contains the parametric terms (including the model intercept) and the second part is an additive sum of smooth functions of covariates. We assume very little about those smooth functions, and we can build smooth equivalents of interactions between two or more covariates, $f(x_{1i}, x_{2i})$, via tensor product smooths. Notice that a tensor product smooth is a single smooth function which represents the equivalent of both the smooth marginal (main) effects of the two covariates plus their smooth interaction. In mgcv this would be te(x1, x2).
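As a minimal sketch of such a tensor product smooth, assuming simulated data with two continuous covariates (the names dat, x1, x2, y and m_te are hypothetical), a single te() term can be fitted like this:

```r
library(mgcv)

# Simulated data in which the effect of x1 depends on x2 (an assumption)
set.seed(2)
n <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) * dat$x2 + rnorm(n, sd = 0.2)

# One tensor product smooth carries main effects and interaction together
m_te <- gam(y ~ te(x1, x2), data = dat, method = "REML")

# The summary lists a single smooth term for the whole surface
s_tab <- summary(m_te)$s.table
print(s_tab)
```

Because the main effects and the interaction live in one term here, this form fits the surface but does not by itself isolate the interaction for testing.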

For testing if we need the smooth interaction or whether we can get by with the two marginal/main smooth effects only, we can decompose this single smooth into the two marginal smooth effects plus the smooth interaction part; in mgcv this would be

s(x1) + s(x2) + ti(x1, x2)

where ti(x1, x2) plays the same role as the pure interaction part x1:x2 in a linear model:

x1 + x2 + x1:x2

As such, that decomposition gives us a way to directly test for an interaction: you can look at the approximate significance of the ti(x1, x2) term in the output from summary(), or you could use AIC (via AIC()) or a generalised likelihood ratio test (via anova()) on the following models:

s(x1) + s(x2)

and

s(x1) + s(x2) + ti(x1, x2)

noting the detail in ?AIC.gam and ?anova.gam respectively.
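A rough sketch of these three checks, again on simulated data with two continuous covariates and a Gaussian response (the names dat, x1, x2, y, m_add and m_ti are hypothetical, and the Gaussian response is an assumption made to keep the example small):

```r
library(mgcv)

# Simulated data with two smooth main effects plus an interaction (an assumption)
set.seed(3)
n <- 400
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + dat$x2^2 +
  2 * sin(pi * dat$x1) * cos(pi * dat$x2) + rnorm(n, sd = 0.3)

# Main effects only, and main effects plus the pure smooth interaction
m_add <- gam(y ~ s(x1) + s(x2), data = dat, method = "ML")
m_ti  <- gam(y ~ s(x1) + s(x2) + ti(x1, x2), data = dat, method = "ML")

# 1. Approximate significance of the ti() term
s_tab <- summary(m_ti)$s.table
print(s_tab)

# 2. AIC comparison
aic_tab <- AIC(m_add, m_ti)
print(aic_tab)

# 3. Generalised likelihood ratio test (F test, as the scale is estimated)
print(anova(m_add, m_ti, test = "F"))
```

All three results are approximate, so it is worth checking that they point in the same direction before drawing a conclusion about the interaction.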

Thank you for the elaborate answer. I really appreciate it. I have some clarification questions. If I implement the ti() term as above, R gives me an error (NA/NaN/Inf in foreign function call (arg 1)). I figured that this is probably because categorical predictors should be entered as fixed effects, right? So I tried s(OverheadCover) + CarcassNumber + ti(OverheadCover, CarcassNumber), which tells me that CarcassNumber has insufficient unique values to support 5 knots: reduce k. If I reduce the number of basis functions (k=c(10,2)), the error persists. — Peter, Feb 02 '20 at 08:00
I have the same problem with te() without the individual univariate effects. BTW, the factor smooth basis type bs="fs" also doesn't work (Model has more coefficients than data), but this might be due to incomparable units of measure? Or does this mean I have insufficient data to test for an interaction? For this test I only have 16 data points. Also, could you perhaps elaborate on why the maximum likelihood (ML) method is more appropriate here than restricted maximum likelihood (REML)? Is this generally the case with the betar family, or only if you test for an interaction? — Peter, Feb 02 '20 at 08:01
You can't have more knots than unique data points, so if you want to fit a decomposed version of the smooth interaction you would need s(OverheadCover) + s(CarcassNumber, k = 4) + ti(OverheadCover, CarcassNumber, k = c(5,4)), but even that might not work if you don't have enough unique values of CarcassNumber. In that case you can't expect to estimate a smooth of CarcassNumber. The same applies to te(). Likewise with the "fs" basis, as this will try to fit k x number of random effects coefficients, so you need to set k low enough for your data. — Gavin Simpson, Feb 03 '20 at 17:00
It's not clear to me that you even want to fit the smooth interaction. If you are happy to fit the linear version then just use the first part of my answer. The second part was there simply to address the oft-repeated comment that GAMs can't fit interactions; they can, as I explained here. If you read ?summary.gam, it mentions the ML vs REML point: p-values were found to have better coverage using ML, then REML, then other choices in the cited research. You also have to be very careful with REML if you compare models with different fixed effects, which is something I suggested. — Gavin Simpson, Feb 03 '20 at 17:03
