glmer problems in seeing all variables

Question

I am trying to run a binomial glmm to understand the relationship between various concentrations of a compound sensed by different castes of ants. We have 5 different compound concentrations (a-e), and 3 different castes (min, med, max). We tested this across 3 colonies (two, six, and seven) to see if they were interested in the compound (1 if yes, 0 if no). We listed id as the individual (there are 761).

Our data sheet is set up like this:

We used lme4 and the glmer command:

    m <- glmer(Y ~ Conc * (1 + Caste | Colony), data = d, family = binomial)
    summary(m, corr = FALSE)

When we run this, the readout we get is:

    summary(m, corr = FALSE)
    Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
     Family: binomial  ( logit )
    Formula: Y ~ Conc * (1 + Caste | Colony)
       Data: d

         AIC      BIC   logLik deviance df.resid
      1067.2   1118.1   -522.6   1045.2      750

    Scaled residuals:
        Min      1Q  Median      3Q     Max
    -1.4844 -0.9855  0.7613  1.0076  1.4478

    Random effects:
     Groups Name        Variance Std.Dev. Corr
     Colony (Intercept) 0.4529   0.6730
            Castemed    0.4387   0.6624   -1.00
            Castemin    0.6416   0.8010   -1.00  1.00
    Number of obs: 761, groups:  Colony, 3

    Fixed effects:
                Estimate Std. Error z value Pr(>|z|)
    (Intercept) -0.09754    0.16571  -0.589   0.5561
    Concb       -0.05160    0.23068  -0.224   0.8230
    Concc        0.21758    0.22971   0.947   0.3436
    Concd        0.07658    0.23022   0.333   0.7394
    Conce        0.52070    0.23263   2.238   0.0252 *

    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    optimizer (Nelder_Mead) convergence code: 0 (OK)
    boundary (singular) fit: see help('isSingular')

I do see concentration E as being significantly different than the others; however, we are not seeing any of the interactions by caste and seem to be missing concentration A. I'm not sure if I need to add additional lines of code to get these interaction comparisons. I would also like to know whether colony plays a role in worker preference for the various concentrations. We are pretty sure that caste plays a role in perceptiveness (i.e., what concentration level they select for), but I feel like we are missing a key piece of the code to get this. Any help is appreciated!

You have added these effects as random, there's only variance parameters estimated for them (listed under "random effects"). You also only asked for intercepts by caste, there isn't any interaction. A random effect always has a mean of zero. — PBulls, Mar 08 '24 at 15:40
I think the issue here is that R sets up contrasts for categorical (factor) predictors. There is surely a duplicate somewhere, but I can never find it ... — Ben Bolker, Mar 08 '24 at 21:01

score 1 · Accepted Answer · answered Mar 08 '24 at 17:23

First, in the standard treatment (dummy) coding used in R, the (Intercept) is the estimate at the reference levels of the predictors, and each regression coefficient is the difference from that estimate associated with one non-reference level of a predictor. Two implications follow. (1) You aren't "missing concentration A"; it is included in the (Intercept). (2) You do not "see concentration E as being significantly different than the others"; the coefficient for concentration E represents its difference only from the reference level (concentration A).
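To make the treatment coding concrete, here is a small sketch on simulated data. The data frame `toy` and its response probabilities are made up for illustration and are not your data. It shows that the reference level `a` gets no column of its own and that the intercept equals the log-odds at `a`.

```r
# Toy data (hypothetical): 5 concentration levels, binary response
set.seed(1)
toy <- data.frame(Conc = factor(rep(letters[1:5], each = 40)))
toy$Y <- rbinom(nrow(toy), 1, 0.5)

# Treatment (dummy) coding: level "a" is the reference and has no column
contrasts(toy$Conc)
colnames(model.matrix(~ Conc, toy))

fit <- glm(Y ~ Conc, data = toy, family = binomial)

# The intercept is the log-odds of Y = 1 at the reference level "a"
p_a <- mean(toy$Y[toy$Conc == "a"])
all.equal(unname(coef(fit)["(Intercept)"]), qlogis(p_a), tolerance = 1e-6)

# Concb..Conce are differences from "a" on the log-odds scale.
# Changing the reference level changes what each coefficient compares:
toy$Conc_e <- relevel(toy$Conc, ref = "e")
coef(glm(Y ~ Conc_e, data = toy, family = binomial))
```

With `e` as the reference, every coefficient becomes a comparison against `e` instead, which is why a single summary table never shows "E versus all the others".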

As you seem to be starting out with this type of analysis, learn to rely less on the summary reports of models and focus more on post-modeling tools that can evaluate results and predictions of models more generally. The car and emmeans packages are good examples that can work with mixed models.
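As a hedged illustration of what those tools offer, the sketch below fits a small model to simulated data (the data frame `toy` is hypothetical) and then, only if the packages are installed, uses `car::Anova()` to test each term as a whole and `emmeans::emmeans()` to get estimated probabilities and pairwise comparisons among concentrations.

```r
# Simulated stand-in for the real data (no true effects built in)
set.seed(2)
toy <- expand.grid(Conc  = factor(letters[1:5]),
                   Caste = factor(c("max", "med", "min")),
                   rep   = 1:20)
toy$Y <- rbinom(nrow(toy), 1, 0.5)

fit <- glm(Y ~ Conc * Caste, data = toy, family = binomial)

# Test each term as a whole rather than one coefficient at a time
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = 2))
}

# Estimated probabilities per concentration within each caste,
# plus all pairwise comparisons (not just comparisons with "a")
if (requireNamespace("emmeans", quietly = TRUE)) {
  emm <- emmeans::emmeans(fit, ~ Conc | Caste, type = "response")
  print(emm)
  print(pairs(emm))
}
```

The same calls should also accept a `glmer` fit, since both packages support mixed models.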

Second, to elaborate on a comment from @PBulls: if you want to evaluate the association of Caste with outcome, you need to include it as a "fixed-effect" predictor in the model. If you also include it as a random slope (as in (1 + Caste | Colony)) you will get an estimate of the variance of the Caste effects among the Colony values. That's not strictly an interaction, but it is a measure of how much Caste effects differ among the Colony values.
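A sketch of that distinction, assuming lme4 is installed and using simulated data in place of `d`. The data frame `toy`, the 12 simulated colonies, and the random-intercept-only structure are illustrative assumptions, not a recommendation for your dataset.

```r
set.seed(3)
toy <- expand.grid(Conc   = factor(letters[1:5]),
                   Caste  = factor(c("max", "med", "min")),
                   Colony = factor(1:12),  # hypothetical: more colonies than the 3 in the question
                   rep    = 1:4)
col_eff <- rnorm(12, sd = 0.5)             # simulated colony-level shifts
toy$Y <- rbinom(nrow(toy), 1, plogis(col_eff[as.integer(toy$Colony)]))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Conc and Caste are fixed effects, so their coefficients appear under
  # "Fixed effects"; Colony contributes only a random-intercept variance
  m_fixed <- lme4::glmer(Y ~ Conc * Caste + (1 | Colony),
                         data = toy, family = binomial)
  print(lme4::fixef(m_fixed))    # 15 fixed-effect coefficients
  print(lme4::VarCorr(m_fixed))  # variance of the Colony intercepts
}
```

Adding `Caste` inside the random term, as in `(1 + Caste | Colony)`, would additionally estimate how much the Caste effects vary among colonies.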

Third, if you only have 3 values of Colony you probably should not be treating it as a random effect. See discussion on this page and its links. With so few Colony values, you might get what you seem to want with a simple binomial regression model with interactions:

    glm(Y ~ Conc * Caste * Colony, data = d, family = "binomial")

That will require estimating 45 coefficients including the intercept. At first glance from your coefficient estimates it seems that you have similar numbers in the two Y outcome classes, so that shouldn't pose too much of a problem with overfitting, given your 761 observations.
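The coefficient count can be checked without any data, because each column of the design matrix corresponds to one coefficient. This is a minimal sketch that uses only the factor levels named in the question:

```r
# All combinations of the factor levels named in the question
grid <- expand.grid(Conc   = factor(letters[1:5]),
                    Caste  = factor(c("max", "med", "min")),
                    Colony = factor(c("two", "six", "seven")))

# Full three-way interaction: one column per coefficient
X_full <- model.matrix(~ Conc * Caste * Colony, data = grid)
ncol(X_full)   # 5 * 3 * 3 columns, intercept included

# For comparison: Colony entering only additively
X_add <- model.matrix(~ Conc * Caste + Colony, data = grid)
ncol(X_add)    # 5 * 3 columns plus (3 - 1) for Colony
```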

If you are mostly interested in the Conc * Caste interaction and just want to adjust for baseline differences among the levels of Colony, you could use a simpler model:

    glm(Y ~ Conc * Caste + Colony, data = d, family = "binomial")

In the model summaries you will still seem to be missing values for the reference levels of the categorical predictors, but they will be subsumed in the values of (Intercept). You will have to be careful when interpreting individual coefficients for predictors involved in interactions, as their values depend on the coding of the predictors they interact with. Those apparent problems, however, disappear when you do proper post-modeling analysis. This page is one example of what's involved in models with interactions. That question is based on a standard linear regression model, but the principles apply in general.
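One basic form of that post-modeling analysis can be sketched with base R alone. The data frame `toy` below is simulated and stands in for `d`, and its effect sizes are invented. The idea is to predict the probability of a response for every Conc by Caste combination, so that reference levels are displayed like any other level, and to test the interaction as a whole rather than coefficient by coefficient.

```r
set.seed(4)
toy <- expand.grid(Conc   = factor(letters[1:5]),
                   Caste  = factor(c("max", "med", "min")),
                   Colony = factor(c("two", "six", "seven")),
                   rep    = 1:17)
# Invented effect: response probability rises slightly with concentration
toy$Y <- rbinom(nrow(toy), 1, plogis(0.1 * as.integer(toy$Conc) - 0.3))

fit <- glm(Y ~ Conc * Caste + Colony, data = toy, family = binomial)

# Every Conc x Caste combination, evaluated at one fixed colony
newd <- expand.grid(Conc   = levels(toy$Conc),
                    Caste  = levels(toy$Caste),
                    Colony = "two")
newd$prob <- predict(fit, newdata = newd, type = "response")

# Rows = concentrations (including "a"), columns = castes
xtabs(prob ~ Conc + Caste, data = newd)

# Likelihood-ratio test of the Conc:Caste interaction as a whole
fit0 <- update(fit, . ~ . - Conc:Caste)
anova(fit0, fit, test = "Chisq")
```

Predictions here are shown for colony `two` only; under this additive-Colony model the other colonies shift all cells by a constant on the log-odds scale.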

Also, if you go to a model without random effects, you can do your work with the rms package in R, which nicely combines tools for modeling and post-model analysis.
