Significance in ZINB GLMM disappears when par1 * par2 is used instead of par1 : par2 in R

Question

I am working with a zero-inflated negative binomial model in R, using the glmmTMB package. My main goal is to investigate whether there is a significant difference in the number of times a grassland field was visited by birds (response variable) during different stages of a mowing procedure (treatment: pre-, during- or post-procedure; each stage directly follows the previous one).

A simple model with only these parameters (below) showed significant differences:

zinb_simple <- glmmTMB(
  response ~ treatment
  + (1 | field_id)
  + (1 | bird_id),
  data = df_for_analysis,
  family = nbinom2,
  ziformula = ~.,
  offset = offset,
  control = glmmTMBControl(
    rank_check = "adjust"
  )
)

I am now investigating whether any additional parameters influence the birds' response to the treatment. I am using three, and all of them display the behaviour in question: period_cut (when in the year the treatment happened: early, mid or late), field_size (size of the field, in m²) and rain (rain in mm during the three stages of treatment). Because the focus of the analysis is on the interaction between a parameter and the treatment, I first wrote a model with parameter : treatment:

zinb_full <- glmmTMB(
  response ~ treatment
  + period_cut: treatment
  + field_size: treatment
  + rain: treatment
  + (1 | field_id),
  data = df_for_analysis,
  family = nbinom2,
  ziformula = ~.
  + (1 | bird_id),
  offset = offset,
  control = glmmTMBControl(
    rank_check = "adjust"
  )
)

bird_id is now only present in the zero-inflation part because the model did not converge when bird_id was also in the conditional part. The variance of bird_id in the conditional part was very small (on the order of 1e-08).

This model gave significant results for some levels of the parameters (only the period_cut rows are shown, for brevity):

Conditional model:
                                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                                4.255e-01  3.867e-01   1.100 0.271205    
treatmentduring_treatment                  6.215e-01  4.099e-01   1.516 0.129520
treatmentpost_treatment                    1.275e+00  4.209e-01   3.031 0.002441 **
treatmentpre_treatment:period_cutearly     -9.418e-01  6.638e-01  -1.419 0.155952
treatmentduring_treatment:period_cutearly  -1.162e+00  4.043e-01  -2.873 0.004068 ** 
treatmentpost_treatment:period_cutearly    -1.184e+00  4.092e-01  -2.893 0.003813 **
treatmentpre_treatment:period_cutlate      -9.891e-01  1.161e+00  -0.852 0.394145
Zero-inflation model:
                                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 9.951e-01  4.884e-01   2.038  0.04159 *
treatmentduring_treatment                  -3.600e+00  8.301e-01  -4.336 1.45e-05 ***
treatmentpost_treatment                    -3.161e+00  6.779e-01  -4.663 3.12e-06 ***
treatmentpre_treatment:period_cutearly      5.187e-02  7.610e-01   0.068  0.94566
treatmentduring_treatment:period_cutearly   5.120e-01  1.643e+00   0.312  0.75531
treatmentpost_treatment:period_cutearly     5.526e-01  8.171e-01   0.676  0.49886
treatmentpre_treatment:period_cutlate       1.391e+00  1.004e+00   1.386  0.16580

However, all of that significance disappears if I use parameter * treatment. Only the treatment terms themselves remain significant:

Conditional model:
                                            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 4.255e-01  3.867e-01   1.100  0.27121
treatmentduring_treatment                   6.215e-01  4.099e-01   1.516  0.12952
treatmentpost_treatment                     1.275e+00  4.209e-01   3.031  0.00244 **
period_cutearly                            -9.418e-01  6.638e-01  -1.419  0.15594
period_cutlate                             -9.890e-01  1.161e+00  -0.852  0.39419  
treatmentduring_treatment:period_cutearly  -2.197e-01  7.286e-01  -0.302  0.76296
treatmentpost_treatment:period_cutearly    -2.422e-01  7.575e-01  -0.320  0.74913
treatmentduring_treatment:period_cutlate    1.486e+00  1.186e+00   1.253  0.21027   
treatmentpost_treatment:period_cutlate      1.417e+00  1.196e+00   1.185  0.23618
Zero-inflation model:
                                           Estimate Std. Error z value Pr(>|z|)
(Intercept)                                9.951e-01  4.884e-01   2.038   0.0416 *
treatmentduring_treatment                 -3.600e+00  8.301e-01  -4.336 1.45e-05 ***
treatmentpost_treatment                   -3.161e+00  6.779e-01  -4.663 3.12e-06 ***
period_cutearly                            5.181e-02  7.610e-01   0.068   0.9457
period_cutlate                             1.391e+00  1.004e+00   1.386   0.1658
treatmentduring_treatment:period_cutearly  4.604e-01  1.721e+00   0.268   0.7890
treatmentpost_treatment:period_cutearly    5.008e-01  1.042e+00   0.481   0.6308
treatmentduring_treatment:period_cutlate   6.404e-01  1.204e+00   0.532   0.5949
treatmentpost_treatment:period_cutlate    -6.233e-01  1.159e+00  -0.538   0.5908
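
As a rough check on the two tables above (my own arithmetic, not model output): the parameter : treatment estimates appear to equal the period_cut main effect plus the matching interaction term from the parameter * treatment model. That would suggest the two formulas describe the same fit with differently defined coefficients. A minimal sketch, using only the conditional-model estimates printed above; the variable names are made up for illustration:

```r
# Conditional-model estimates copied by hand from the two summaries above.
# Variable names are illustrative only.
star_period_early <- -0.9418   # period_cutearly, from the * model
star_int_during   <- -0.2197   # treatmentduring_treatment:period_cutearly, * model
star_int_post     <- -0.2422   # treatmentpost_treatment:period_cutearly, * model

colon_pre    <- -0.9418        # treatmentpre_treatment:period_cutearly, : model
colon_during <- -1.162         # treatmentduring_treatment:period_cutearly, : model
colon_post   <- -1.184         # treatmentpost_treatment:period_cutearly, : model

# Effect of "early" (versus "mid") within each treatment stage,
# rebuilt from the * model's main effect and interaction terms
rebuilt <- c(
  pre    = star_period_early,
  during = star_period_early + star_int_during,
  post   = star_period_early + star_int_post
)

# Differences from the : model estimates; each is below 0.001 in absolute value
check <- rebuilt - c(pre = colon_pre, during = colon_during, post = colon_post)
print(check)
```

If this reading is right, the rows of the : model test whether the early-versus-mid difference is zero within each stage, whereas the interaction rows of the * model test whether that difference changes between stages, so the two sets of p-values answer different questions.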

I tried some seeds but have not been able to generate a working minimal reproducible example. My dataset may already be small, so shrinking it further leads to convergence issues. The table below is just an example of what the values look like:

   bird_id  field_id response treatment period_cut   field_size  rain   offset
   <fct>    <fct>    <dbl>   <fct>      <fct>          <dbl>    <dbl>   <dbl>
 1 koemo3   KM142        0   post_tre…  mid            -19241.  -1.58     1.95
 2 sophie   AL839        0   post_tre…  mid            -18716.   5.61     1.39
 3 nume7    AL539        3   during_t…  early           20483.  -0.148    1.95
 4 sophie   AL1267       0   post_tre…  mid            -18479.  -0.0342   1.95
 5 koemo3   KM88        27   during_t…  mid            -13578.  -1.58     1.95
 6 koemo3   KM585        3   post_tre…  mid              2811.  -1.58     1.95
 7 koemo3   KM652        0   during_t…  mid             16366.  -1.58     1.95
 8 koemo3   KM184        0   post_tre…  late            -9154.   0.0230   1.61
 9 koemo4   KM492        6   post_tre…  mid              5483.  -1.58     1.95
10 wiesmet1 AL834        0   post_tre…  mid            -20968.  -1.58     1.95

Continuous variables have been centered and categorical variables have been converted to factors. The offset is already on the log scale.
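
For readers who want to mirror that preparation, here is a minimal sketch in base R. The data frame raw, its values, and the columns field_size_m2 and exposure are hypothetical stand-ins, not my real data; only the three steps (factor with pre_treatment as the reference level, mean-centering, log offset) follow what is described above.

```r
# Hypothetical raw data; values and column names are invented for illustration.
raw <- data.frame(
  treatment     = c("pre_treatment", "during_treatment", "post_treatment"),
  field_size_m2 = c(12000, 30500, 8200),
  exposure      = c(7, 4, 5)   # assumed exposure measure behind the offset
)

prepared <- transform(
  raw,
  # Factor with pre_treatment first, so it is the reference level
  treatment  = factor(treatment,
                      levels = c("pre_treatment", "during_treatment",
                                 "post_treatment")),
  # Centre the continuous predictor on its mean
  field_size = field_size_m2 - mean(field_size_m2),
  # Offset stored on the log scale, as expected by a log-link count model
  offset     = log(exposure)
)

str(prepared)
```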

Statistics is not my field. I learnt GLMMs about a month ago and my biostatistics professor doesn't work with GLMMs. His guess is that this may be happening because both treatment and period_cut are temporal variables, but that doesn't explain why field_size * treatment behaves the same way. Furthermore, period_cut refers to a period in the year and the treatment happens "within" period_cut, so in a way these temporalities are different. I cannot see a reason for a relationship between the parameters.

Since I cannot offer a reproducible example, could someone offer a theoretical explanation for the observed behaviour? Why does the significance of parameter : treatment disappear with parameter * treatment?
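
In place of a reproducible example from my own data, the sketch below uses simulated data and a plain linear model (lm, not the ZINB GLMM, which is a deliberate simplification) to compare the two formula styles. The data frame sim, its effect sizes and the level names are all invented. It only illustrates that treatment + treatment:period_cut and treatment * period_cut can give identical fitted values while reporting differently defined coefficients.

```r
set.seed(1)
n <- 300

# Simulated data; "pre" and "mid" are set as the reference levels
sim <- data.frame(
  treatment  = factor(sample(c("pre", "during", "post"), n, replace = TRUE),
                      levels = c("pre", "during", "post")),
  period_cut = factor(sample(c("mid", "early", "late"), n, replace = TRUE),
                      levels = c("mid", "early", "late"))
)

# "early" lowers the response by the same amount in every stage,
# so there is no true interaction in the simulated data
sim$y <- 1 + 0.5 * (sim$treatment == "post") -
  1 * (sim$period_cut == "early") + rnorm(n)

fit_colon <- lm(y ~ treatment + treatment:period_cut, data = sim)
fit_star  <- lm(y ~ treatment * period_cut, data = sim)

# Same fit: identical fitted values and log-likelihood
same_fit <- isTRUE(all.equal(fitted(fit_colon), fitted(fit_star)))
same_ll  <- isTRUE(all.equal(as.numeric(logLik(fit_colon)),
                             as.numeric(logLik(fit_star))))
print(c(same_fit = same_fit, same_ll = same_ll))

# Different coefficient tables: within-stage effects versus
# main effect plus differences between stages
print(round(coef(summary(fit_colon)), 3))
print(round(coef(summary(fit_star)), 3))
```

In this sketch the within-stage coefficient treatmentduring:period_cutearly from the first fit equals period_cutearly plus treatmentduring:period_cutearly from the second fit, which is the same relationship the estimates in my tables seem to follow.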

See this thread https://stats.stackexchange.com/questions/11009/including-the-interaction-but-not-the-main-effects-in-a-model and its answers and comments. If that doesn't answer your questions, say what else you need. — Peter Flom, Oct 08 '23 at 11:06
