This might be a very beginner level question. I have fitted a model with glmmTMB R package, considering the interactive effect of mean wind speed and average daily rain as well as an interactive effect of mean temperature and mean relative humidity on plant disease severity. Here is my fitted model
mod1 <-
glmmTMB(disease_severity ~ mean_rh * mean_temp + mean_ws * avg_daily_rain + (1|year),
family = beta_family(), data = dat_seasonal)
summary(mod1)
Here is the model output.
Formula: disease_severity ~ mean_rh * mean_temp + mean_ws * avg_daily_rain +
(1 | year)
Data: dat_seasonal
AIC BIC logLik deviance df.resid
-41.9 -27.4 29.9 -59.9 28
Random effects:
Conditional model:
Groups Name Variance Std.Dev.
year (Intercept) 0.7874 0.8874
Number of obs: 37, groups: year, 11
Dispersion parameter for beta family (): 7.03
Conditional model:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -336.9044 151.8930 -2.218 0.0266 *
mean_rh 4.3731 1.9177 2.280 0.0226 *
mean_temp 22.8305 9.9872 2.286 0.0223 *
mean_ws -19.8944 11.1741 -1.780 0.0750 .
avg_daily_rain -3.7751 2.5388 -1.487 0.1370
mean_rh:mean_temp -0.2854 0.1258 -2.269 0.0232 *
mean_ws:avg_daily_rain 5.0808 2.3155 2.194 0.0282 *
Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
For reporting,
I would like to report my model results. Can I say that the predictors mean_rh and mean_temp had a significant positive effect individual effect (effect on their own). The predictors mean_ws and avg_daily_rain had no significant effect on their own? There was a signficant negative interaction effect of mean_rh & mean_temp as well as a significant positive interaction effect of mean_ws and avg_daily_rain.
For plotting,
I just need to plot interactive terms of my model. Plotting individual effects alone would be misleading. For example, if I'm using ggeffects package.
Figure 1 <- plot(ggpredict(mod1, c("mean_rh", "mean_temp"))) +
theme_pubclean() +
labs(x = "Mean relative humidity", title = "")
Figure 2 <- plot(ggpredict(mod1, c("mean_ws", "avg_daily_rain"))) +
theme_pubclean() +
labs(x = "Mean wind speed", title = "")
In summary, it's okay to report individual as well as interactive effects (main & interaction effect). But it is NOT okay to plot main effects alone in my case. Is that right?
Edit:
Thank you very much for the feedback @EdM. I am editing the question based on the test you suggested, so all can benefit. I ran car::Anova(mod1) on my model and I got the following output.
Analysis of Deviance Table (Type II Wald chisquare tests)
Response: disease_severity
Chisq Df Pr(>Chisq)
mean_rh 0.0535 1 0.817057
mean_temp 0.2366 1 0.626650
mean_ws 1.1419 1 0.285252
avg_daily_rain 9.8801 1 0.001671 **
mean_rh:mean_temp 5.1498 1 0.023249 *
mean_ws:avg_daily_rain 4.8148 1 0.028217 *
Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This output suggest significant interaction effect of the interactive terms as shown in the original model above. However, it doesn't give a positive or a negative sign, but I assume there is a significant negative interaction effect of mean_rh:mean_temp and a significant positive effect of mean_ws:avg_daily_rain as shown in the original model above? So if need to report this, I'd report that there was a positive main effect of avg_daily_rain and significant interaction effect of mean_rh:mean_temp & mean_ws:avg_daily_rain. Is that right?
Regarding overfitting, I fitted the same model (but excluded random effect of year) using beta regression as shown below.
mod2 <-
betareg(disease_severity ~ mean_rh * mean_temp + mean_ws * avg_daily_rain,
data = dat_seasonal)
summary(mod2)
The output of interactive effects is the same as mod1 (above model with random effect)
Call:
betareg(formula = disease_severity ~ mean_rh * mean_temp + mean_ws * avg_daily_rain,
data = dat_seasonal)
Standardized weighted residuals 2:
Min 1Q Median 3Q Max
-1.8117 -0.9641 -0.0679 0.9726 1.8027
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -439.66169 87.71840 -5.012 5.38e-07 ***
mean_rh 5.65518 1.11886 5.054 4.32e-07 ***
mean_temp 29.66389 5.84693 5.073 3.91e-07 ***
mean_ws -23.39753 6.14241 -3.809 0.000139 ***
avg_daily_rain -4.89648 1.40426 -3.487 0.000489 ***
mean_rh:mean_temp -0.36917 0.07362 -5.015 5.31e-07 ***
mean_ws:avg_daily_rain 6.39018 1.31982 4.842 1.29e-06 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 3.7717 0.8953 4.213 2.52e-05 ***
Signif. codes: 0 '*' 0.001 '' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 24.96 on 8 Df
Pseudo R-squared: 0.5381
Number of iterations: 358 (BFGS) + 56 (Fisher scoring)
When I ran car::Anova(mod2). I got the following output
Analysis of Deviance Table (Type II tests)
Response: disease_severity
Df Chisq Pr(>Chisq)
mean_rh 1 0.5807 0.446035
mean_temp 1 3.7047 0.054259 .
mean_ws 1 9.0713 0.002597 **
avg_daily_rain 1 32.5315 1.173e-08 ***
mean_rh:mean_temp 1 25.1495 5.305e-07 ***
mean_ws:avg_daily_rain 1 23.4420 1.287e-06 ***
Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
I'm happy to use the second model if ignoring year is not an issue, and increases the reliability of results. Interaction effects are the same in both models, but wind speed became significant too after running car::anova on the second model. The study was conducted in 11 years.
car::Anova()test is something different, a "chunk" test on all coefficients involving the predictor. That' type of combined test is important for any type of model where a single predictor is involved in multiple terms (e.g., non-linear terms, interactions). With only 2-way interactions you will get the same p-values fromcar::Anova()and the model summary, as you saw, but that's not true if there are higher-order interactions. – EdM Jan 26 '23 at 01:07car::Anova(mod2), I can only talk about the significant interaction effect ofmean_rh:mean_temp&mean_ws:avg_daily_rainNOT significant main effect ofmean_ws&avg_daily_raineven there are askteric(*) opposite to them. The car::Anova() just tells me that if including that term helps to reduce the unexplained variance? The original model tells us the effect on the response variable, the anova tells us on the explained variance. Is that right? – Ahsk Jan 26 '23 at 02:42car::Anova()result is the best estimate of whether such a predictor is associated overall with outcome. On that basis,mean_ws&avg_daily_rainare significantly associated with outcome in your last model. – EdM Jan 26 '23 at 13:18year, based on your understanding of the subject matter. If you ignore it, you are assuming that all 37 observations are independent of each other, with no correlations among them based on things like year or season. Is that a reasonable assumption? If not, the more conservativemod1shows a substantial association by Anova ofavg_daily_rainand of two interactions with outcome, which might be a safer interpretation of your results. Also, I'm still worried about whether you are over-fitting your data. – EdM Jan 26 '23 at 13:35betar()family to check main effects. I hope the splines will somewhat account for theyeareffect. For interaction effect, I will only fit one interaction(e.g., mean_rh*mean_temp)at time using glmmTMB including year as a random effect instead of including two interactions as I have done in model. Thanks very much for your time. – Ahsk Jan 26 '23 at 14:07