Emmeans, pairs, contrasts, and all different p-values

Question

The main model I am fitting is as follows (both Group and Time are binary variables):

lmer.fit <- lmerTest::lmer(corrmcc ~ Group * Time + (1|ID), 
                           data = sm.dat)
summary(lmer.fit)
Fixed effects:
                  Estimate Std. Error     df t value Pr(>|t|)
(Intercept)         429.92      36.14  65.74  11.896   <2e-16 ***
GroupTTNS           -37.75      51.64  65.74  -0.731    0.467
TimeEoS             -55.68      34.92  40.36  -1.595    0.119
GroupTTNS:TimeEoS   -23.23      50.47  40.73  -0.460    0.648

As you can see, the Time variable (TimeEoS) has a p-value greater than 0.05. That is, there is no significant difference in the outcome between the two time points.

Then I applied the emmeans function with a bar (|) for a post-hoc analysis as follows:

pairs(emmeans(lmer.fit, ~ Time | Group)) 
Group = Control:
 contrast       estimate   SE   df t.ratio p.value
 Baseline - EoS     55.7 35.0 42.0   1.592  0.1189
Group = TTNS:
 contrast       estimate   SE   df t.ratio p.value
 Baseline - EoS     78.9 36.5 42.7   2.161  0.0364

Here, we can observe that for the TTNS group, the difference between the baseline and EoS is statistically significant (i.e., p-value = 0.0364, < 0.05).

However, when I applied the same function with an asterisk (*) instead, I obtained a different result:

pairs(emmeans(lmer.fit, ~ Time * Group))
 contrast                         estimate   SE   df t.ratio p.value
 **Baseline Control - EoS Control       55.7 35.0 42.0   1.592  0.3940**
 Baseline Control - Baseline TTNS     37.8 51.6 66.9   0.731  0.8843
 Baseline Control - EoS TTNS         116.7 53.4 71.2   2.184  0.1376
 EoS Control - Baseline TTNS         -17.9 52.8 69.8  -0.340  0.9864
 EoS Control - EoS TTNS               61.0 54.6 73.6   1.118  0.6797
 **Baseline TTNS - EoS TTNS             78.9 36.5 42.7   2.161  0.1508**

Here, the difference between the baseline and EoS in the TTNS group is not significant with a p-value of 0.15.

I am sure that I am missing something, but I am having trouble finding what is causing the differences. Any suggestions and comments would be greatly appreciated!

score 2 · Accepted Answer · answered Oct 26 '23 at 16:10

What you are missing is that emmeans adjusts p values for multiple comparisons, and it does so within each family of comparisons. Your first call produced 2 comparisons, but the | Group specification put each one in its own family of size 1, so nothing was adjusted. The second call produced a single family of 6 comparisons, to which pairs() applies the Tukey adjustment by default. That is why the reported, multiplicity-corrected p values are larger in the second call even though the estimate, SE, df, and t.ratio values are identical for the 2 comparisons that appear in both calls.
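As a rough check of this explanation, the two sets of p values can be approximately reproduced in base R from the t ratios and df printed in the question. This is a minimal sketch, assuming the default adjustment in the second call was the Tukey method for a family of 4 means (that is, a studentized range statistic of sqrt(2) times |t|). Because the inputs are rounded, the results should only be close to the reported values, not identical.

```r
# t ratios and df copied from the emmeans output in the question
t_ratio <- c(Control = 1.592, TTNS = 2.161)
df      <- c(Control = 42.0,  TTNS = 42.7)

# Unadjusted two-sided p values (compare with the `~ Time | Group` call)
p_unadjusted <- 2 * pt(abs(t_ratio), df, lower.tail = FALSE)

# Tukey-adjusted p values, assuming one family of 4 means
# (6 pairwise comparisons), as in the `~ Time * Group` call
p_tukey <- ptukey(sqrt(2) * abs(t_ratio), nmeans = 4, df = df,
                  lower.tail = FALSE)

# Side-by-side comparison; the test statistics are the same in both columns,
# only the reference distribution differs
round(cbind(p_unadjusted, p_tukey), 4)
```

The first column should land near 0.1189 and 0.0364, and the second near 0.3940 and 0.1508.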

See this answer by the package's author for discussion about how the package chooses the comparisons to treat as multiple comparisons for this correction. It also links to the package's vignette on comparisons.
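The sketch below illustrates how the by argument and the adjust argument determine which comparisons are treated as one family. It assumes the lmerTest and emmeans packages are installed. sim.dat and sim.fit are hypothetical stand-ins built from simulated numbers, not the asker's data, so only the structure of the output is meaningful.

```r
library(emmeans)

# Simulated stand-in for sm.dat (hypothetical values, for illustration only)
set.seed(1)
n_id <- 40
id_group <- rep(c("Control", "TTNS"), each = n_id / 2)
sim.dat <- data.frame(
  ID    = factor(rep(seq_len(n_id), times = 2)),
  Group = factor(rep(id_group, times = 2)),
  Time  = factor(rep(c("Baseline", "EoS"), each = n_id),
                 levels = c("Baseline", "EoS"))
)
id_effect <- rnorm(n_id, sd = 100)  # subject-level random intercepts
sim.dat$corrmcc <- 430 + id_effect[as.integer(sim.dat$ID)] +
  ifelse(sim.dat$Time == "EoS", -60, 0) + rnorm(nrow(sim.dat), sd = 120)

sim.fit <- lmerTest::lmer(corrmcc ~ Group * Time + (1 | ID), data = sim.dat)
emm <- emmeans(sim.fit, ~ Time * Group)

# One family of 6 comparisons: Tukey adjustment by default
all_pairs <- summary(pairs(emm))

# Two families (one per Group) of 1 comparison each: nothing to adjust
by_group <- summary(pairs(emm, by = "Group"))

# The same 2 comparisons, now pooled into one family of 2
# and given a Bonferroni adjustment
one_family <- summary(pairs(emm, by = "Group"),
                      by = NULL, adjust = "bonferroni")

all_pairs
by_group
one_family
```

Which family is appropriate depends on the hypotheses set out before looking at the data, not on which call gives the smaller p value.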

Also, consider the advice that the package's author provides in one of the FAQs:

First of all, you should not be making binary decisions of “significant” or “nonsignificant.” This is a simplistic view of P values that assigns an unmerited magical quality to the value 0.05. It is suggested that you just report the P values actually obtained, and let your readers decide how significant your findings are in the context of the scientific findings.

EdM

Thank you so much for your answer, @EdM ! When I added adjust = "none" option (i.e., pairs(emmeans(lmer.fit, ~ Time * Group), adjust = "none")), I was able to obtain the same p-values! – KLee Oct 26 '23 at 17:37
Hi @EdM . I have one follow-up question. In the main model, the p-value for 'Time' is 0.119. Based on this result (although I agree that making a binary decision based on the p-value is a matter of controversy), can we conclude that there is no significant difference in the outcome between the two time points? However, the post-hoc analysis reveals that specifically for the TTNS group, the difference between the baseline and EoS is statistically significant (i.e., an unadjusted p-value of 0.0364). Is this the correct conclusion to report? – KLee Oct 26 '23 at 17:42
Or... Is it incorrect to perform a post-hoc analysis if the interaction term is not significant? – KLee Oct 26 '23 at 18:08
@KLee it's tricky to interpret any of the individual coefficients in a model with interactions. It's possible, for example, for an overall evaluation of Time that includes the contribution from its interaction term to be "significant" even if neither its individual coefficient nor the interaction coefficient is "significant." What's most important here is whether you had a specific hypothesis about Baseline - EoS in TTNS before looking at the data. If so, then there might not be a need for multiple-comparison correction. Otherwise, you should be correcting. – EdM Oct 26 '23 at 18:36
@KLee if your primary interest is in a set of pairwise comparisons, there is no need to evaluate the overall model "significance" or that of any particular coefficients. See this answer from the author of emmeans. – EdM Oct 26 '23 at 18:44
