5

Can someone please explain what kind of test emmeans runs when using the pairs function (is it a paired t-test between levels)?

I have a linear mixed effects model with two fixed effects (A, B) and one random effect (C). I fit this with the lme function from the nlme package:

mod = lme(val~ A*B, random = ~1|C, data = df)

For each level of A, I want to perform pairwise comparisons for levels of B to check which one dominates. When I run pairs

mod.emm = emmeans(mod, ~A*B)
pairs(mod.emm, simple = "B")

Does this function take my random effect C into account, the way a paired t-test would?

Thanks!

Merry
  • 255
  • 1
  • 7

3 Answers3

11

Here is an illustration of how the model determines the right test. First, create a toy data set and run both a pooled and a paired t test:

y = c(7,6,9,3,2,6)

t.test(y[1:3], y[4:6], var.equal = TRUE)

Two Sample t-test

data: y[1:3] and y[4:6]

t = 2.4597, df = 4, p-value = 0.06972

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

-0.4722133 7.8055467

sample estimates:

mean of x mean of y

7.333333 3.666667

t.test(y[1:3], y[4:6], paired = TRUE)

Paired t-test

data: y[1:3] and y[4:6]

t = 11, df = 2, p-value = 0.008163

alternative hypothesis: true mean difference is not equal to 0

95 percent confidence interval:

2.232449 5.100884

sample estimates:

mean difference

3.666667

Now, I'm going to create two experimental setups:

foo1 = data.frame(trt = factor(rep(c("A", "B"), each = 3)), 
                  subj = factor(1:6))
foo1
##   trt subj
## 1   A    1
## 2   A    2
## 3   A    3
## 4   B    4
## 5   B    5
## 6   B    6

foo2 = foo1; foo2$subj = factor(c(1:3,1:3)) foo2

trt subj

1 A 1

2 A 2

3 A 3

4 B 1

5 B 2

6 B 3

foo1 has 6 different subjects, each observed once, while foo2 has 3 subjects, each observed twice. Now with each dataset, I will fit mixed models with exactly the same specs:

library(nlme)
mod1 = lme(y ~ trt, random = ~1|subj, data = foo1)
mod2 = lme(y ~ trt, random = ~1|subj, data = foo2)

Now let's test the treatment comparison for each model, again with exactly the same call:

library(emmeans)
pairs(emmeans(mod1, "trt"))
##  contrast estimate   SE df t.ratio p.value
##  A - B        3.67 1.49  4   2.460  0.0697
## 
## Degrees-of-freedom method: containment

pairs(emmeans(mod2, "trt"))

contrast estimate SE df t.ratio p.value

A - B 3.67 0.333 2 11.000 0.0082

Degrees-of-freedom method: containment

Created on 2023-11-06 with reprex v2.0.2

Note that the t test and P value for foo1 is exactly the same as the pooled test; and that these results for foo2 are exactly the same as the paired test. This works because the lme function knows what to do with the relationship between subjects and treatments, and that is what determines the correct test.

If you had three or more treatments in foo1 or foo2, you would obtain several tests, one for each pairwise comparison. And none of them would be the same as either the pooled or paired t tests. That's because the models use all of the data, while the pooled and paired tests use only the data for two treatments at a time. So already we are in a situation where we've moved past those simple hand calculations.

Russ Lenth
  • 20,271
6

Using the model, it figures out what linear combination $c$ of regression coefficients b are needed to estimate the quantities $\theta$ in question. Then it computes that estimate using the regression formula $\hat\theta = c^Tb$ , and its standard error using $SE(\hat\theta) = \sqrt{c^TVc}$ where $V$ is the variance-covariance matrix of $b$, typically obtained from the model via the vcov() function. Then it obtains the $t$ ratio $t = \hat\theta / SE(\hat\theta)$.

When you have fitted a mixed model especially, is not a simple hand-computation formula that you'll find in a basic textbook; that's why you need computer packages to fit the model and obtain these results.

Russ Lenth
  • 20,271
6

I think there might be some confusion coming from two types of "pairs" here.

The "pairs" you seem to be thinking about are pairs of observations within individuals. In your mixed model, pairs of observations within individuals are effectively taken into account by the random intercepts. Ideally, modeling with random intercepts in that situation improves the precision of coefficient estimates, similarly to the way a paired t-test can. See this page for discussion.

The pairs() function in emmeans evaluates pairs of estimated marginal means (EMMs), which are predictions from the model. That can be done for any model type supported by emmeans, whether the model involved random effects or not. The pairs() function just evaluates the differences between all pairs of EMMs that meet the specifications you have set. See this vignette section.

The pairs() function thus doesn't by itself consider paired observations within individuals. It depends on the mixed model to have taken such within-individual pairing into account. It then uses the fixed-effect coefficient estimates and their covariances to get the differences and associated standard errors for all pairwise comparisons of the EMMs that you specify, as the package's author explains in his answer.

EdM
  • 92,183
  • 10
  • 92
  • 267