Here is an illustration of how the model determines the right test. First, create a toy data set and run both a pooled and a paired t test:
y = c(7,6,9,3,2,6)
t.test(y[1:3], y[4:6], var.equal = TRUE)
## Two Sample t-test
## data: y[1:3] and y[4:6]
## t = 2.4597, df = 4, p-value = 0.06972
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.4722133 7.8055467
## sample estimates:
## mean of x mean of y
## 7.333333 3.666667
t.test(y[1:3], y[4:6], paired = TRUE)
## Paired t-test
## data: y[1:3] and y[4:6]
## t = 11, df = 2, p-value = 0.008163
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## 2.232449 5.100884
## sample estimates:
## mean difference
## 3.666667
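For reference, here is a minimal sketch of the hand calculations behind those two tests, using only base R. The variable names (`a`, `b`, `sp2`, and so on) are mine, chosen for illustration. Both tests share the same numerator, the difference in means; they differ only in the standard error and the degrees of freedom.

```r
# Same toy data as above
y <- c(7, 6, 9, 3, 2, 6)
a <- y[1:3]
b <- y[4:6]

# Pooled test: the two groups are treated as independent samples
sp2 <- ((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
  (length(a) + length(b) - 2)                  # pooled variance, 4 df
se_pooled <- sqrt(sp2 * (1 / length(a) + 1 / length(b)))
t_pooled <- (mean(a) - mean(b)) / se_pooled

# Paired test: a one-sample t test on the within-pair differences
d <- a - b
se_paired <- sd(d) / sqrt(length(d))           # 2 df
t_paired <- mean(d) / se_paired

c(pooled = t_pooled, paired = t_paired)
```

The two values should agree with the `t` statistics reported by `t.test()` above (2.4597 and 11).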
Now, I'm going to create two experimental setups:
foo1 = data.frame(trt = factor(rep(c("A", "B"), each = 3)),
subj = factor(1:6))
foo1
## trt subj
## 1 A 1
## 2 A 2
## 3 A 3
## 4 B 4
## 5 B 5
## 6 B 6
foo2 = foo1; foo2$subj = factor(c(1:3,1:3))
foo2
## trt subj
## 1 A 1
## 2 A 2
## 3 A 3
## 4 B 1
## 5 B 2
## 6 B 3
foo1 has 6 different subjects, each observed once, while foo2 has 3 subjects, each observed twice. Now with each dataset, I will fit mixed models with exactly the same specs:
library(nlme)
mod1 = lme(y ~ trt, random = ~1|subj, data = foo1)
mod2 = lme(y ~ trt, random = ~1|subj, data = foo2)
Now let's test the treatment comparison for each model, again with exactly the same call:
library(emmeans)
pairs(emmeans(mod1, "trt"))
## contrast estimate SE df t.ratio p.value
## A - B 3.67 1.49 4 2.460 0.0697
##
## Degrees-of-freedom method: containment
pairs(emmeans(mod2, "trt"))
## contrast estimate SE df t.ratio p.value
## A - B 3.67 0.333 2 11.000 0.0082
##
## Degrees-of-freedom method: containment
Created on 2023-11-06 with reprex v2.0.2
Note that the t ratio and P value for foo1 are exactly the same as in the pooled test, and that the results for foo2 are exactly the same as in the paired test. This works because the lme function takes into account the relationship between subjects and treatments (six subjects observed once each in foo1, three subjects observed under both treatments in foo2), and that relationship is what determines the correct test.
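One quick way to see that relationship is to cross-tabulate subjects against treatments. This is only a sketch in base R; the factors are rebuilt here so the snippet runs on its own, and the names `subj1`, `subj2`, `tab1`, and `tab2` are mine.

```r
trt <- factor(rep(c("A", "B"), each = 3))
subj1 <- factor(1:6)             # as in foo1: six subjects
subj2 <- factor(c(1:3, 1:3))     # as in foo2: three subjects

# foo1 layout: every subject appears under exactly one treatment
tab1 <- table(subj = subj1, trt = trt)
tab1

# foo2 layout: every subject appears under both treatments
tab2 <- table(subj = subj2, trt = trt)
tab2
```

In the first table each row has a single nonzero cell, so the treatment comparison is made between subjects. In the second, each row has a count in both columns, so the comparison is made within subjects.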
If you had three or more treatments in foo1 or foo2, you would obtain several tests, one for each pairwise comparison, and none of them would be the same as either the pooled or the paired t test. That's because the models use all of the data, while the pooled and paired tests use only the data for two treatments at a time. So already we have moved past those simple hand calculations.
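To make that concrete, here is a rough sketch for the independent-subjects case. Two things are assumptions on my part: the values for a third group `C` are made up purely for illustration, and I use a plain `lm()` fit instead of `lme()` to keep the example free of extra packages. The point it shows is the same pooling idea: a single model estimates the error variance from all three groups, so the A versus B comparison no longer matches the two-group pooled test.

```r
# Groups A and B are the toy data from above; group C is made up for illustration
y3 <- c(7, 6, 9, 3, 2, 6, 5, 8, 4)
trt3 <- factor(rep(c("A", "B", "C"), each = 3))

# Two-group pooled test: uses only the A and B observations (4 df)
two <- t.test(y3[trt3 == "A"], y3[trt3 == "B"], var.equal = TRUE)

# One model for all three groups: error variance pooled over A, B and C (6 df)
fit <- lm(y3 ~ trt3)
all3 <- summary(fit)$coefficients["trt3B", ]   # this row is B - A, so flip the sign

c(t_two = unname(two$statistic), df_two = unname(two$parameter))
c(t_all = unname(-all3["t value"]), df_all = df.residual(fit))
```

The estimated difference is the same in both cases, but the standard error and degrees of freedom change once the third group contributes to the variance estimate, and so the t ratio changes too.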