Interpret results of a mixed effects model analyzing a 2x2 cross-over study?

Question

I have conducted a randomized 2x2 cross-over trial of 8 participants measuring the effect of a specific diet (intervention) vs normal diet (control) on the number of sleep hours.

The study design includes one week periods: run-in, intervention/control, washout, intervention/control, and then washout. Thus in total 5 weeks. Thus:

	Period 1	Period 2
Sequence AB	Treatment A	Treatment B
Sequence BA	Treatment B	Treatment A

The data looks like this:

subject	sleep_hours	sequence	period	treatment
1	4.3	AB	runin	0
2	6.5	AB	runin	0
3	5.2	AB	runin	0
4	4.4	AB	runin	0
5	4.2	BA	runin	0
6	6.5	BA	runin	0
7	5.2	BA	runin	0
8	4.6	BA	runin	0
1	5.2	AB	1	A
2	4.1	AB	1	A
3	6.5	AB	1	A
4	4.4	AB	1	A
1	7.1	AB	2	B
2	8.7	AB	2	B
3	6.5	AB	2	B
4	7.4	AB	2	B
5	7.2	BA	1	B
6	8.3	BA	1	B
7	6.9	BA	1	B
8	7.4	BA	1	B
5	4.8	BA	2	A
6	5.1	BA	2	A
7	4.2	BA	2	A
8	6.6	BA	2	A

Here is a reproducible code of the above data:

db <- structure(list(subject = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 
2L, 3L, 4L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 5L, 6L, 7L, 8L), 
sleep_hours = c(4.3, 6.5, 5.2, 4.4, 4.2, 6.5, 5.2, 4.6, 5.2, 
4.1, 6.5, 4.4, 7.1, 8.7, 6.5, 7.4, 7.2, 8.3, 6.9, 7.4, 4.8, 
5.1, 4.2, 6.6), sequence = structure(c(1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L), levels = c("AB", "BA"), class = "factor"), 
period = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L
), levels = c("1", "2", "runin"), class = "factor"), 
treatment = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L), levels = c("0", "A", "B"
), class = "factor")), row.names = c(NA, -24L), 
class = "data.frame")

I used a Linear mixed-effects model like to see if the intervention (treatment A) affects sleep_hours compared with the control (treatment B):

install.packages(lme4)
install.packages(lmerTest)
model <- lmer(sleep_hours ~ treatment * period + sequence + 
              (1|subject), data = db)
summary(model)

This gave these results:

Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: sleep_hours ~ treatment * period + sequence + (1 | subject)
   Data: db
REML criterion at convergence: 57.7
Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.0220 -0.6944 -0.1703  0.3407  1.5199
Random effects:
 Groups   Name        Variance Std.Dev.
 subject  (Intercept) 0.0000   0.000

 Residual             0.9101   0.954

Number of obs: 24, groups:  subject, 8
Fixed effects:
                                Estimate            Std. Error                    df t value      Pr(>|t|)

(Intercept)         5.100000000000009415  0.477005998098789574 17.999999999553889296  10.692 0.00000000316 ***
treatmentA         -0.050000000000007615  0.674588351844622736 17.999999999553857322  -0.074       0.94173

treatmentB          2.325000000000000178  0.674588351844622736 17.999999999553857322   3.447       0.00288 ** 
period2            -0.000000000000009721  0.954011996197578593 17.999999999553843111   0.000       1.00000

sequenceBA          0.024999999999988885  0.674588351844622736 17.999999999553867980   0.037       0.97085

treatmentA:period2  0.100000000000017311  1.652397248444412492 17.999999999616743906   0.061       0.95241

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
            (Intr) trtmnA trtmnB perid2 sqncBA
treatmentA  -0.707

treatmentB   0.000  0.000

period2     -0.500  0.354 -0.707

sequenceBA  -0.707  0.500 -0.500  0.707

trtmntA:pr2  0.577 -0.612  0.612 -0.866 -0.816
fit warnings:
fixed-effect model matrix is rank deficient so dropping 4 columns / coefficients
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

I also tried to extract the P value and 95% confidence intervals for the effects of treatment A, inspired by the code of @BenBolker here:

model_p <- pvalue(coef(summary(as(model, 
                   "lmerModLmerTest")))[2,5])
model_ci <- paste0(ciformat1(coef(summary(as(model, 
  "lmerModLmerTest")))[2,1]), " (", ciformat1(confint(model, 
  method="Wald")[4,1]), " to ", ciformat1(confint(model, 
  method="Wald")[4,2]), ")")

Which gave these results:

	Intervention period	P
Sleep hours (hours)	-0.1 (-1.4 to 1.3)	.942

My interpretation: treatment A did not affect sleep differently than treatment B, mean difference -0.1 hours (95% CI -1.4 to 1.3 hours), P = .942.

Is my code and interpretation correct?

Thank you! I thought code related questions, including interpretation of the test statistics (which depends on the coding), were supposed to be asked at stackoverflow? Sorry if I misunderstood this. — Wandering_geek, May 27 '23 at 16:57
The statisticians will understand the coding. If you had been trained properly you should be able to understand what the output meant whether done with R, SAS, or Stata. — DWin, May 27 '23 at 17:03

score 1 · Answer 1 · answered Jun 05 '23 at 21:22

Is my code and interpretation correct?

No! Your model has 2 large issues and the parameter you have extracted means something very different then what you think.

The issues with your model:

It is singular, aka the random effect variance is 0, which means you did not adjust for variation between subjects. This is a whole thing (see here: https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#singular-models-random-effect-variances-estimated-as-zero-or-correlations-estimated-as---1 ) but, in my opinion, with only 8 subjects and 3 observations per subject it is hopeless. Just use an "unmixed" linear model and keep the limitation in mind.
You are requesting redundant and impossible fixed effects. When you put treatment * period your model also wants to fit a parameter for period = runin and treatment = A or B which it can't because at period = runin there is only treatment = 0. Also sequence is the relationship between period and treatment making it redundant. You should try to express your model minimally and understand what the terms in the formula mean. You could start here: https://stats.stackexchange.com/a/122251/341520

Now one way to express your model well is to just look at period*sequence , because if we know which both of those, we also automatically know the treatment.

library(tidyverse)
m2 <- lm(sleep_hours ~  period * sequence, data = db %>% 
           mutate(period = factor(period, levels = c("runin", "1", "2"))) # fixing baseline at runin
         )
> summary(m2)
Call:
lm(formula = sleep_hours ~ period * sequence, data = db %>% mutate(period = factor(period, 
    levels = c("runin", "1", "2"))))
Residuals:
    Min      1Q  Median      3Q     Max 
-0.9750 -0.6625 -0.1625  0.3250  1.4500
Coefficients:
                   Estimate Std. Error t value Pr(>|t|)

(Intercept)          5.1000     0.4770  10.692 3.16e-09 ***
period1             -0.0500     0.6746  -0.074  0.94173

period2              2.3250     0.6746   3.447  0.00288 ** 
sequenceBA           0.0250     0.6746   0.037  0.97085

period1:sequenceBA   2.3750     0.9540   2.489  0.02280 *

period2:sequenceBA  -2.2750     0.9540  -2.385  0.02830 *

Signif. codes:  0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.954 on 18 degrees of freedom
Multiple R-squared:  0.6379,    Adjusted R-squared:  0.5373 
F-statistic: 6.343 on 5 and 18 DF,  p-value: 0.001462
> confint(m2)
                        2.5 %     97.5 %
(Intercept)         4.0978476  6.1021524
period1            -1.4672575  1.3672575
period2             0.9077425  3.7422575
sequenceBA         -1.3922575  1.4422575
period1:sequenceBA  0.3706952  4.3793048
period2:sequenceBA -4.2793048 -0.2706952

So we can see that Treatment B slept 2.3 hours more with differing levels of statistical significance depending on if we compare it to runin or Treatment A. However the standard errors are large it's just that the effect is larger.

I love using predict (see help(predict.lm)) to understand what my model actually means so let's plot that:

predict(m2, interval = "confidence")
db2 <- cbind(db, predict(m2, interval = "confidence"))
ggplot(db2 %>% 
         select(- sleep_hours, -subject) %>% 
         distinct() %>% 
         mutate(period = ordered(period, levels = c("runin", "1", "2"))) %>%

         arrange(period), 
       aes(y = fit, x = period, group = factor(sequence))) +
  geom_path() +
  geom_errorbar(aes(ymin = lwr, ymax = upr), width = 0.2) + 
  geom_point(aes(color = factor(treatment)), size = 3)

Now let's compare our fit to the data

ggplot(db %>% 
         mutate(period = ordered(period, levels = c("runin", "1", "2"))) %>%   
         arrange(period), 
       aes(y = sleep_hours, x = period, group = factor(subject))) +
  geom_path() +
  geom_point(aes(color = factor(treatment)), size = 3)

Pretty good, Id's say.

Thank you Lukas. I would like to add that my true data has 30 individuals rather than 8. The above was simply a "reproducible example" as required by StackExchange. Could lmer() make more sense now that you know that I have 30 individuals? — Wandering_geek, Jun 06 '23 at 16:27
First of, thank you for providing the reproducible example. Not everybody does it, but it always helps a lot. Whether your model with 30 subjects will be singular is hard to predict, but if it is you would have good evidence that the subject variance is very small. Sadly you can't use predict to calculate CI for lmer-models. You could however use something less pretty like visreg(m3,xvar = "period" ,by = "sequence", type = "contrast") — Lukas Lohse, Jun 06 '23 at 22:07
Thank you Lukas. But if I would use lmer for the main result, I could use the equation lmer(sleep_hours ~ period*sequence + (1|subject), data = db)? And in that case, is it the line named "period1:sequenceBA" that gives me the P value for whether or not the treatment had an effect on sleep hours? — Wandering_geek, Jun 10 '23 at 09:14
It would maybe be easier to interpret with something like m3 <- lmer(sleep_hours ~ treatment * sequence + (1|subject), data = db %>% mutate(treatment = factor(treatment, levels = c("A", "B", "0"))) # fixing baseline at Treatment A ) — Lukas Lohse, Jun 16 '23 at 21:25
Now your comparisons are against treatment A and you can directly see differnce cused by different sequencing. — Lukas Lohse, Jun 16 '23 at 21:30

Interpret results of a mixed effects model analyzing a 2x2 cross-over study?

1 Answers1