Suppose we have a naïve single-arm pretest-posttest design without a control group. Every subject has a pretest score and a posttest score. If we want to test whether the pretest and posttest means differ significantly, and to estimate how large the difference (the effect size) is, we can carry out a paired t-test. However, there are potential confounding variables, such as gender and age. How can we control for them?
I have come up with three candidate models:
- Transform `pretest` and `posttest` into a factor `test` (with `pre` and `post` as its possible values) and a continuous variable `score` (that is, reshape the data from wide to long). Then build the mixed-effects model `score ~ test + gender + age + (1 | subject)`. If we omit the `gender` and `age` terms, the result (the slope of `test`) will be exactly the same as the result of a paired t-test (see Paired data comparison: regression or paired t-test?).
- Treat `posttest` as the response variable and `pretest` as a covariate. Build the model `posttest ~ pretest + gender + age` (see Repeated measure t test with covariates in R). If we have both a treatment group and a control group and want to measure the `treatment` effect, then `posttest ~ treatment + pretest + gender + age` is the preferred model (see Best practice when analysing pre-post treatment-control designs). However, what if we have no control group and just want to measure the pretest-posttest difference? And if we do use this model, what is the effect size? (I assume the intercept can be used to calculate the effect size when the slope of `pretest` is 1; what if it isn't 1?)
- Build the model `(posttest - pretest) ~ 1 + I(pretest - mean(pretest)) + gender + age` according to this paper. The intercept can be used to calculate the effect size.
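
To make the relationship between the second and third models concrete, here is a minimal sketch in Python with NumPy. The data are simulated, and the coefficients, sample size, and the helper `ols` are made up for illustration; none of them come from a real study. It fits both models by ordinary least squares and compares their coefficients. The mixed-effects model is not fitted here.

```python
import numpy as np

# Simulated data, for illustration only (all numbers are assumptions).
rng = np.random.default_rng(0)
n = 200
gender = rng.integers(0, 2, size=n).astype(float)  # 0/1 dummy coding
age = rng.uniform(20, 60, size=n)
pretest = 50 + 5 * gender + 0.2 * age + rng.normal(0, 8, size=n)
posttest = 10 + 0.7 * pretest + 3 * gender + 0.1 * age + rng.normal(0, 6, size=n)


def ols(columns, y):
    """Least-squares coefficients for y ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Model 2: posttest ~ pretest + gender + age
b2 = ols([pretest, gender, age], posttest)

# Model 3: (posttest - pretest) ~ 1 + I(pretest - mean(pretest)) + gender + age
diff = posttest - pretest
b3 = ols([pretest - pretest.mean(), gender, age], diff)

# Unadjusted paired t statistic, for comparison with the adjusted models.
t_paired = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))

# Model 3 intercept rebuilt from the model 2 coefficients.
intercept3_from_2 = b2[0] + (b2[1] - 1.0) * pretest.mean()

print("model 2 coefficients:", b2)
print("model 3 coefficients:", b3)
print("model 3 intercept rebuilt from model 2:", intercept3_from_2)
print("paired t statistic:", t_paired)
```

In this sketch the two fits are the same model written in two ways: the `pretest` slope in model 3 equals the model 2 slope minus 1, the `gender` and `age` coefficients are identical, and the model 3 intercept equals the model 2 intercept plus (slope − 1) × mean(`pretest`). Because `gender` and `age` are not centered here, that intercept is the expected change at the mean pretest score for `gender = 0` and `age = 0`, which is worth keeping in mind before turning it into an effect size.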
Which model is the most suitable one? And why? Thanks!