Difference-in-differences vs controlling for baseline imbalance on outcome

Question

Let's say we have data from a small 2-arm pilot trial with baseline imbalance. I wish to compare two approaches to the analysis:

1. Regression of the endline (post-treatment) outcome on a study-arm indicator, controlling for the baseline value of the outcome
2. Difference-in-differences (DiD)

Here's some toy data that mimics a real example.

# function for simulating data with fixed parameters
# https://stackoverflow.com/a/19343398/841405
  mysamp <- function(n, m, s, lwr, upr, nnorm) {
    samp <- rnorm(nnorm, m, s)
    samp <- samp[samp >= lwr & samp <= upr]
    if (length(samp) >= n) {
      return(sample(samp, n))
    }  
    stop(simpleError("Not enough values to sample from. Try increasing nnorm."))
  }
# load packages
library(tidyverse)

# simulate baseline and endline data based on real-world example
# some loss-to-followup at endline
  set.seed(42)
  base.t <- mysamp(n=32, m=3.03, s=0.46, lwr=0, upr=24, nnorm=1000)
  base.c <- mysamp(n=32, m=4.53, s=0.52, lwr=0, upr=24, nnorm=1000)
  end.t <- mysamp(n=23, m=2.21, s=0.63, lwr=0, upr=24, nnorm=1000)
  end.c <- mysamp(n=22, m=2.23, s=0.39, lwr=0, upr=24, nnorm=1000)

# create long data
  dat <- data.frame(id=c(1:32,                # control, baseline
                         1:22,                # control, 3 month
                         33:64,               # treatment, baseline
                         33:55),              # treatment, 3 month
                    trt=c(rep(0, 32+22),      # control
                          rep(1, 32+23)),     # treatment
                    end3mo=c(rep(0, 32),      # control, baseline
                             rep(1, 22),      # control, 3 month
                             rep(0, 32),      # treatment, baseline
                             rep(1, 23)),     # treatment, 3 month
                    score=c(base.c,           # control, baseline
                            end.c,            # control, 3 month
                            base.t,           # treatment, baseline
                            end.t))           # treatment, 3 month
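# reshape wide: one row per participant, with baseline and end3mo columns
# (participants lost to follow-up get NA for end3mo)
  datw <- 
    dat %>%
    mutate(end3mo = case_when(end3mo==1 ~ "end3mo",
                              TRUE ~ "baseline")) %>%
    pivot_wider(names_from = end3mo, values_from = score)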

Here's the result for the first approach, regressing endline outcome scores on an indicator of study assignment and controlling for baseline data.

# controlling for baseline 
  summary(lm(end3mo ~ trt + baseline, data=datw))

#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)   
#(Intercept)  2.69332    0.87332   3.084   0.0036 **
#trt         -0.19323    0.31856  -0.607   0.5474   
#baseline    -0.09929    0.19330  -0.514   0.6102   
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Here's the result of the second approach, difference-in-differences:

# difference in differences estimate (interaction)
  summary(lm(score ~ trt*end3mo, data=dat))

#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)    
#(Intercept)  4.55646    0.09029  50.467  < 2e-16 ***
#trt         -1.46102    0.12768 -11.443  < 2e-16 ***
#end3mo      -2.30747    0.14145 -16.313  < 2e-16 ***
#trt:end3mo   1.40694    0.19875   7.079  1.7e-10 ***
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Results:

Coefficient on trt is -0.19323 (not shown: last observation carried forward (LOCF) for missing endline data gives a trt coefficient of 0.39223)
DiD estimate (trt:end3mo) is 1.40694
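
To see where the interaction term comes from, the four cell means implied by the DiD coefficient table can be recovered by hand. This is only a sketch that reuses the estimates printed above; the variable names (b0, ctrl_change, and so on) are illustrative and not part of the original code.

```r
# Coefficients copied from the lm(score ~ trt*end3mo) output above
b0      <- 4.55646   # control mean at baseline (intercept)
b_trt   <- -1.46102  # treatment minus control at baseline
b_end   <- -2.30747  # change from baseline to endline in the control arm
b_inter <- 1.40694   # interaction term, i.e. the DiD estimate

# Cell means implied by the model
ctrl_base <- b0
ctrl_end  <- b0 + b_end
trt_base  <- b0 + b_trt
trt_end   <- b0 + b_trt + b_end + b_inter

# Within-arm changes and their difference
ctrl_change <- ctrl_end - ctrl_base
trt_change  <- trt_end - trt_base
did         <- trt_change - ctrl_change

round(c(ctrl_change = ctrl_change, trt_change = trt_change, did = did), 3)
```

Both arms decrease, but the control arm drops by about 2.31 points and the treatment arm by only about 0.90, so the difference of the two changes is about +1.41.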

The DiD estimate is large and positive in this example because the control group had a large decrease and the treated did not decrease as much. Whether you go with the DiD model is likely dependent on how much you believe in the parallel slopes assumption of those two trends. — Andy W, Nov 08 '17 at 19:27
If randomized to these arms, parallel slopes would be a reasonable assumption, right? — Eric Green, Nov 08 '17 at 19:33
Yes, in that case it should be reasonable - although if they were randomized they hopefully should not be very imbalanced on the outcome measure. — Andy W, Nov 08 '17 at 20:00
You also have a weak relationship between the two time-points. The stronger this relationship, the more similar these two approaches, as the DiD model assumes a coefficient of 1 for the baseline measure. This coefficient is -0.099 in your example (which is rather strange). — dbwilson, Nov 08 '17 at 21:02
That's a really good point, @dbwilson. I actually just had a conversation with a colleague who pointed out the need to add this to the simulation. I am just taking reported means/sd/n for each round and generating data. — Eric Green, Nov 08 '17 at 21:43
sticking a note here for myself to go back and generate correlated data: https://cran.r-project.org/web/packages/simstudy/vignettes/simstudy.html — Eric Green, Nov 08 '17 at 22:13
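
Following up on the comments about the weak baseline-endline relationship, here is a minimal sketch of what correlated data might show. It uses base R rather than simstudy, and every number in it (the per-arm n of 200, the arm means, the noise SD, the slopes tried) is a made-up assumption loosely based on the toy data, not taken from the real study. The true treatment effect is set to zero. The helper `compare()` is hypothetical.

```r
set.seed(1)
n        <- 200                        # hypothetical per-arm sample size
trt      <- rep(c(0, 1), each = n)     # 0 = control, 1 = treatment
# imbalanced baseline: treatment arm starts lower (assumed means)
baseline <- rnorm(2 * n, mean = ifelse(trt == 1, 3.0, 4.5), sd = 0.5)

compare <- function(slope) {
  # endline depends on baseline through `slope`; no treatment effect is added
  endline <- 2.2 + slope * (baseline - 3.75) + rnorm(2 * n, sd = 0.3)
  c(slope        = slope,
    # approach 1: baseline coefficient is estimated
    ancova       = unname(coef(lm(endline ~ trt + baseline))["trt"]),
    # change-score model (the wide-data analogue of DiD with complete data)
    change_score = unname(coef(lm(I(endline - baseline) ~ trt))["trt"]),
    # same as approach 1 but with the baseline coefficient fixed at 1
    offset_fit   = unname(coef(lm(endline ~ trt + offset(baseline)))["trt"]))
}

res <- t(sapply(c(0, 0.5, 1), compare))
round(res, 3)
```

The change_score and offset_fit columns agree up to floating-point error, which is the sense in which the change-score/DiD approach fixes the baseline coefficient at 1. The ancova column should come close to them only when the simulated slope is near 1; with a weaker slope and imbalanced baselines the two approaches diverge.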
