I am modeling treatment effect in a hypothetical case where only a subset of the sample has disease-related impairment on the outcome of interest. I only expect treatment effects in this subset, but due to variability in the outcome (natural variability + measurement noise), the baseline value of the outcome measure gives me imperfect information about which subjects are impaired (but lower values reflect higher probability of impairment).
My model is val ~ TRT*baseline, and fitting this model gives 1) a positive main effect of treatment, 2) a negative main effect of baseline, and 3) a negative TRT:baseline interaction (larger effect of TRT for those with low baseline scores).
Using this model, I would like to run a post hoc test that gives me a single effect estimate (& hypothesis test) for individuals that are impaired. I think that would be TRT + -1*TRT:baseline, which would be interpreted as the estimated effect size for a subject with a baseline score of -1. Is that right?
I'm open to suggestions for very different ways to model this scenario, but I'm primarily interested to know that I am not misinterpreting this model.
Here is a simulation in R:
library(MASS)
library(multcomp)
library(tidyverse)
Parameters
n = 10000
trt_efct = 0.5
impairment_prob = 0.5
impairment_ef_sz = 1
res_corr = matrix(
c(
1.0, 0.8,
0.8, 1.0
),
2,2
)
Simulate data
df = data.frame(MASS::mvrnorm(
n,
mu=c(0,0),
Sigma=res_corr
)) %>%
dplyr::mutate(
subject = 1:n,
arm = c(rep("TRT", n/2), rep("PBO", n/2)),
im.flag = stats::runif(n, min = 0, max = 1) >= impairment_prob,
baseline = dplyr::if_else(im.flag, X1 - impairment_ef_sz, X1),
val = dplyr::case_when(
im.flag == 1 & arm == "TRT" ~ (X2 - impairment_ef_sz*(1 - trt_efct)),
im.flag == 1 & arm == "PBO" ~ (X2 - impairment_ef_sz),
im.flag == 0 ~ X2
)
)
Fit baseline interaction model: arm * baseline
model_fit = stats::lm(
val ~ arm*baseline,
data = df,
)
summary(model_fit)
summary(multcomp::glht(model_fit, linfct = matrix(c(0,1,0,-1), nrow=1)))