Subgroup analysis: post hoc test interpretation

Question

I am modeling treatment effect in a hypothetical case where only a subset of the sample has disease-related impairment on the outcome of interest. I only expect treatment effects in this subset, but due to variability in the outcome (natural variability + measurement noise), the baseline value of the outcome measure gives me imperfect information about which subjects are impaired (but lower values reflect higher probability of impairment).

My model is val ~ TRT*baseline, and fitting this model gives 1) a positive main effect of treatment, 2) a negative main effect of baseline, and 3) a negative TRT:baseline interaction (larger effect of TRT for those with low baseline scores).

Using this model, I would like to run a post hoc test that gives me a single effect estimate (& hypothesis test) for individuals that are impaired. I think that would be TRT + -1*TRT:baseline, which would be interpreted as the estimated effect size for a subject with a baseline score of -1. Is that right?

I'm open to suggestions for very different ways to model this scenario, but I'm primarily interested to know that I am not misinterpreting this model.

Here is a simulation in R:

library(MASS)
library(multcomp)
library(tidyverse)
Parameters
n = 10000
trt_efct = 0.5
impairment_prob = 0.5
impairment_ef_sz = 1
res_corr = matrix(
  c(
    1.0, 0.8,
    0.8, 1.0
  ),
  2,2
)
Simulate data
df = data.frame(MASS::mvrnorm(
  n, 
  mu=c(0,0), 
  Sigma=res_corr
)) %>%
  dplyr::mutate(
    subject = 1:n,
    arm = c(rep("TRT", n/2), rep("PBO", n/2)),
    im.flag = stats::runif(n, min = 0, max = 1) >= impairment_prob,
    baseline = dplyr::if_else(im.flag, X1 - impairment_ef_sz, X1),
    val = dplyr::case_when(
      im.flag == 1 & arm == "TRT" ~ (X2 - impairment_ef_sz*(1 - trt_efct)),
      im.flag == 1 & arm == "PBO" ~ (X2 - impairment_ef_sz),
      im.flag == 0 ~ X2
    )
  )
Fit baseline interaction model: arm * baseline
model_fit = stats::lm(
    val ~ arm*baseline,
    data = df,
  )
summary(model_fit)
summary(multcomp::glht(model_fit, linfct = matrix(c(0,1,0,-1), nrow=1)))

Using a baseline value as a predictor of a change score is not a good idea. See this page and this page. — EdM, Apr 17 '23 at 20:14

score 1 · Answer 1 · answered Apr 23 '23 at 17:34

If your understanding of the subject matter indicates that a baseline score of -1 represents "impaired" individuals, then what you propose is correct (given the ordering of the coefficient estimates in your model_fit).

A display of the continuous association of outcome with baseline score as a function of treatment might be more useful to your readers. The emmeans package, for example, simplifies this type of post-modeling calculation without your having to keep track of coefficient numbering.

Although outside my expertise, this situation seems like it might benefit from some joint modeling of baseline scores among "impaired" and "non-impaired" individuals along with your hypothesis that the treatment effect is restricted to the "impaired."

Subgroup analysis: post hoc test interpretation

Parameters

Simulate data

Fit baseline interaction model: arm * baseline

1 Answers1