1

Suppose I have two variables, x (marker indicator = 0, 1) and trt (treatment indicator = 0, 1). I fit an interaction model using Cox PH regression with the linear part: $\beta_1$*$x$ + $\beta_2$*$trt$ + $\beta_3$*$x*trt$. The test p-value $p_1$ for the treatment effect (trt), $\beta_2$, is obtained regardless of the marker value (i.e., all subjects). Another test p-value $p_2$ is for the treatment effect (trt) when the marker value is restricted to the lower group subjects ($x=0$). It turns out that the two p-values, $p_1$ and $p_2$, are identical. Is there a way to prove it theoretically? Thank you!

alittleboy
  • 1,013

1 Answers1

2

I don't think that this generally holds true. In this situation, the standard errors, and thus the p-value, will not always be the same. If you have few decimals (e.g. the output in your statistics software displays < 0.001) it may be the case that $p_1$ and $p_2$ aren't really identical, but both are low. Take a look at the standard errors of the estimates in the full model and the sub group analysis of those subjects in which x = 0, and you will probably see that the standard errors differ slightly.

EDIT:

I think that with large sample sizes, the estimate and standard errors in such a scenario might converge to be the same. Some simulation experiments I just did indicate that they do, but someone that is mathematically skilled might prove it theoretically.

EDIT2:

Just a quick demonstration of the simulation code that I ran. It's not very formal, but if you try changing the value of n to different values you will see that the lower the n value, the more different the coefficient and standard errors are in models m1 and m2, and when n is high, they're virtually identical:

n <- 30000
x1 <- rbinom(n,1,0.5)
x2 <- rbinom(n,1,0.5)

base.event.prob <- 0.001
x1.prob <- 0.007
x2.prob <- 0.007
interaction.prob <- 0.05
cens.prob <- 0.001

time <- NULL
event <- NULL
remain <- rep(1, n)
for (i in 1:n) {
  for (j in 1:365) {
    if (remain[i]==1) {
      death <- rbinom(1,1,base.event.prob+x1.prob*x1[i]+x2.prob*x2[i]+interaction.prob*x1[i]*x2[i])
      cens <- rbinom(1,1,cens.prob)
      time[i] <- j
      event[i] <- 0
      if (cens == 1) {
        remain[i] <- 0
      }
      if (death == 1) {
        event[i] <- 1
        remain[i] <- 0
      }
    }
  }
}
df <- data.frame(event, time, x1, x2)
m1 <- coxph(Surv(time, event) ~ x1 * x2, data=df)
m2 <- coxph(Surv(time, event) ~ x2, data=df[which(df$x1==0),])
JonB
  • 2,870