I am using a retrospective data set that looks at cancer patients' progression-free survival, and trying to determine if there is a difference between the treatment group and the control group. The grade of cancer though is a confounding variable, for which patients with high grade disease received treatment 80% of the time, and patients with low grade disease received treatment 20% of the time.
I am trying to compare progression-free survival rates using a Cox proportional hazards model between the treatment group and the control group and control for confounding from disease grade (low vs. high).
I am currently using inverse probability weighting to create ATE weights, and using these weights in the cox proportional hazards model (using R). I was wondering if it would be just as effective to create a CPH model with two covariates- both treatment group and disease grade, which I believe should also help control for the confounding for disease grade. Will both of these methods accomplish the same thing, and is one method more accepted than the other?
I have posted some example code below.
library('survival')
Set the seed to ensure reproducibility
set.seed(1234)
Create a vector of time ranging from 1 month to 5 years
time <- sort(c(runif(100, 1, 12), runif(100, 12, 60)))
Create a vector of treatment group
tx <- rbinom(length(time), 1, 0.5)
Create a vector of grade group
grade <- factor(ifelse(tx == 1, sample(c(rep("high", 80), rep("low", 20)), length(time), replace = TRUE),
sample(c("high", "low"), length(time), replace = TRUE)))
Create a vector of binary event representing disease progression
event <- rep(0, length(time))
event[grade == "high"] <- rbinom(sum(grade == "high"), 1, 0.8)
event[grade == "low"] <- rbinom(sum(grade == "low"), 1, 0.2)
Create the dataset by combining the vectors
dat <- data.frame(time, event, tx, grade)
#Generate propensity score weights
mod.glm <- glm(tx ~ grade, data = dat, family = binomial);
dat$ps <- predict(mod.glm, type = "response");
dat$weight <- ifelse(dat$tx ==1, 1/dat$ps, 1/(1-dat$ps) );
#Unadjusted CPH model
mod.cph <- coxph(Surv(time,event)~ tx, data = dat);
#CPH model using IPW
mod.cph.ipw <- coxph(Surv(time,event)~ tx, data = dat, weights = weight);
#Multivariate CPH model
mod.cph.multi <- coxph(Surv(time,event)~ tx + grade, data=dat);
```