
I want to fit a Cox regression with a time-dependent covariate plus other control variables, and to produce a K-M plot with a log-rank test result. Take heart transplants as an example: some patients died without a heart transplant, and some died after one. I understand that in R the first step is to put the data in counting-process format, such as:

Id start stop event transplant
1   0     50   1       0
2   0      6   1       0
3   0      1   0       0
3   1     16   1       1
4   0     36   0       0
4  36     39   1       1
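
A layout like the one above can be built from one-row-per-patient data with `tmerge()` from the survival package. The sketch below is only an illustration: the data frame `base` and its columns (`futime`, `died`, `txtime`, `age`) are made-up names and values chosen to mirror the table, and `tmerge()` calls the interval columns `tstart` and `tstop` rather than `start` and `stop`.

```r
library(survival)

# Hypothetical one-row-per-patient data (values chosen to mirror the table above):
# futime = end of follow-up, died = 1 if the patient died,
# txtime = day of transplant (NA if never transplanted), age = a fixed covariate
base <- data.frame(
  id     = 1:4,
  futime = c(50, 6, 16, 39),
  died   = c(1, 1, 1, 1),
  txtime = c(NA, NA, 1, 36),
  age    = c(54, 61, 40, 47)
)

# First call: set each patient's follow-up range and mark the death event.
# Only the time-fixed columns are carried over from `base`.
cp <- tmerge(base[, c("id", "age")], base, id = id,
             death = event(futime, died))

# Second call: add the time-dependent covariate, 0 before transplant and 1 after
cp <- tmerge(cp, base, id = id, transplant = tdc(txtime))

cp[, c("id", "tstart", "tstop", "death", "transplant", "age")]
```

Time-fixed covariates such as age are simply repeated on every row of a patient, so they need no special handling in this format.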

Besides those variables, I have other demographic variables that I need to control for, such as age, race, and salary, but they are not time-dependent. I have seen code that fits the Cox regression and then uses newdata, for example:

model.2 <- coxph(Surv(start, stop, event)~ transplant + age, data= one)
covs <-data.frame(age=21, transplant=0)
summary(survfit(model.2,newdata=covs,type="aalen"))

With newdata, it seems I have to define every variable included in the model, but in my model only transplant is a time-dependent covariate; the other covariates are there only as controls. If I put only transplant in newdata, it does not seem to work; if I don't use newdata, it seems the model is no longer time-dependent. So, in this situation, is there a way to account for transplant alone while controlling for the other covariates (age, race, salary, etc.), and also to create a K-M curve with a log-rank test result? Or do I have to put the other variables in newdata too, and if so, how should I code them? Thank you so much!

1 Answer


If you build a Cox model with multiple covariates, then predictions from the Cox model need to be based on values of all those covariates. "Controlling for" those covariates means that the hazard estimates included those covariates. You need to specify their values in some way.

If the covariate values that you "control for" are constant in time and your main interest is in the associations of the time-varying covariate with outcome, your choice of the time-constant covariate values will only change the baseline hazard and survival curves, not the hazard ratios between values of the covariate of interest. So for illustration of your results, any reasonable choice will do.
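
A minimal sketch of that point, assuming the `heart` data set shipped with the survival package as a stand-in for the asker's `one` (in `heart`, age is stored as age minus 48 and transplant is converted to 0/1 here). The reference ages in `covs` are arbitrary choices for illustration.

```r
library(survival)

# Assumption: survival::heart stands in for the asker's data frame `one`
one <- transform(heart, transplant = as.numeric(as.character(transplant)))
model.2 <- coxph(Surv(start, stop, event) ~ transplant + age, data = one)

# The hazard ratio for transplant comes from the fit alone; newdata plays no part
hr_transplant <- exp(coef(model.2)["transplant"])
hr_transplant

# Two arbitrary reference ages, with transplant held at 0 in both rows
covs <- data.frame(age = c(-10, 10), transplant = 0)
sf <- survfit(model.2, newdata = covs)

# The two predicted curves differ only through a constant multiplier on the
# cumulative hazard, exp(20 * coefficient of age)
keep  <- sf$cumhaz[, 1] > 0
ratio <- sf$cumhaz[keep, 2] / sf$cumhaz[keep, 1]
range(ratio)
exp(20 * coef(model.2)["age"])
```

Changing the ages in `covs` moves the curves up or down but leaves `hr_transplant` untouched, which is why any reasonable reference values serve for illustration.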

I don't think that it will make much sense to construct Kaplan-Meier curves from your model predictions and perform log-rank tests on them, however. Those are for use with original data, not model predictions. You can perform statistical tests comparing model predictions under different scenarios instead, and illustrate those predictions with modeled survival curves.
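
One simple way to test the association within the model, rather than with a log-rank test on curves, is to test the transplant coefficient itself. The sketch below again assumes survival's `heart` data as a stand-in for `one`; the object names `fit0` and `fit1` are illustrative.

```r
library(survival)

# Assumption: survival::heart stands in for the asker's data frame `one`
one <- transform(heart, transplant = as.numeric(as.character(transplant)))

# Models without and with the time-dependent covariate, both adjusted for age
fit0 <- coxph(Surv(start, stop, event) ~ age, data = one)
fit1 <- coxph(Surv(start, stop, event) ~ transplant + age, data = one)

# Wald test for transplant, adjusted for age
summary(fit1)$coefficients["transplant", ]

# Likelihood-ratio test comparing the nested models
lrt <- anova(fit0, fit1)
lrt
```

These tests use the counting-process rows directly, so transplant status is handled as time-dependent without any newdata.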

Finally, be very wary of trying to make survival predictions by providing time-varying covariate values. There's a big risk of circular reasoning there, as the presence of a new time-varying covariate value at a point in time implies that there has already been survival up to that point in time. See the discussion on this page and its links before you proceed.

EdM
  • Thank you for the reply. It seems multiple covariates make the model complicated. Can I just use 'ggsurvplot(fit=survfit(Surv(start, stop, event)~ transplant + age, data=one))' to create an adjusted K-M plot, and use 'coxph(formula = Surv(start, stop, event) ~ transplant + age, data=one)' to do the Cox regression? Just like this link? – NewRUser Jul 02 '22 at 02:38
  • @NewRUser you would have to specify groups of age for the Kaplan-Meier plots, otherwise you get one curve for each unique value of age. The page you linked in your comment only had 4 values of the predictor. – EdM Jul 02 '22 at 02:52
  • I see what you mean: if I add more covariates to the model, the K-M plot will show one curve per combination of covariate values rather than curves for the transplant groups only. So I might put only transplant in ggsurvplot(fit=survfit(Surv(start, stop, event)~ transplant, data=one)), because I only want the K-M plot to display the adjusted curve for the time-dependent variable, and add the covariates when I perform the Cox regression. But the overall coding is correct, right? – NewRUser Jul 02 '22 at 03:10
  • @NewRUser with the survfit model you propose in the most recent comment there is no adjustment except for showing separate survival curves for transplant or not. If that's what you want, OK. – EdM Jul 02 '22 at 12:15
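
As a rough illustration of the grouping mentioned in the comments, a continuous covariate such as age can be cut into a few bands before it goes into `survfit()`, so the plot shows one curve per band instead of one per unique age. This sketch assumes survival's `heart` data as a stand-in for `one` (age there is stored as age minus 48); the cut points are arbitrary.

```r
library(survival)

# Assumption: survival::heart stands in for the asker's data frame `one`
one <- heart

# Recover age in years and cut it into three arbitrary bands
one$age_group <- cut(one$age + 48, breaks = c(0, 40, 50, Inf))

# One unadjusted Kaplan-Meier curve per age band
km <- survfit(Surv(start, stop, event) ~ age_group, data = one)
km
```

The resulting `km` object can be passed to `plot()` to draw the curves; these are descriptive curves for the age bands and involve no adjustment for transplant.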