Is it possible to simulate an event time T from a distribution that depends on two variables A and Z, such that the hazard function conditional on A alone follows a proportional hazards model?
Does this answer your question? How to simulate survival times using true base line hazard function. For the effect of A you can just multiply the cumulative hazard function by the HR. For the effect of Z, you just need to estimate separate strata if there is no proportional hazard. – AdamO Feb 26 '21 at 16:47
1 Answer
Depending on what you're trying to simulate, you might want to consider a parametric proportional hazards (PH) model rather than a semi-parametric Cox PH model.
Problems with simulation from a Cox model arise because it provides information only at the event times in the original data set. That leads to time periods with exactly 0 hazard, which can be quite long unless the data set is massive. It also means that you have no direct information about survival after the last event time. A well-designed parametric model can overcome those issues, although you still depend on extrapolation for late event times.
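To make the point about zero-hazard periods concrete, here is a minimal Python sketch of sampling from a step-function survival curve of the kind a Cox or Kaplan-Meier fit produces. The event times and survival values are hypothetical numbers chosen for illustration, and `sample_step` is an illustrative helper, not a function from any package.

```python
import math
import random

# Hypothetical step-function survival curve (illustrative values only).
# surv[i] is S(t) just after event_times[i]; it never drops to 0 here.
event_times = [2.0, 5.0, 9.0]
surv = [0.8, 0.5, 0.3]

def sample_step(event_times, surv, u=None):
    """Draw one event time by inverting a step-function survival curve."""
    if u is None:
        u = random.random()  # uniform on [0, 1)
    # Return the first observed event time at which S(t) has fallen to u or below.
    for t, s in zip(event_times, surv):
        if s <= u:
            return t
    # S(t) never falls to u: the event lies beyond the last observed event time.
    return math.inf
```

Every finite draw is one of the observed event times, since the curve is flat in between, and a fraction of draws equal to the final survival value (0.3 here) cannot be assigned a time at all.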
Alternatively, depending on the purpose of your simulations, you could just decide to simulate from a Cox model up to some latest time of interest. That would be analogous to the restricted mean survival values that can be calculated from Kaplan-Meier and Cox models.
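One way to read "up to some latest time of interest" is to treat every simulated time beyond a horizon as censored at that horizon. The sketch below assumes that reading; `restrict_to_horizon` and `tau` are hypothetical names introduced for illustration.

```python
def restrict_to_horizon(times, tau):
    """Censor simulated event times at a horizon tau.

    Returns (observed_time, event_indicator) pairs; the indicator is
    True when the event occurred at or before tau.
    """
    return [(min(t, tau), t <= tau) for t in times]
```

For example, `restrict_to_horizon([2.0, 12.0, float("inf")], 10.0)` keeps the first time as an event and records the other two as censored at 10.0, so draws that fall beyond the last event time need no special handling.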
It's not completely clear how your variable Z comes into the event-time model, but simulation should be straightforward. Parametric PH models would probably be simplest, as they are based on well-known probability distributions. If you insist on a Cox model, however, functions like survfit() in R can also generate survival curves as a function of time and covariate values. With specified values of $A$ and $Z$ and a survival function $S(t|A,Z)$, you can draw from a uniform distribution to sample from the cumulative distribution of event times $F(t|A,Z)=1-S(t|A,Z)$, as for any one-dimensional probability distribution. You will, however, need to decide how to handle late event times if you use a Cox model for which the last time point isn't an event.
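As a rough illustration of the uniform-sampling step with a parametric PH model, here is a Python sketch that assumes a Weibull baseline, so that $S(t|A,Z)=\exp\{-\lambda t^{k}\exp(\beta_A A+\beta_Z Z)\}$. The Weibull form, the way Z enters, and all parameter values are assumptions made for the example, not something taken from the question. Note that this model is PH conditional on both A and Z; it does not by itself guarantee PH in A after averaging over Z.

```python
import math
import random

def simulate_weibull_ph(a, z, beta_a=0.5, beta_z=-0.3, scale=0.1, shape=1.5, u=None):
    """Draw one event time from an assumed Weibull PH model.

    S(t | a, z) = exp(-scale * t**shape * exp(beta_a * a + beta_z * z))
    All parameter defaults are illustrative, not estimates.
    """
    if u is None:
        u = 1.0 - random.random()  # uniform on (0, 1], avoids log(0)
    hazard_ratio = math.exp(beta_a * a + beta_z * z)
    # Solve S(t | a, z) = u for t (inverse-transform sampling).
    return (-math.log(u) / (scale * hazard_ratio)) ** (1.0 / shape)

# Usage: simulate 5 event times for a subject with A = 1, Z = 2.
random.seed(1)
draws = [simulate_weibull_ph(1, 2) for _ in range(5)]
```

Because the survival function is continuous and decreases to 0, every uniform draw maps to a finite positive time, which avoids the late-event problem of the Cox step function.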