I have a question about implementing a Cox time-varying regression model, which I need in order to understand the impact of covariates on my survival prediction.

I found an example here (https://lifelines.readthedocs.io/en/latest/Time%20varying%20survival%20regression.html), but I still have a doubt. My dataset contains some patients (rows of the dataframe with the same ID) who were not followed continuously during the observation period, so their records show some "gaps". For example, I have data like this:

  • ID: 1 , start: 0 , stop: 50 , event : 0
  • ID: 1 , start: 60 , stop: 80 , event : 0
  • ID: 1 , start: 80 , stop: 100 , event : 1

As you can see, the first two time intervals are not consecutive: there is a gap from 50 to 60. Can I include these patients in my analysis, or do I have to remove them? I ask because I have never seen this situation in the examples I found online.

Thank you in advance.

1 Answer

If there can be at most one event per individual in a Cox model, then the gaps don't matter. Each time interval is treated as left-truncated at the lower end and, if there is no event, right-censored at the upper end. For such a model, you don't even need to keep track of patient IDs. In that case, as the R vignette on time-dependent survival models explains in Section 2:

this representation is simply a programming trick. The likelihood equations at any time point use only one copy of any subject, the program picks out the correct row of data at each time.

Thus an individual provides information during the time intervals for which you have data, and no information during the gaps.
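To make the "picks out the correct row" idea concrete, here is a minimal sketch in plain Python of how the risk set at each event time could be formed from (start, stop] rows. It is not the lifelines or R implementation, only an illustration of the counting-process convention. The `risk_set` helper is a hypothetical name, and patient 2 with its values is invented purely to give a second event time that falls inside patient 1's gap.

```python
# Illustrative rows in (start, stop] format; patient 1 is unobserved on (50, 60].
# Patient 2 and its values are hypothetical, added only so that an event
# occurs during patient 1's gap.
rows = [
    {"id": 1, "start": 0, "stop": 50, "event": 0},
    {"id": 1, "start": 60, "stop": 80, "event": 0},
    {"id": 1, "start": 80, "stop": 100, "event": 1},
    {"id": 2, "start": 0, "stop": 55, "event": 1},
]


def risk_set(rows, t):
    """Return the rows at risk at time t, i.e. those with start < t <= stop."""
    return [r for r in rows if r["start"] < t <= r["stop"]]


# Distinct times at which an event is recorded.
event_times = sorted({r["stop"] for r in rows if r["event"] == 1})

for t in event_times:
    ids = [r["id"] for r in risk_set(rows, t)]
    print(f"t={t}: ids at risk = {ids}")
```

At t=55 patient 1 is inside the gap and so is absent from the risk set; at t=100 only the third row of patient 1 is used. No row-selection step needs the ID, which is why gaps cause no trouble when each individual can have at most one event.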

The situation is different if there can be more than one event per individual in a Cox model, or in fully parametric models.

Be very careful, however, in building and interpreting models with time-varying covariate values. There is a big risk of survivorship bias; the lifelines package thus won't allow predictions from such a model.

EdM
  • Thank you very much for your reply. I didn't completely follow your last comment about survivorship bias: do you think the presence of gaps biases the model, or was it a generic comment unrelated to my original question? – Valerio Pugliese Apr 03 '23 at 13:47
  • @ValerioPugliese that's a generic comment, made because I've answered several related questions on this site and time-varying covariates are often found to be troublesome. You might look at this answer from the author of lifelines and my generally supportive (if less dogmatic) answer on the same page. The original fitting of the model about which you inquired isn't at issue, even with gaps of time within an individual. It's interpretation and prediction that pose problems, even without gaps. – EdM Apr 03 '23 at 14:30