11

I'm modeling the effect of pregnancy on the outcome of a disease (dead/alive). Approximately 40% of the patients became pregnant after diagnosis, but at different points in time. So far I've made KM plots showing a clear protective effect of pregnancy on survival, and fitted a regular Cox model. However, both used only a dichotomised pregnancy variable and assumed the effect is present from the time of diagnosis, which is clearly unrealistic since the median time to pregnancy is 4 years from diagnosis.

What kind of model would capture the effect of multiple pregnancies at different time points after diagnosis? Would it be correct to model pregnancy as interacting with time (which would require some serious data reconstruction; is there any software that could automate this?), or is there another preferred modeling strategy for these problems? Also, what is the preferred plotting strategy for these problems?

Peter Flom
Misha
  • interesting question (+1)... this recent paper might be of help: http://www.ncbi.nlm.nih.gov/pubmed/21328605 – ocram Nov 14 '12 at 07:25
  • Interesting, but I believe the main topic there is time-varying effects. /M – Misha Nov 14 '12 at 07:55
  • time-varying effects is the topic of the paper... – ocram Nov 14 '12 at 08:00
  • This reminds me of the "classical" survival analysis example of the heart transplant data: http://bit.ly/UFX71v - what you need is a time-varying covariate, not necessarily a time-varying coefficient. You can plot your data using KM curves. – boscovich Nov 14 '12 at 08:25
  • With this method you would also be able to handle the fact that some women may have had more than 1 pregnancy during the follow-up. – boscovich Nov 14 '12 at 08:31
  • @andrea: you are right: that's the covariate that varies, not (necessarily) the associated parameter. – ocram Nov 14 '12 at 08:32
  • Take a look at Therneau and Grambsch, which covers many variations on survival analysis, including multiple events. – Peter Flom Nov 14 '12 at 11:15
  • @andrea... Thanks for the input. It is indeed a time-varying internal variable as defined in Kleinbaum's book on survival analysis. However, I don't think KM with a dichotomised pregnancy variable would be appropriate for visualisation. The time outside of pregnancy, or before the first pregnancy, should not be attributed to the effect of pregnancy; hence also the need for the extended Cox model. – Misha Nov 14 '12 at 20:51
  • That's exactly the point of using time-varying covariates! A given woman, at a certain point, can cross over from the "pregnancy-free" group to the "pregnancy" group (and vice-versa!) and the KM will take care of these (potential) changes in the exposure group over time @Misha – boscovich Nov 15 '12 at 11:45
  • Just to be clearer: you don't have to use the time-varying covariate just for the Cox model. You can use it for the calculation of the Kaplan-Meier curve, too. The KM method is perfectly "able" to handle time-varying covariates (like pregnancy in your example or heart transplant in the book's example). @Misha – boscovich Nov 15 '12 at 11:57
  • As an example, look at these 2 graphs based on the heart transplant data: http://i.imgur.com/NPZPa.png Top graph: KM curves calculated using time-varying covariates. Bottom graph: KM curves using a time-constant variable that was 0 if the patient never received a transplant and 1 if they received one at any point (basically, the "wrong approach" to analyzing these data). These 2 graphs don't prove anything. They are just meant to show that KM curves can actually handle time-varying covariates. – boscovich Nov 15 '12 at 12:32
  • @andrea... thanks a million. I was not aware you could do the same with KM. If you write it up I'll accept it. I also just found the unfold command from John Fox in R, which will help with the data reconstruction. – Misha Nov 15 '12 at 16:25

3 Answers

5

What you need here is a time-varying covariate, not necessarily a time-varying coefficient. A well-known example that could help with your analysis is the Stanford heart transplant data.

To present your results you can use the classic Kaplan-Meier estimator, which handles time-varying covariates without problems (remember, though, that this is a crude, or unadjusted, analysis with all its well-known limitations).

As an example, the following graph shows the analysis of the Stanford HT data when correctly accounting for the time-varying transplant status (top panel) and without accounting for it (bottom panel).

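[Figure: Kaplan-Meier curves for the Stanford heart transplant data. Top panel: time-varying transplant status. Bottom panel: time-fixed transplant status.]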
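A minimal sketch of how such a two-panel comparison could be produced in R, assuming the `heart` (start/stop format) and `jasa` (one row per patient) versions of the Stanford data that ship with the survival package. This is not necessarily the code behind the figure above, and the plot labels are my own. Note that grouped curves with a time-varying group are sometimes described as an extended Kaplan-Meier estimate, and their interpretation is disputed in the comments on this answer.

```r
library(survival)

# Time-varying grouping: `heart` stores each patient in (start, stop] intervals,
# so a patient counts as transplant = 0 until surgery and transplant = 1 afterwards.
km_tv <- survfit(Surv(start, stop, event) ~ transplant, data = heart)

# Time-fixed grouping: `jasa` has one row per patient and flags anyone who was
# ever transplanted from day 0, crediting the waiting time to the transplant group.
km_fixed <- survfit(Surv(futime, fustat) ~ transplant, data = jasa)

# Draw both versions, one above the other
par(mfrow = c(2, 1))
plot(km_tv, lty = 1:2, xlab = "Days", ylab = "Survival",
     main = "Time-varying transplant status")
legend("topright", legend = c("no transplant", "transplant"), lty = 1:2, bty = "n")
plot(km_fixed, lty = 1:2, xlab = "Days", ylab = "Survival",
     main = "Time-fixed transplant status")
legend("topright", legend = c("never transplanted", "ever transplanted"),
       lty = 1:2, bty = "n")
```

For the pregnancy data, the same `survfit` call would apply once the data are in start/stop form, with the time-varying pregnancy indicator in place of `transplant`.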

boscovich
  • I finally managed to do this and I get the following plot – Misha Dec 31 '12 at 09:24
  • Regular KM is NOT the proper way of graphing these models. Rather it is an extension to KM by Simon and Makuch that is implemented in Stata. http://stats.stackexchange.com/posts/46754 – Misha Jan 01 '13 at 00:19
  • You cannot use the KM like this. Consider the pregnancies with, e.g., age as the underlying time scale: let's say that women are at least 20 when they have their second child and at least 22 when they have their third. Assume a constant hazard for all ages and all groups (number of children born). Then the 2- and 3-groups will die at the same rate, but the 3-group survival estimate will (most likely) be larger at any time t, simply because the 3-group starts dying at a later age. This is a misrepresentation of the data. – swmo Feb 19 '15 at 18:24
5

In R, this can be addressed with the start/stop (counting-process) form of a Surv object from the survival package, e.g.

library(survival)
fit <- coxph(Surv(time1, time2, status) ~ is.pregnant + other.covariates, data=mydata)

The survival package's vignette on time-dependent covariates discusses this in more detail: http://cran.r-project.org/web/packages/survival/vignettes/timedep.pdf
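The start/stop data set itself has to be built first. Below is a minimal sketch using `tmerge` from the survival package on made-up data. All data values and the column names (`futime`, `died`, `preg_time`, `parity`, `ever_preg`, `n_preg`) are hypothetical, chosen only to illustrate how several pregnancies per woman split her follow-up into intervals.

```r
library(survival)

# Hypothetical toy data: one row per patient, times in years since diagnosis
patients <- data.frame(id = 1:3, age = c(31, 27, 35),
                       futime = c(10, 12, 6), died = c(1, 0, 1))

# One row per pregnancy; `parity` is the running pregnancy count for that patient
pregnancies <- data.frame(id = c(2, 2, 3), preg_time = c(4, 8, 3),
                          parity = c(1, 2, 1))

# First call: set each patient's follow-up window and attach the death event
td <- tmerge(patients[, c("id", "age")], patients, id = id,
             death = event(futime, died))

# Second call: cut follow-up at each pregnancy.
# ever_preg is a 0/1 indicator that switches on at the first pregnancy;
# n_preg carries the parity value forward, starting from an initial value of 0.
td <- tmerge(td, pregnancies, id = id,
             ever_preg = tdc(preg_time),
             n_preg = tdc(preg_time, parity, 0))

print(td)
```

On real data, the resulting `tstart`, `tstop` and `death` columns would go into something like `coxph(Surv(tstart, tstop, death) ~ n_preg + age, data = td)`. Whether to model the count or the 0/1 indicator is a substantive choice that this sketch does not settle.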

1

Beware of immortal time bias in this situation. If pregnancy is coded as a fixed characteristic from diagnosis, your pregnant group will inevitably appear to have better survival than the non-pregnant group, since you can't become pregnant after you die (to the best of my knowledge!).
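A small simulation can make the bias concrete. This is a sketch under assumed conditions, not the asker's data: death and first-pregnancy times are drawn independently from exponential distributions with an arbitrary rate, so pregnancy has no true effect on survival. A time-fixed "ever pregnant" covariate still looks strongly protective, while the time-varying version does not.

```r
library(survival)
set.seed(1)

n <- 2000
# Assumed toy setup: independent death and pregnancy times, no censoring,
# so the true effect of pregnancy on the hazard of death is zero.
d <- data.frame(id = seq_len(n),
                death_time = rexp(n, rate = 0.2),
                preg_time  = rexp(n, rate = 0.2),
                status = 1)
d$ever_preg <- as.integer(d$preg_time < d$death_time)
d$preg_time[d$ever_preg == 0] <- NA  # pregnancy never observed before death

# Naive analysis: "ever pregnant" treated as known from time 0
naive <- coxph(Surv(death_time, status) ~ ever_preg, data = d)

# Time-varying analysis: pregnancy switches on only when it occurs
td <- tmerge(d[, c("id", "ever_preg")], d, id = id,
             death = event(death_time, status))
td <- tmerge(td, d, id = id, pregnant = tdc(preg_time))
tv <- coxph(Surv(tstart, tstop, death) ~ pregnant, data = td)

# Log hazard ratios: the naive one is strongly negative, the other close to 0
print(c(naive = unname(coef(naive)), time_varying = unname(coef(tv))))
```

The naive coefficient is negative only because women had to survive long enough to become pregnant; the start/stop coding removes that guaranteed survival time from the pregnant group.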

drstevok