
I have read in many places that power in survival analysis depends on the number of events rather than the total number of observations (events + non-events).

Suppose I have 10 events and 90 non-events and I want to test for survival differences between two groups: High Risk and Low Risk as classified by a diagnostic test.

Then suppose I add in 100 non-events, which, with my reasonably good diagnostic test, are more likely to be classified as low risk. Wouldn’t this increase the effect size (hazard ratio for High Risk v Low Risk), and therefore the power, without changing the number of events?

Scarper
  • What do you mean by "non-event"? Survival analyses are for data collected as the elapsed time until a one-time event. For some people, animals, or objects, that event may not have occurred by the time of data collection. Survival analysis accounts for these censored observations. "Censored" means that the event had not happened by the time of data collection, that you don't know whether it happened past a certain time point, or that it happened but the experimental protocol wasn't followed, so the observation must be censored. A "non-event" as such doesn't really enter into survival analysis.

    Details on experimental design???

    – Harvey Motulsky Feb 26 '17 at 17:25
  • I'm analysing a cohort study where people were followed for 10 years. By "non-event" I mean that they were not dead at ten years – Scarper Feb 26 '17 at 17:43

2 Answers


Adding censored ("non-event") cases does not improve power, in the sense of the precision with which Cox regression estimates regression coefficients/hazard ratios. Hsieh and Lavori, Control Clin Trials 2000;21:552–560, and the papers it cites demonstrate that fact.
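As a rough illustration of why the event count is what matters, here is a minimal sketch of the widely used Schoenfeld approximation for a two-group comparison. The function names and default values are hypothetical, chosen for illustration, and the sketch is not necessarily the exact formulation in the paper cited above. Note that the total sample size never appears as an argument; only the number of events, the hazard ratio, and the group allocation do.

```python
from math import log, sqrt
from statistics import NormalDist


def schoenfeld_power(n_events, hazard_ratio, prop_group1=0.5, alpha=0.05):
    """Approximate power of a two-sided log-rank/Cox test for two groups.

    Uses the Schoenfeld approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # Noncentrality grows with the square root of the number of EVENTS.
    noncentrality = sqrt(n_events * prop_group1 * (1 - prop_group1)) * abs(log(hazard_ratio))
    return z.cdf(noncentrality - z_alpha)


def schoenfeld_events(hazard_ratio, power=0.8, prop_group1=0.5, alpha=0.05):
    """Approximate number of events needed to reach the requested power."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return (z_alpha + z_beta) ** 2 / (
        prop_group1 * (1 - prop_group1) * log(hazard_ratio) ** 2
    )


if __name__ == "__main__":
    # Power for a hazard ratio of 2 at several event counts (illustrative values).
    for d in (10, 50, 100):
        print(d, round(schoenfeld_power(d, hazard_ratio=2.0), 3))
    print("events needed for 80% power:", round(schoenfeld_events(2.0), 1))
```

Under this approximation, extra censored cases can change the power only indirectly, by shifting the allocation proportion or the hazard ratio that is actually estimated.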

The issue is that the calculations relating predictor variables to outcome are only performed at the times of events. Censored cases thus don't help relate survival to predictors in Cox regression. Censored cases might help refine the shape of the baseline hazard, and thus do add useful information, but they don't add to the precision of the hazard ratios that act on that baseline hazard.
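To make this concrete, here is the Cox partial likelihood in its standard form without tied event times. The notation is introduced here for illustration and is not from the cited paper: $\delta_i$ is 1 for an observed event and 0 for a censored case, $x_i$ is the covariate vector, and $R(t_i)$ is the set of subjects still at risk just before time $t_i$.

$$
L(\beta) = \prod_{i:\,\delta_i = 1} \frac{\exp(x_i^\top \beta)}{\sum_{j \in R(t_i)} \exp(x_j^\top \beta)}
$$

The product runs over observed events only. A censored subject never contributes a factor of its own; it appears only in the denominators, through the risk sets of events that occur before its censoring time.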

With a fully parametric survival model, censored cases do make contributions to the likelihood and thus can increase power in terms of more precise coefficient estimates.
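For reference, the usual likelihood for a parametric model under independent, non-informative right censoring can be written as follows. The symbols are assumptions made for illustration: $f$ is the event-time density, $S$ the survival function, $\theta$ the parameters, and $\delta_i$ is 1 for an observed event and 0 for a censored case.

$$
L(\theta) = \prod_{i=1}^{n} f(t_i \mid x_i, \theta)^{\delta_i} \, S(t_i \mid x_i, \theta)^{1 - \delta_i}
$$

Here every subject contributes its own factor: events contribute the density at the event time, and censored cases contribute the probability of surviving past the censoring time.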

EdM
  • You said the censored cases don't improve the precision of the regression coefficients; can they change the magnitude of the coefficients/hazard ratios? Would a general conclusion be that censored cases can change the effect size between groups you want to compare, but won't make it more likely that the effect is significant? – Scarper Mar 01 '17 at 15:07
  • @Scarper that is how I understand it. The censored cases do contribute to the hazard and to the relation of hazard to the predictors at each event time, but the censored cases don't increase the effective N for the regression, which is the number of events. This is one of the more depressing aspects of survival analysis; it's the number of people who don't survive that matters. – EdM Mar 01 '17 at 16:06
  • This claim bothers me. In current status data, all events are right or left censored, and we can still fit a semi-parametric proportional hazards regression model to them. I'm trying to determine how both could be true. One explanation would be that we fit the current status PH model using the full likelihood rather than the partial likelihood. This also suggests we may get significantly more power from using the full likelihood rather than the partial likelihood if we have a heavily censored dataset? – Cliff AB May 15 '22 at 15:48
  • @CliffAB It can be frightening to come back to an answer 5 years later, as I did today to update a broken link. I think that the full versus partial likelihood distinction is key here; my original answer was focused on Cox regressions, while the question was perhaps more generic. Even if censored cases don't increase power for distinguishing differences in regression coefficients, they increase the precision of baseline-survival estimates from a Cox model. They clearly contribute to likelihoods in fully parametric models. I've expanded a bit on that distinction. Thanks. – EdM May 15 '22 at 16:06

What you are calling a "non-event" is a person/animal/object with a very long survival time, so long that the event hasn't occurred at the time of data collection. These individuals provide lots of information, so of course they affect the power of the study. They are not missing values. Rather, they have a very long survival time.

  • That is not necessarily true. If a subject is right censored at $t = 1$ and the typical event time is $t = 100$, we have very little information about whether they had a long survival time. – Cliff AB May 15 '22 at 15:03
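
A small sketch of the point in the comment above, under the added assumption (not made in the thread) that event times are exponential with mean 100. The function name is hypothetical. It computes the probability that a subject's event is observed before a given follow-up time; under the exponential assumption this is also the expected Fisher information about the log rate from one subject followed for that long.

```python
from math import exp


def prob_event_before(follow_up, mean_event_time):
    """P(T <= follow_up) for an exponential event time with the given mean."""
    return 1.0 - exp(-follow_up / mean_event_time)


if __name__ == "__main__":
    # Compare very short, typical, and very long follow-up (illustrative values).
    for c in (1, 100, 1000):
        print(c, round(prob_event_before(c, mean_event_time=100), 4))
```

With follow-up of 1 against a mean of 100, about 99% of subjects would be censored regardless of their eventual survival time, so being censored at that point says very little about whether survival was long.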