
I would like to know how to calculate the Cox-Snell residuals in Cox regression when there are multiple deaths at the same time (tied event times) and Efron's approximation is used. In particular, how does the calculation of the "estimated cumulative hazard and survivor functions" change under the Efron approximation?

User1865345

1 Answer


Ties complicate the original estimation of regression coefficients in a Cox model. The Efron approximation is one way to simplify the partial likelihood calculations when there are ties in event times.

Once the regression coefficients $\beta$ have been estimated in a Cox model, ties do not pose a further problem for the estimate of the cumulative hazard, $\hat{H}(t)$, that forms the basis of Cox-Snell residuals or survival functions. Estimation of cumulative hazard is separate from the estimation of regression coefficients.

The survival function in a continuous-time model for individual $j$ having a set of covariate values $x_j$ is estimated by:

$$\hat S (t|x_j) = \exp[-\hat H(t|x_j)]. $$

If that individual had an event at time $T_j$, the Cox-Snell residual is $r_j= \hat H(T_j|x_j)$.

For a proportional hazards model, the cumulative hazard function for that individual simplifies to $\hat H(t|x_j)=\hat H_0(t) \exp(\beta' x_j)$, the product of an estimated baseline cumulative hazard shared by all individuals and the hazard ratio associated with the individual's covariate values. Thus in a Cox proportional hazards model, the key is the estimation of the baseline cumulative hazard.
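As a rough illustration of that product form, here is a minimal Python sketch that turns an already-estimated baseline cumulative hazard into Cox-Snell residuals. The function name `cox_snell_residuals` and its arguments are hypothetical, not taken from any library; it assumes the baseline cumulative hazard is supplied as a right-continuous step function (jump times plus the value reached at each jump) and that the linear predictors $\beta' x_j$ have been computed beforehand.

```python
import numpy as np

def cox_snell_residuals(obs_time, lin_pred, h0_times, h0_values):
    """Cox-Snell residuals r_j = H0(T_j) * exp(beta' x_j).

    obs_time  : observed time T_j for each individual
    lin_pred  : linear predictor beta' x_j for each individual
    h0_times  : sorted times at which the baseline cumulative hazard jumps
    h0_values : value of the baseline cumulative hazard at each jump time
    """
    obs_time = np.asarray(obs_time, dtype=float)
    lin_pred = np.asarray(lin_pred, dtype=float)
    h0_times = np.asarray(h0_times, dtype=float)
    h0_values = np.asarray(h0_values, dtype=float)

    # Index of the last jump time <= T_j; -1 means T_j precedes every jump.
    idx = np.searchsorted(h0_times, obs_time, side="right") - 1
    # Step-function lookup, with H0(t) = 0 before the first jump.
    h0_at_t = np.where(idx >= 0, h0_values[np.clip(idx, 0, None)], 0.0)

    return h0_at_t * np.exp(lin_pred)


# Toy inputs, made up purely for illustration.
residuals = cox_snell_residuals(
    obs_time=[1.0, 2.0, 6.0],
    lin_pred=[0.0, 0.0, np.log(2.0)],
    h0_times=[2.0, 5.0, 7.0],
    h0_values=[0.5, 1.0, 2.0],
)
```

The estimated survival probability at the same time point would then be `np.exp(-residuals)`, following the relation between $\hat S$ and $\hat H$ given above.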

This page shows that the baseline cumulative hazard over time for a Cox model is estimated via:

$$\hat{H}_0(t)=\int_0^t\frac{dN(s)}{\sum_i e^{\beta' x_i}Y_i(s)}$$

Here, $dN(s)$ is the increment in event count at time $s$; $Y_i(s)=1$ if individual $i$ is at risk at $s$ and 0 otherwise, so the sum in the denominator effectively runs over the individuals in the risk set at $s$; and $x_i$ is the set of covariate values for individual $i$, corresponding to the coefficients $\beta$. For this calculation it doesn't matter if there are multiple events at time $s$, as they just increase the value of $dN(s)$ accordingly.
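Because $dN(s)$ is nonzero only at event times, the integral reduces to a sum over distinct event times. A minimal NumPy sketch of that sum follows; the function name `baseline_cumulative_hazard` and the toy data are hypothetical, and the linear predictors $\beta' x_i$ are assumed to have been computed already from whatever fit (Efron or otherwise) produced $\beta$.

```python
import numpy as np

def baseline_cumulative_hazard(time, event, lin_pred):
    """Baseline cumulative hazard as a sum of dN(s) / sum_i exp(beta' x_i) Y_i(s).

    time     : observed (event or censoring) time for each individual
    event    : 1 if the individual had an event at `time`, 0 if censored
    lin_pred : linear predictor beta' x_i for each individual
    Returns the distinct event times and the cumulative hazard at each.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    risk_score = np.exp(np.asarray(lin_pred, dtype=float))  # exp(beta' x_i)

    event_times = np.unique(time[event == 1])  # sorted distinct event times
    increments = np.empty_like(event_times)
    for k, s in enumerate(event_times):
        d_n = np.sum((time == s) & (event == 1))  # dN(s): tied events all count
        denom = np.sum(risk_score[time >= s])     # individuals with Y_i(s) = 1
        increments[k] = d_n / denom

    return event_times, np.cumsum(increments)


# Toy data with two tied events at t = 2 and two at t = 5 (illustrative only).
time = [2, 2, 3, 5, 5, 7]
event = [1, 1, 0, 1, 1, 1]
lin_pred = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # beta = 0, so every risk score is 1

event_times, H0 = baseline_cumulative_hazard(time, event, lin_pred)
```

With all linear predictors set to zero, as in the toy data, each increment is simply the number of events divided by the number at risk, so the tied events at $t=2$ contribute $2/6$ in a single step.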

So the Efron approximation only affects the original estimates of the regression coefficients $\beta$. Once those are determined, ties only determine how much the cumulative hazard estimate is increased at each event time.

EdM