How to interpret the Schoenfeld residuals plot

Question

I want to check the proportional hazards assumption. I used both the test from cox.zph() in R and the Schoenfeld residuals plot from ggcoxzph(). I want to know whether the plot is fine and how to interpret it. The test p-value is not significant, which is good, but what about the Schoenfeld residuals plot?

Does this answer your question? "How to interpret schoenfield residual plot". You seem to be using a package that has a serious coding error in its display of these residuals, as the linked page demonstrates. Fix that problem as indicated, then examine the smoothed plot line for flatness. If it's reasonably flat, you're OK. It may be better just to use the tools in the main survival package. You should be OK, given the fairly high p-value for cox.zph(). — EdM, Jun 27 '22 at 13:16
@user358238 thanks. But does the plot above show that the PH assumption is not fulfilled? The test does not show a violation, but I want to be sure about the scaled residual plot. — user358238, Jun 27 '22 at 15:58
It's very hard to tell from that particular plot, because the coding error in ggcoxzph() makes the y-axis too wide relative to the actual points and the smoothed curve. Try repeating with the standard plot.cox.zph() function in the survival package, or modify the code as in my answer to the linked question. — EdM, Jun 27 '22 at 17:01
@EdM I have loaded library(survival), but I get: Error in plot.cox.zph() : could not find function "plot.cox.zph". — user358238, Jun 27 '22 at 19:42
Sorry for the confusion. plot.cox.zph() is the internal name. If you call plot() on an object returned by the cox.zph() function, the software knows to use the function with that internal name instead of all the other possible plot() functions. When the survival package is loaded you can find the manual page by typing ?plot.cox.zph at the command prompt, but you just call plot() yourself when you want to generate the scaled Schoenfeld residual plot. See Section 3.1 of the survival vignette. — EdM, Jun 27 '22 at 21:21
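A minimal sketch of that call pattern, assuming the lung data set that ships with the survival package and an illustrative model with age and sex (neither is from the original question, so substitute your own data and formula):

```r
library(survival)

# Illustrative model only: the lung data and these covariates are assumptions.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Test of proportional hazards; the result has class "cox.zph".
zp <- cox.zph(fit)
print(zp)

# plot() dispatches to the internal plot.cox.zph() method because of the class.
plot(zp, var = 1)                      # scaled Schoenfeld residuals for the first term
abline(h = coef(fit)["age"], lty = 2)  # reference line at the fitted coefficient
```

A smoothed curve that stays roughly flat around the dashed line is what you would hope to see under proportional hazards.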
@EdM thanks, please see the updated plot above from plot.cox.zph(). Could you help with the interpretation and with whether the PH assumption is fine? — user358238, Jun 28 '22 at 11:42

EdM · Answer 1 · 2024-01-05T15:47:54.900

The proportional hazards (PH) assumption might be OK, but the plot suggests that you need to think carefully about your model. You will have to use your knowledge of the subject matter to make your decisions.

First, do NOT depend on ggcoxzph(). Its plot has extremely wide y-axis limits and improperly drawn confidence limits for the smoothed curve. This probably is related to a long-standing (and evidently still uncorrected) coding error described here. Furthermore, it seems to have cut off values of time beyond about 79. The plot produced by the survival package tools shows that many of your data points are beyond there.

Second, the time transformation used by cox.zph() (note the non-linear spacing of tick marks along the Time axis) has pushed together those late-time events so that most of the plot emphasizes the time range from 71 to about 80. The default Kaplan-Meier time transformation helps minimize contributions from outliers in the usual clinical setting, where there are usually very few and widely spaced event times at late times. In your case, a high proportion of events seems to be at late times. A different time transformation (e.g., identity) might have given a "significant" departure from PH. There does seem to be a dip in the smoothed curve at the later times. Based on your understanding of the subject matter, you and your colleagues have to decide whether the default time transformation is appropriate for your data and whether that apparent dip at late times is big enough to matter. It also seems that there are many tied event times, so make sure that your choice of how to handle tied times was OK for such data.
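One way to see how much the time transformation matters is to rerun the test with several choices of the transform argument of cox.zph(). This is only a sketch on the lung data from the survival package with an illustrative model (both are assumptions, not the data from the question):

```r
library(survival)

# Illustrative model only: the lung data and these covariates are assumptions.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Same model, three different time transformations for the PH test.
zp_km  <- cox.zph(fit, transform = "km")        # the default
zp_id  <- cox.zph(fit, transform = "identity")  # untransformed time
zp_log <- cox.zph(fit, transform = "log")

# p-values side by side (one row per term, plus the global test).
pvals <- cbind(km       = zp_km$table[, "p"],
               identity = zp_id$table[, "p"],
               log      = zp_log$table[, "p"])
print(round(pvals, 3))

# The Time axis of the plot follows whichever transformation was used.
plot(zp_id, var = 1)
```

If the conclusions differ markedly between transformations, that is a sign the result depends on how late event times are weighted, which is the subject-matter decision described above.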

You also need to consider the magnitudes of the scaled Schoenfeld residuals. They are hard to interpret without further information about the model, as they start with the differences between the covariate values for an individual having an event and the risk-weighted average of the corresponding covariate values for those at risk at that time, then scale the differences by the coefficient covariance matrix and the number of events. See this page. Large-magnitude scaled residuals could be something as simple as a large number of events. See this page. It's best to focus on the shape of the smoothed residual plot instead of on the individual scaled residuals.
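In symbols, the classical construction can be sketched as follows. The notation is mine rather than from the linked pages, and the current survival code may differ in detail. For an event at time $t_i$ in an individual with covariate vector $x_i$, the Schoenfeld residual is

$$
r_i = x_i - \bar{x}(\hat\beta, t_i),
\qquad
\bar{x}(\hat\beta, t_i) = \frac{\sum_{j \in R(t_i)} x_j \exp(x_j^\top \hat\beta)}{\sum_{j \in R(t_i)} \exp(x_j^\top \hat\beta)},
$$

and the scaled residual is

$$
r_i^{*} = d \, \hat{V} \, r_i ,
$$

where $R(t_i)$ is the set of individuals at risk at $t_i$, $\hat{V}$ is the estimated covariance matrix of $\hat\beta$, and $d$ is the total number of events. The factor $d$ is why a data set with many events can show scaled residuals of large magnitude even when nothing is wrong.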

Thanks for the explanation. The ties, I guess, are because I added ties = "breslow" in my coxph() call. — user358238, Jun 28 '22 at 13:22
@user358238 With a lot of ties you are generally better off using the Efron correction, which is the default in coxph() (although not in some other software). — EdM, Jun 28 '22 at 13:37
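A quick way to check how much the tie-handling choice matters is to fit the same model both ways and compare the coefficients. The sketch below assumes the lung data from the survival package and an illustrative model, and it uses the ties argument of coxph() as available in recent versions of the package:

```r
library(survival)

# Illustrative model only: the lung data and these covariates are assumptions.
fit_efron   <- coxph(Surv(time, status) ~ age + sex, data = lung, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ age + sex, data = lung, ties = "breslow")

# Coefficients under each approximation, side by side.
print(cbind(efron = coef(fit_efron), breslow = coef(fit_breslow)))

# Rough count of tied event times (status == 2 marks a death in lung).
event_times <- lung$time[lung$status == 2]
n_tied <- sum(duplicated(event_times))
print(n_tied)
```

With few ties the two columns are typically close. The more tied event times there are, the more the choice can matter.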
