
If the log-log curve and the cox.zph test disagree, which one should I trust? I am working on the cgd data set. From the log-log plot I conclude that the effect of sex is time dependent, but the zph test says it is not. One reason I can think of is that the sex groups are of very different sizes: male is around 35 but female is around 168. That is a large difference. What should I do in this case?

User1865345

1 Answer


For me, the most useful evaluations of the proportional hazards (PH) assumption are the smoothed plots of scaled Schoenfeld residuals over (transformed) time, which can be obtained from the object returned by the cox.zph() function.

Stratified log-log plots (in your case, stratified on sex) typically don't account for the associations of other covariates with outcome. An apparent proportionality (or lack of it) in such a plot might not hold once other covariates are taken into account. (I personally can have a lot of trouble interpreting such plots, anyway.)

The text output from the cox.zph() function, on the other hand, is effectively a test of trend in the coefficient over (transformed) time, so it can miss forms of PH failure that don't show up as an overall trend, such as a coefficient that rises and then falls. That might be what is going on in your situation.

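library(survival)
# Counting-process Cox model on the cgd data; cluster(id) accounts for
# repeated rows from the same patient via robust standard errors
cox1 <- coxph(Surv(tstart, tstop, status) ~ sex + treat + inherit + steroids + cluster(id), data = cgd)
# PH test for each covariate, plus a global test
(zph1 <- cox.zph(cox1))
#          chisq df    p
# sex      0.122  1 0.73
# treat    0.518  1 0.47
# inherit  0.525  1 0.47
# steroids 0.137  1 0.71
# GLOBAL   1.121  4 0.89
# Smoothed scaled Schoenfeld residuals for the first covariate (sex)
plot(zph1[1])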

[Figure: Scaled Schoenfeld residuals for sex]

That plot shows a good deal of waviness in the estimated coefficient for sex over time, even though the chi-square test shows no overall trend. You would then need to decide if that possible lack of PH is substantial enough to matter for your application. In this particular data set, if sex wasn't of direct interest, you could deal with the PH problem by stratifying on sex in the model.
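As a minimal sketch of that stratification option (the object names cox2 and zph2 are only illustrative, and this assumes the same covariates as the model above), sex can be moved from a modeled predictor into a strata() term. Each sex then gets its own baseline hazard, so no PH assumption is made for sex, at the cost of no longer getting a coefficient estimate for it.

```r
library(survival)

# Same model as cox1, but sex is a stratification factor rather than a
# modeled covariate: separate baseline hazards for males and females.
cox2 <- coxph(Surv(tstart, tstop, status) ~ treat + inherit + steroids +
                strata(sex) + cluster(id), data = cgd)

# Re-check PH for the covariates that are still modeled.
# sex no longer appears here because it has no coefficient.
zph2 <- cox.zph(cox2)
print(zph2)
```

It is still worth looking at the smoothed residual plots for the remaining covariates, for example plot(zph2[1]), rather than relying on the p-values alone.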

EdM
  • Hello EdM, Thanks for your response. Actually when I selected models using AIC I did not use sex as a covariate, and I used some other covariates. Then I believe it is not necessary to stratify on this, right? – user374070 Nov 28 '22 at 01:21
  • @user374070 if a covariate isn't associated with outcome then there's no need to include it, whether as a modeled predictor or for stratification. In these data, sex isn't strongly associated with outcome. It's generally best to include as many outcome-associated predictors as reasonable (without overfitting) in a Cox model, as Cox models suffer from an omitted-variable bias similar to that in logistic regression. The art of survival modeling includes figuring out how many/which predictors to include, and how. – EdM Nov 28 '22 at 01:56
  • Hi EdM, My goal is to use the stratified Counting Process Approach. However, I also want to select all the possible covariates using AIC. Is it possible to use stepAIC to select this kind of model with a stratified term? – user374070 Nov 28 '22 at 02:14
  • @user374070 it's possible, but unwise. See this page for extensive discussion of the perils of automated model selection. Frank Harrell's course notes and book discuss more reliable strategies (more likely to generalize to new data), particularly in Chapter 4. AIC can play important roles in reliable model building, but you shouldn't depend on it alone for automated model selection. – EdM Nov 28 '22 at 14:18
  • And pay a lot of attention to the multiplicity-adjusted GLOBAL row computed from cox.zph. – Frank Harrell Oct 02 '23 at 12:37