cox.zph(coxph(Surv(time, DEATH_EVENT) ~ age+anaemia+creatinine_phosphokinase+strata(ejection_fraction)+
                serum_sodium+serum_creatinine+hypertension, data=HF))

What exactly am I doing when I am applying strata() to ejection_fraction?

From some research, I've found that stratifying simply splits the data into groups based on the values of the variable you stratify on.

I'm mostly wondering why it is that the feature is no longer included in my model summary.

Like, is ejection_fraction still a significant feature with a summary that says:

                         exp(coef) exp(-coef) lower .95 upper .95
age                         1.0467     0.9554    1.0271     1.067
anaemia1                    1.6510     0.6057    1.0375     2.627
creatinine_phosphokinase    1.0003     0.9997    1.0001     1.001
serum_sodium                0.9658     1.0354    0.9157     1.019
serum_creatinine            1.3940     0.7173    1.1267     1.725
hypertensionPresent         1.9803     0.5050    1.2555     3.124
Antonio

1 Answer


Stratification on a categorical variable allows a separate baseline hazard for each level of that variable. This is useful when baseline hazards over time differ across levels, so that the proportional hazards assumption does not hold for the variable. Because the variable's effect is absorbed into the stratum-specific baseline hazards, the Cox model returns no coefficient for it, which is why it is missing from your summary.
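A minimal sketch of this behaviour, assuming the `lung` dataset that ships with the survival package as a stand-in for the HF data (the choice of `sex` as the stratifying variable is purely illustrative): the stratified variable gets no coefficient, but each of its levels gets its own baseline hazard.

```r
library(survival)

# Illustrative data only: `lung` stands in for the HF data from the question.
# `sex` is categorical (two levels) and is used as the stratifying variable.
fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Only `age` has a coefficient; the stratifying variable has none.
names(coef(fit))

# basehaz() returns one cumulative baseline hazard curve per stratum,
# identified by the `strata` column.
bh <- basehaz(fit, centered = FALSE)
unique(bh$strata)
```

Since there is no coefficient for the stratifying variable, the model summary offers no test of its "significance"; its association with survival shows up only as differences between the stratum-specific baseline hazards.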

As ejection_fraction is a continuous variable, categorizing it to model or stratify isn't a good idea in the first place. You would be better served by modeling it flexibly and continuously, for example with a regression spline. Proper modeling of a continuous variable can also sometimes remove apparent violations of the proportional hazards assumption for the variable. See this page.
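One way this might look, again as a sketch on the `lung` stand-in data rather than the HF data: a natural cubic spline from `splines::ns()` models a continuous covariate flexibly, and `cox.zph()` can then be rerun on that fit. The 3 degrees of freedom are an arbitrary illustrative choice, not a recommendation.

```r
library(survival)
library(splines)

# Illustrative data only: `age` in `lung` plays the role of the continuous
# covariate. With the HF data, the analogous (hypothetical) term would be
# ns(ejection_fraction, df = 3) in place of strata(ejection_fraction).
fit_spline <- coxph(Surv(time, status) ~ ns(age, df = 3) + sex, data = lung)

# Re-check the proportional hazards assumption on the spline model.
zph <- cox.zph(fit_spline)
zph$table
```

Unlike stratification, this keeps the variable in the model, so its overall association with survival can still be assessed, for example with `anova(fit_spline)`.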

EdM