I am trying to understand when Cox models remain informative and useful even though the proportional hazards (PH) assumption is violated, and came across this interesting answer. It includes a link to this nice paper:
Stensrud MJ, Hernán MA. Why Test for Proportional Hazards? JAMA. 2020;323(14):1401-2. https://jamanetwork.com/journals/jama/fullarticle/2763185
which in turn references this paper:
Pak K, Uno H, Kim DH, Tian L, Kane RC, Takeuchi M, et al. Interpretability of Cancer Clinical Trial Results Using Restricted Mean Survival Time as an Alternative to the Hazard Ratio. JAMA Oncol. 2017;3(12):1692-6. https://pubmed.ncbi.nlm.nih.gov/28975263/
They discuss the issues of using the hazard ratio (HR) from a Cox model when the PH assumption is violated:
"The limitations concerning this summary measure have been discussed extensively in the literature. The validity of using the HR depends on the proportional hazards assumption, that is, the HR for 2 groups is constant over the entire study period. This assumption is rarely valid in practice and without this assumption, the resulting HR estimate is difficult to interpret."
Does this suggest that, although an HR may not be easily interpretable, the model itself is not necessarily invalid? Let's assume that neither stratification nor an interaction between time and the time-invariant covariate of interest has been explored.
If we are comparing two nested Cox models and the PH assumption does not hold for one (or both) of them, does this mean that AIC, likelihood ratio tests, etc. are completely useless? Here we are not interested in a specific HR estimate, which may be uninterpretable (and possibly incorrect) if the PH assumption does not hold; we only want to know whether including a variable (resid.ds in this example) improves overall model fit.
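# Hypothetical R setup:
library(survival)
# Full model: includes the variable of interest, resid.ds
fit <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)
# Reduced (nested) model: same covariates without resid.ds
fit2 <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data = ovarian)
# Likelihood ratio test comparing the reduced model with the full model
anova(fit2, fit)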
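For context, here is a minimal sketch of how the PH check and the model comparisons I am asking about could be run side by side. It assumes the same `ovarian` data from the survival package; the object names `zph`, `aic_tab` and `lrt` are just illustrative, and this is not meant as a recommended workflow.

```r
library(survival)

# Same nested models as above, refit so this block runs on its own
fit  <- coxph(Surv(futime, fustat) ~ age + ecog.ps + resid.ds, data = ovarian)
fit2 <- coxph(Surv(futime, fustat) ~ age + ecog.ps, data = ovarian)

# Test of the PH assumption based on scaled Schoenfeld residuals:
# one row per covariate plus a GLOBAL row
zph <- cox.zph(fit)
print(zph)

# AIC of both models, computed from the partial likelihood (lower is better)
aic_tab <- AIC(fit2, fit)
print(aic_tab)

# Likelihood ratio test for adding resid.ds to the reduced model
lrt <- anova(fit2, fit)
print(lrt)
```

Whether `aic_tab` and `lrt` can still be trusted when `zph` flags a violation is exactly the part I am unsure about.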
Thanks.