The final step of a Cox model I've been working on involves performing likelihood-ratio tests to check the significance of each predictor in the model. I'd like to do this using `anova(full_model, model_minus_one_variable)` for each variable.
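A minimal sketch of this full-vs-reduced comparison, using the `veteran` data bundled with the `survival` package (the variables here are stand-ins for my real covariates):

```r
library(survival)

# Full model and a reduced model dropping one covariate (karno here).
full    <- coxph(Surv(time, status) ~ trt + karno + celltype, data = veteran)
reduced <- coxph(Surv(time, status) ~ trt + celltype, data = veteran)

# Likelihood-ratio chi-square test for the dropped term.
anova(reduced, full)
```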
However, I am unable to do this because I used `robust = TRUE` in `coxph()` (i.e., robust standard errors). My decision to do so was driven almost entirely by this post (as linked in the Python lifelines package), since it suggested I'd be better protected from PH violations.
My questions:

1. Would it be appropriate to run my final model with `robust = FALSE` just for the purpose of finding out which variables are significant, or is this bad statistics?
2. If the answer to the above is no, a possible solution could be to manually calculate the significance of variables in the model by likelihood-ratio tests (instead of using `anova`), as here. Is this an appropriate alternative?
3. Was my decision at the beginning of the modeling process to use robust SEs justified? My results (namely coefficients and p-values) for continuous variables with `pspline()` (all of my continuous variables) are very different with and without robust SEs. This appears not to matter too much because (1) the HRs are still the same and (2) the log-likelihoods are comparable. Since I intend to investigate significant variables based on likelihood, I think it won't matter. However, it's probably obvious from this post that I'm confused and doubt my own judgement.
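For question 2, a hand-rolled version of the LR test is straightforward, assuming the fits are `coxph` objects (which store the initial and fitted partial log-likelihoods in `$loglik`). Variable names are again illustrative:

```r
library(survival)

full    <- coxph(Surv(time, status) ~ trt + karno + celltype, data = veteran)
reduced <- coxph(Surv(time, status) ~ trt + celltype, data = veteran)

# LR statistic: 2 * (loglik_full - loglik_reduced); [2] is the fitted model.
lr_stat <- 2 * (full$loglik[2] - reduced$loglik[2])

# Difference in degrees of freedom. NOTE: with pspline() terms the raw
# coefficient count is wrong -- use the effective df (fit$df) instead.
df_diff <- sum(!is.na(coef(full))) - sum(!is.na(coef(reduced)))

p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
c(lr_stat = lr_stat, df = df_diff, p = p_value)
```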
Additional info that may be helpful: number of observations = 100k with 1,655 events; the main objective is to investigate a continuous treatment's effect on mortality, with 48 other covariates. `pspline()` is used on all continuous variables due to evidence of nonlinearity and to account for non-proportional hazards. Strata terms were added for 8 categorical variables due to non-PH. Total df = 138.
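A scaled-down sketch of this kind of specification (made-up roles on the `veteran` data: a penalized spline on a continuous variable, `strata()` on a non-PH categorical variable, robust variance):

```r
library(survival)

fit <- coxph(Surv(time, status) ~ pspline(karno, df = 2) + trt +
               strata(celltype),
             data = veteran, robust = TRUE)
summary(fit)
```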
Comments:

- … `rms` package `anova` function that works with `cph` objects (`cph` is a front-end to `survival::coxph`). – Frank Harrell Sep 14 '22 at 16:00
- … `anova` from `rms` for this purpose, but because I've used `pspline()` I can't. I initially tried using `rcs()` but had fewer PH issues with `pspline()`, so went with that one in the end. The incompatibilities between different methods sometimes make things tricky. Going to redo the analysis now without robust SEs and see how things change. – JED HK Sep 15 '22 at 07:00
- `cox.zph` has an option that will show "terms". The overall assessment for `pspline` and `rcs` should be about the same. – Frank Harrell Sep 15 '22 at 11:13
- … (`terms = TRUE`). I still saw quite some differences in PH between `rcs()` and `pspline()` (more PH violations with the former). `pspline()` seems to be working well, so I'm going to continue with that. Also maybe of interest: I redid the analysis with `robust = FALSE`; the main difference I see so far is that pairwise differences in several categorical covariates are now no longer significant. Not a big deal (this is not my primary outcome of interest), but I am also not sure how to interpret this finding. – JED HK Sep 15 '22 at 11:49
- … `rcs` spline matches that of the `pspline` spline. – Frank Harrell Sep 16 '22 at 12:20
- … the `pspline()` and `rcs()` discrepancy: I guessed it was something to do with knot placement (which I know arguably shouldn't have a large influence, but I don't know how else to explain it). I used `df = 3` on `rcs()` and the default (4) on `pspline()` originally, but have since cut down to `df = 2` on most vars to save on df… – JED HK Sep 16 '22 at 13:28
- … `rsc()` – JED HK Sep 16 '22 at 13:39
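The per-term PH check mentioned in the comments can be sketched like this (again on the `veteran` data, with illustrative variables). With `terms = TRUE`, `cox.zph` reports one PH test per model term, so a `pspline()` expansion is tested as a whole rather than coefficient by coefficient:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ pspline(karno) + trt + strata(celltype),
             data = veteran)

# One row per term (pspline(karno), trt) plus a GLOBAL test.
cox.zph(fit, terms = TRUE)
```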