I am running a Cox regression for survival analysis using the coxph() function in R on a very large dataset. My model is set up as:
Surv(time,event) ~ age + race + sex +...
The study period is 1 year. We found that age had a different effect in the first 6 months (phase 1) than in the last 6 months (phase 2). Because the effect of age depends on time, it violates the proportional hazards (PH) assumption. To account for this, we thought of adding a time-dependent interaction term. The term is:
Agephase2 = (if time < 6 months ~ 0, if time >= 6 months ~ age)
The idea was to add this in the model as follows:
Surv(time,event) ~ Age + Agephase2 + race + sex +...
Here the Age coefficient would represent the effect of age in phase 1, while Agephase2 would represent the effect of age in phase 2. However, the output of our model shows very large effects in the opposite direction to what we anticipated, and the effects of all other covariates are severely altered.
Is there something wrong with this approach? How else could we account for time-dependent covariates in our model, where their effects differ between two periods? Should we just fit two different models? When I researched time-dependent covariates for Cox models, I didn't find much, and what I did find was pretty complex.
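For reference, one way such a period-specific age term is sometimes built is by splitting each subject's follow-up at the cut point, so that the term depends on the interval a subject is at risk in rather than on the subject's final time. Below is a minimal sketch, not a definitive implementation. It assumes the `veteran` data shipped with the survival package as a stand-in for our data and a hypothetical 180-day cut in place of "6 months"; the names `vet2`, `phase` and `agephase2` are illustrative only.

```r
library(survival)

# Stand-in data (assumption): `veteran` from the survival package,
# with time in days and status coded 0 = censored, 1 = died.
# Split follow-up at a hypothetical 180-day cut. Each subject gets one row
# per interval (tstart, time]; `phase` is 1 up to the cut and 2 after it.
vet2 <- survSplit(Surv(time, status) ~ ., data = veteran,
                  cut = 180, episode = "phase")

# Age enters this term only on rows that belong to phase 2.
vet2$agephase2 <- vet2$age * (vet2$phase == 2)

# Counting-process form: `age` is the phase 1 effect,
# `agephase2` is the change in the age effect during phase 2.
# `karno` stands in for the remaining covariates.
fit <- coxph(Surv(tstart, time, status) ~ age + agephase2 + karno,
             data = vet2)
summary(fit)
```

In this layout a subject followed past the cut contributes a phase 1 row with agephase2 = 0 and a phase 2 row with agephase2 = age, whereas a term computed from the final time alone assigns one value to the whole follow-up.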
agegroup changed; perhaps there was some re-setting of the reference when you restructured the data. If you have actual ages those would be preferable to agegroup anyway, and you could model age with a regression spline so that the data could tell you the association between age and outcome without arbitrary boundaries between age groups. See this page and its links. – EdM Apr 17 '23 at 19:44