I am modelling differences between the risk of getting chronic kidney disease (CKD) between two groups, group A and B. I have very long follow-up time, approx. 30 years. The trouble is that CKD diagnosis rates have changed dramatically over these years, from being very rarely diagnosed to being quite commonly diagnosed. This change has also occurred at different rates in different age groups, meaning that some age groups received more intensive diagnosis earlier than other age groups. The relationship between age, calendar year, and baseline risk of diagnosis is therefore highly complex. Since my groups are slightly different in age and calendar time at recruitment, I need to adjust my analysis for this.
I am using a Cox proportional hazards model with extension for time-varying covariates. I have calendar year on the time axis and adjust for age as a spline function. The challenge is that the shape of the association between age and baseline hazard is not the same for all calendar years. Thus I started thinking about the interaction between age and calendar year. The final model therefore looks like this (to allow for different HRs above and below age 65 I have a separate interaction term for this):
$H(year_{in}, year_{out}) \sim group + group:age_{65} + spline(age_{in}):spline(year_{in}) + Z$
This model works splendidly as far as I can judge. It gives sensible estimates, it does not violate the proportional hazards assumption, and does not condition on survival to a certain age, as far as I can see. However, I have not found any sources (including here on CrossValidated) describing benefits or disadvantages of interactions between splines. So my question is:
Are there known problems with interactions between spline functions? I imagine this intuitively as having a surface in the three-dimensional age*year*baseline hazard space.
(Side note, initially I used age on the time axis. However, I noticed that the HR between group A and B was different at different ages, and thus ended up stratifying models on attained age. This, I realised, was not ideal, as the higher-age strata essentially became conditioned on survival until that aged. In the actual model I instead include both a group identifier and a group*age_group interaction, to essentially get two different HRs for the group at two different ages.)
EDIT: Clarified equation and removed second question (2. Is it a problem to use year in the function (although not as its own term, that wouldn't make sense), given that the model already has year on the time axis?) as it has been discussed here.