I am quite new to splines and I have a question.

I am using a Cox model, and I was concerned that some of the variables in the model might have a non-linear effect on survival. So I checked each variable for non-linearity using the rcs function from the rms package with 3 knots, which was very easy. The knots are located at pre-specified quantiles. I then plotted Predict(fitted model, tested covariate). If the curve for the variable was not too twisted on the graph, I concluded that a spline was not necessary for that variable.

My supervisor recommended that I also run a likelihood ratio test for each variable, in the following way:

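 library(survival)  # provides coxph(), pspline() and the stanford2 data
 # Model with a linear effect of age
 fit0 <- coxph(Surv(time, status) ~ age, data = stanford2)
 # Model with a penalized smoothing spline for age (about 3 degrees of freedom)
 fit1 <- coxph(Surv(time, status) ~ pspline(age, df = 3), data = stanford2)
 pred1 <- predict(fit1, type = "terms", se.fit = TRUE)
 # Compare the two fits
 anova(fit0, fit1)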

I noticed that he used pspline rather than rcs, and I would like to use rcs for the likelihood ratio tests he suggested as well.

Question

Can rcs(..., knots = 3) be considered equivalent to pspline(..., df = 3) in my case? Or should one of the two methods be preferred for the graphs and the likelihood ratio tests?

Thanks a lot for your help :-)

Flora Grappelli

1 Answer

The rcs() and pspline() functions are two different ways to implement splines for regression models. Either is OK; they just take different approaches to constructing the splines.

The rcs() function implements what's called a restricted cubic spline. You specify "knot" positions along the range of the predictor. The function then fits smoothly-joining cubic polynomials between the inner knots, with restriction to linear extensions beyond the outer knots.
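As a rough illustration of the restricted cubic spline idea, here is a minimal sketch that uses ns() from the base splines package rather than rcs(). That is a swapped-in function: ns() builds a natural cubic spline, which is also linear beyond its boundary knots, but it uses a different basis than rcs(). The knot placement at the 10th, 50th and 90th percentiles is an assumption meant to mimic the usual rcs() default for 3 knots; the object names are made up for the example. Because the linear model is nested in this unpenalized spline model, the two can be compared with a likelihood ratio test.

```r
library(survival)  # coxph(), Surv(), stanford2 data
library(splines)   # ns(): natural cubic spline basis

# Assumed knot placement: 10th, 50th and 90th percentiles of age
k <- unname(quantile(stanford2$age, c(0.10, 0.50, 0.90), na.rm = TRUE))

# Linear effect of age
fit_lin <- coxph(Surv(time, status) ~ age, data = stanford2)

# Natural cubic spline: one inner knot, linear beyond the two outer knots.
# With 3 knots in total this uses 2 coefficients (1 linear + 1 non-linear).
fit_ns <- coxph(
  Surv(time, status) ~ ns(age, knots = k[2], Boundary.knots = k[c(1, 3)]),
  data = stanford2
)

# Likelihood ratio test of the spline model against the linear model
anova(fit_lin, fit_ns)
```

The single extra degree of freedom in the test corresponds to the one non-linear term that a 3-knot restricted cubic spline adds to the linear term.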

The pspline() function instead uses a type of "smoothing" spline. That function came early in development of the survival package, as Therneau and Grambsch noted (Section 5.8) that such splines could be fit in a Cox model similarly to other "penalized" predictor coefficients. In that implementation, you start with a set of multiple "basis" functions "evenly spaced and identical in shape" (Section 5.8.3) along the range of predictor values. The roughness of the fitted curve is adjusted via penalties on the coefficients associated with those basis functions.
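To make the contrast concrete, the following sketch looks at the basis that pspline() generates for the age variable from the question's stanford2 example. It is only an illustration; the object names are made up, and the exact number of basis columns depends on the function's defaults.

```r
library(survival)  # pspline(), coxph(), stanford2 data

# The basis has many more columns than the requested df = 3;
# the penalty, not the column count, controls the flexibility.
basis <- pspline(stanford2$age, df = 3)
ncol(basis)

# In the fitted model, the penalty is tuned so that the effective
# degrees of freedom of the spline term are close to the requested 3.
fit_ps <- coxph(Surv(time, status) ~ pspline(age, df = 3), data = stanford2)
fit_ps$df
```

So df = 3 in pspline() is a target for the effective degrees of freedom after penalization, whereas 3 knots in rcs() fixes the number of unpenalized coefficients at 2.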

You don't need to fit two separate models, one with and one without the spline, to evaluate linearity if you are willing to use Wald tests. The output from a coxph() model using pspline() directly reports the significance of the combined non-linear terms. With the way that regression splines are implemented in rcs(), there is a combination of linear and non-linear terms, and (at least with the cph() extension of coxph() in the rms package) you can evaluate the significance of the combined non-linear terms.
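A minimal sketch of both Wald-test routes, again using the stanford2 data from the question. The rms part runs only if that package is installed, and the object names are made up for the example.

```r
library(survival)

# Penalized spline: the printed coefficient table has separate
# "linear" and "nonlin" rows for the pspline() term.
fit_ps <- coxph(Surv(time, status) ~ pspline(age, df = 3), data = stanford2)
print(fit_ps)

# Restricted cubic spline: anova() on a cph() fit reports a
# "Nonlinear" row for the rcs() term.
if (requireNamespace("rms", quietly = TRUE)) {
  library(rms)
  fit_rcs <- cph(Surv(time, status) ~ rcs(age, 3), data = stanford2)
  print(anova(fit_rcs))
}
```

In both outputs, the test on the non-linear component is the one that addresses whether a straight-line effect of age is adequate.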

This page, among others on this site, introduces the relative advantages of these and related spline methods for regressions of all types.

EdM