The Cox model with time-varying covariates $x(t)$ is written as follows:

$$h(t|x(t)) = h_0(t)\exp\big((x(t) - \bar x)'\beta\big)$$

The above formulation can be seen here: https://lifelines.readthedocs.io/en/latest/fitters/regression/CoxTimeVaryingFitter.html

Here, can anyone please let me know the following two things:

  1. What is '$\bar{x}$' in the above formula?

  2. How are the coefficients β being calculated?

Edit-1:

In method 1, we calculate the coefficient values that bring the score function (the first derivatives of the log partial likelihood with respect to the coefficients) to 0, as described in this book and shown in the image below:

enter image description here

In method 2, the partial likelihood is maximized via the Newton-Raphson algorithm, which uses the inverse of the Hessian matrix to update the estimate of β, as mentioned here and shown in the image below:

enter image description here

In method 3, the partial likelihood is maximized via the Nelder-Mead algorithm to calculate β, as mentioned here and shown in the image below:

enter image description here

Can somebody please let me know which optimization algorithm this library uses to calculate β in the Cox model?

  • Unlike linear regression, centering the $X$ does not make sense in Cox or logistic models because the "intercept" (the baseline hazard function) would only correspond to prediction-at-the-means, which is a biased estimate of the "average effect of 'x'" in the population. I would say the correct expression for the hazard function with time varying $x$ is $h(t|x(t)) = h_0(t) \exp( x(t) \beta )$. It wouldn't matter for your estimate of $\beta$, but it would if you smoothed the hazard function. It's just a strange decision to center X apropos of nothing. – AdamO Jan 30 '23 at 18:43
  • $\bar{x}$ would be the average $x$ value in the sample, which is up to interpretation. A TVC analysis reshapes the data structure to a "long" format. My choice of $\bar{x}$ would be the "baseline average $X$" assuming (hoping) the data are not left censored. A dumb choice for $\bar{x}$ would be the average $X$ in the long data structure, but I actually expect that that's what's conceptualized in the heuristic guide on this site. – AdamO Jan 30 '23 at 18:47
  • This addition to your question is really a new question, and this site prefers one question per page. Also, specifics of a particular software implementation are off-topic on this site unless they tie directly to a statistical issue. I'd suggest that you contact the package author directly if the source code isn't clear. See his user page on this site or this web page. – EdM Feb 06 '23 at 14:11
  • Thanks a lot @EdM. Will contact the package author. – NN_Developer Feb 07 '23 at 05:28

1 Answer

If the proportional hazard assumption holds, then in principle the choices of reference or 0 values for predictors $x$ don't matter. You could re-write the formula you provided for the hazard as:

$$h(t|x(t)) = h_0(t)\exp(-\bar x' \beta) \exp(x(t)'\beta)= h_{0\bar x}(t) \exp(x(t)'\beta),$$

a constant multiplicative scaling of the original baseline hazard that then works with the un-centered predictor values. Any re-centering of predictor variables just means a matching rescaling of the baseline hazard function, which the Cox model doesn't even evaluate directly.
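As a quick check of that invariance, here is a minimal sketch. It assumes time-fixed covariates, no tied event times, and simulated data. The function `cox_log_partial_likelihood` and all the data are made up for illustration and are not part of lifelines. The log partial likelihood at a given $\beta$ comes out the same whether or not the predictors are centered:

```python
import numpy as np

def cox_log_partial_likelihood(beta, X, time, event):
    """Cox log partial likelihood, assuming no tied event times."""
    eta = X @ beta                        # linear predictor x'beta for each subject
    ll = 0.0
    for i in np.flatnonzero(event):       # sum over observed events only
        at_risk = time >= time[i]         # risk set at this event time
        ll += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return ll

# Simulated (hypothetical) data with predictors deliberately not centered at 0
rng = np.random.default_rng(0)
n = 50
X = rng.normal(loc=3.0, scale=1.0, size=(n, 2))
time = rng.exponential(scale=1.0, size=n)
event = rng.random(n) < 0.7
beta = np.array([0.5, -0.3])

ll_raw = cox_log_partial_likelihood(beta, X, time, event)
ll_centered = cox_log_partial_likelihood(beta, X - X.mean(axis=0), time, event)
print(ll_raw, ll_centered)
```

The constant $\exp(-\bar x'\beta)$ appears in both the numerator and the denominator of every term of the partial likelihood, so it cancels.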

In practice, the exponential can lead to numerical instability. The help page for the R coxph() function says:

The routine internally scales and centers data to avoid overflow in the argument to the exponential function. These actions do not change the result, but lead to more numerical stability.

I suspect that the lifelines implementation centers to avoid that practical problem, with the equation written to show that centering explicitly. I don't know whether it also scales internally.
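To illustrate the kind of overflow that centering avoids, here is a small sketch with made-up predictor values far from 0. It says nothing about what lifelines or `coxph()` actually do internally:

```python
import numpy as np

x = np.array([998.0, 1000.0, 1002.0])    # hypothetical predictor values, far from 0
beta = 1.0                                # hypothetical coefficient

# Un-centered: exp(~1000) exceeds the float64 range and becomes inf
with np.errstate(over="ignore", invalid="ignore"):
    raw = np.exp(x * beta)
    raw_weights = raw / raw.sum()         # inf / inf gives nan

# Centered: the arguments to exp are now -2, 0, 2
centered = np.exp((x - x.mean()) * beta)
centered_weights = centered / centered.sum()

print(raw_weights)        # not usable
print(centered_weights)   # well-defined relative risks
```

The relative risks within a risk set depend only on differences in $x'\beta$, so the centered version gives the mathematically intended weights while the un-centered one breaks down in floating point.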

The coefficients $\beta$ in a Cox model are found by maximizing the partial likelihood of the data as a function of the coefficient values. This page shows the form of the partial likelihood and how it takes censoring into account. You solve by finding the coefficient values that bring the score function (the first derivatives of the log partial likelihood with respect to the coefficients) to 0. This answer shows the form of the score equation for a Cox model, although $\bar x$ in that formula takes on a different meaning: a risk-weighted average of the predictor values in place at an event time.
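For concreteness, here is a minimal sketch of solving the score equation by Newton-Raphson. It assumes time-fixed covariates, no tied event times, and simulated data, with no step-size control or regularization. The function names and the data are hypothetical, and this is not a description of the lifelines source code. With time-varying covariates, each risk set would use the covariate values in place at that event time instead.

```python
import numpy as np

def score_and_hessian(beta, X, time, event):
    """Score vector and Hessian of the Cox log partial likelihood (no ties)."""
    p = X.shape[1]
    score = np.zeros(p)
    hessian = np.zeros((p, p))
    for i in np.flatnonzero(event):
        R = X[time >= time[i]]            # covariates of subjects still at risk
        w = np.exp(R @ beta)
        w /= w.sum()                      # risk weights within the risk set
        xbar = w @ R                      # risk-weighted average of the predictors
        score += X[i] - xbar
        # minus the risk-weighted covariance of the predictors in the risk set
        hessian -= (R.T * w) @ R - np.outer(xbar, xbar)
    return score, hessian

def fit_cox_newton(X, time, event, tol=1e-8, max_iter=50):
    """Plain Newton-Raphson iterations starting from beta = 0."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        score, hessian = score_and_hessian(beta, X, time, event)
        step = np.linalg.solve(hessian, score)
        beta = beta - step                # beta_new = beta - H^{-1} * score
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated (hypothetical) data with known coefficients and random censoring
rng = np.random.default_rng(1)
n = 500
true_beta = np.array([0.8, -0.5])
X = rng.normal(size=(n, 2))
t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
t_cens = rng.exponential(2.0, size=n)
time = np.minimum(t_event, t_cens)
event = t_event <= t_cens

beta_hat = fit_cox_newton(X, time, event)
print(beta_hat)
```

The baseline hazard never appears in these calculations. Only the ordering of the event times and the covariate values within each risk set enter the score and the Hessian.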

Modeling Survival Data: Extending the Cox Model by Therneau and Grambsch goes into extensive detail.

EdM