
I am trying to calculate the spline terms of a logistic regression in order to build a linear predictor / prediction formula for the model "Lymph Node Involvement (Cores)".

The source (https://www.mskcc.org/nomograms/prostate/pre_op/coefficients) states the formula for calculating the spline terms sp1var and sp2var as follows:

[image: published formulas for sp1var and sp2var]

I tried to calculate sp1var and sp2var using the published knots in R:

var=10 # PSA = 10 for example
knot1= 0.2
knot2=4.7
knot3=7.2
knot4=96.53

sp1var <- max (var -knot1)^3 - max(var-knot3)^3 * ((knot4 - knot1)/(knot4-knot3)) + max(var - knot4)^3 * ((knot3 - knot1) / (knot4-knot3))

sp2var <- max (var -knot2)^3 - max(var-knot3)^3 * ((knot3 - knot2)/(knot4-knot3)) + max(var - knot4)^3 * ((knot3 - knot2) / (knot4-knot3))

However, when I calculate the probability (following "Make prediction equation from logistic regression coefficients"), I get a wrong result:

# define the Intercept
Intercept = -5.37368223

# define the coefficients
cAGE = 0.00906354
cPSA = 0.21239809
cPSAs1 = -0.00132481
cPSAs2 = 0.00356913
cGLE = 3.03232465 # for Gleason grade 5
cCLI = 0.71055042 # for clinical stage 3+
cPOS = 0.05499551 # no. of positive cores
cNEG = -0.11987793 # no. of negative cores

# define the predictors
PSA = 10
age = 50
npos = 10 # no. of positive cores
nneg = 10 # no. of negative cores

# calculate the probability
z = Intercept + age * cAGE + PSA * cPSA + sp1var * cPSAs1 + sp2var * cPSAs2 + cGLE + cCLI + npos * cPOS + nneg * cNEG

exp(z)/(1 + exp(z))

result : 0.8962046

expected: 0.39 (https://www.mskcc.org/nomograms/prostate/pre_op)

Am I misinterpreting the stated formulas?

captcoma

1 Answer


Max function needs to include 0

You should use the max function like

max(var-knot1, 0)

instead of

max(var-knot1)

Typo in the function

You need to use the first line below instead of the second. The factor on the knot3 term must be (knot4 - knot2), not (knot3 - knot2).

sp2var <- max (var -knot2)^3 - max(var-knot3)^3 * ((knot4 - knot2)/(knot4-knot3)) + max(var - knot4)^3 * ((knot3 - knot2) / (knot4-knot3))
sp2var <- max (var -knot2)^3 - max(var-knot3)^3 * ((knot3 - knot2)/(knot4-knot3)) + max(var - knot4)^3 * ((knot3 - knot2) / (knot4-knot3))

With both corrections applied, the result agrees with the expected value.
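Putting the two corrections together, the whole calculation could look like the sketch below. The helper names `cube_pos` and `spline_term` are made up for illustration and do not come from the MSKCC page. The knots, coefficients, and inputs are the ones quoted in the question. As in the question, the Gleason grade and clinical stage coefficients are simply added, assuming both indicators equal 1.

```r
# Illustrative sketch; helper names are hypothetical.
# Truncated cubic: (x - k)^3 if x > k, otherwise 0.
# pmax() is used instead of max() so the helper also works on vectors.
cube_pos <- function(x, k) pmax(x - k, 0)^3

# Spline term j (j = 1 or 2) for a vector k of four knots.
spline_term <- function(x, k, j) {
  cube_pos(x, k[j]) -
    cube_pos(x, k[3]) * (k[4] - k[j]) / (k[4] - k[3]) +
    cube_pos(x, k[4]) * (k[3] - k[j]) / (k[4] - k[3])
}

knots <- c(0.2, 4.7, 7.2, 96.53)   # PSA knots from the question

# Predictors from the question
PSA  <- 10
age  <- 50
npos <- 10   # no. of positive cores
nneg <- 10   # no. of negative cores

sp1var <- spline_term(PSA, knots, 1)
sp2var <- spline_term(PSA, knots, 2)

# Linear predictor with the coefficients quoted in the question
z <- -5.37368223 +
  age * 0.00906354 +
  PSA * 0.21239809 +
  sp1var * (-0.00132481) +
  sp2var * 0.00356913 +
  3.03232465 +               # Gleason grade 5
  0.71055042 +               # clinical stage 3+
  npos * 0.05499551 +
  nneg * (-0.11987793)

p <- plogis(z)   # same as exp(z) / (1 + exp(z))
print(p)
```

For these inputs the probability comes out at roughly 0.385, close to the 0.39 reported by the online nomogram.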


Used this way, the maximum function effectively means

$$\max(x,0) = \begin{cases} x & \quad \text{if} \quad x\geq0 \\ 0 & \quad \text{if} \quad x<0 \end{cases}$$

and is a way to get these splines defined as a function of piecewise polynomials.
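Written out with the zero included, the two spline terms would read as follows. The notation is chosen here for illustration: $x$ is the predictor (PSA in the question) and $k_1,\dots,k_4$ are the four knots. This is the usual restricted cubic spline form with four knots, and it should match the published formulas as described in the corrections above.

$$\text{sp1var} = \max(x-k_1,0)^3 - \max(x-k_3,0)^3\,\frac{k_4-k_1}{k_4-k_3} + \max(x-k_4,0)^3\,\frac{k_3-k_1}{k_4-k_3}$$

$$\text{sp2var} = \max(x-k_2,0)^3 - \max(x-k_3,0)^3\,\frac{k_4-k_2}{k_4-k_3} + \max(x-k_4,0)^3\,\frac{k_3-k_2}{k_4-k_3}$$

Each cubic piece switches on only once $x$ passes its knot, so for $x$ below $k_1$ both terms are zero.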


  • Thank you very much, this is of great help. How could the probability difference be explained? Could it be because of a different rounding definition? – captcoma Oct 12 '20 at 07:53
  • @captcoma Those coefficients have a high precision, so a roundoff error seems unlikely to me. It is possible that their online model uses slightly different coefficients, either because the coefficients are newer or older, or because there is some error. – Sextus Empiricus Oct 12 '20 at 08:45
  • I adjusted max as described and it works for the example given above, so your post answered my question. However, with the same settings but PSA=20, I get 0.99 (expected 0.56). Is there anything else that I am missing? Could this be connected to the difference described above? – captcoma Oct 12 '20 at 11:43
  • @captcoma There was also an additional typo in your equation for sp2var. – Sextus Empiricus Oct 12 '20 at 14:56