If I make a model that predicts probabilities (e.g., logistic regression or a neural network), I would like it to have the property that, when it predicts a probability of $p$, the event happens about $100p\%$ of the time. That is, I would like calibrated probabilities.
If I do my work in Frank Harrell’s rms R package, then I can just pass my fitted probability model to the calibrate function and get a nice graph showing how well the predicted probabilities agree with the observed outcomes.
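For concreteness, a minimal sketch of that workflow; the data frame and variable names (d, x1, x2) are invented for illustration, not taken from any real analysis.

```r
library(rms)

# Made-up data: binary outcome y and two predictors.
set.seed(1)
d <- data.frame(
  y  = rbinom(200, 1, 0.4),
  x1 = rnorm(200),
  x2 = rnorm(200)
)

# x = TRUE, y = TRUE store the design matrix and response,
# which calibrate() needs in order to refit the model on bootstrap resamples.
fit <- lrm(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

# B bootstrap repetitions of the full fitting procedure,
# yielding an optimism-corrected calibration curve.
cal <- calibrate(fit, method = "boot", B = 200)
plot(cal)
```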
However, if I do not build my model with that package, Harrell’s calibrate function need not apply to the model object from another package, and so I cannot run the calibration function on my model. Further, even if I write my own version of the function that applies to the model object I have created, the bootstrap procedure in rms::calibrate refits the model on many resamples, so for a gigantic model that takes hours or days to train it will take longer than I might be able to tolerate.
There is a related function, rms::val.prob, that checks whether claimed probabilities align with observed outcomes. Unlike calibration, probability validation does not consider the model-building procedure, so if I created my probabilities using a neural network that takes days to train, all I have to do is supply the predictions and observed outcomes rather than train a new model many times. This could save a substantial amount of computing time, and even if the computing time were not an issue, I would not have to write (and test…) my own software to operate on a regression object from outside rms.
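A sketch of what that call looks like, with phat and y standing in for predictions and outcomes produced elsewhere (both simulated here just so the example runs):

```r
library(rms)

# phat: predicted probabilities from a model trained elsewhere
# (e.g., a neural network); y: the observed 0/1 outcomes.
# Both are simulated placeholders here.
set.seed(1)
lp   <- rnorm(500)                  # a made-up linear predictor
y    <- rbinom(500, 1, plogis(lp))  # observed binary outcomes
phat <- plogis(lp)                  # "external" predicted probabilities

# No refitting: val.prob takes only predictions and outcomes and
# reports calibration/discrimination indexes plus a calibration plot.
val.prob(p = phat, y = y)
```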
The “correct” approach in this situation seems to be calibration, since the predicted probabilities come from a model. However, what is lost by applying the quicker probability validation instead?
Related Questions of Mine
Walk through probability validation
EDIT
As is discussed in the comments, I think the crux here is the optimism bootstrap in the calibration function and what we lose when we apply probability validation instead of retraining the model on resamples, as calibrate does (a sketch of that resampling loop appears after the quoted comment below).
val.prob doesn't seem to be the recalibration, but rather reporting how poorly calibrated the original model is; but it does optionally return a Platt/logistic and nonparametric/lowess recalibration fit. You should generally not fit a calibrator on the same data as the model was fit on, but you can manage that in rms; is that the question, or even after using a separate dataset are you asking about the benefit of the optimism bootstrap parts of calibrate? – Ben Reiniger Nov 02 '22 at 14:28
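To make concrete what is skipped when only val.prob is run, here is a rough sketch of the optimism-bootstrap logic that calibrate performs. calibrate actually corrects the whole predicted-vs-observed curve; this sketch collapses that to a single calibration-error summary just to show the resampling loop, and the helper apparent_calibration() is my own invention, not an rms internal.

```r
library(rms)

# Hypothetical calibration-error summary: mean absolute gap between the
# predicted probabilities and a lowess-smoothed observed curve.
# This helper is mine, not part of rms.
apparent_calibration <- function(fit, data) {
  p  <- predict(fit, newdata = data, type = "fitted")
  sm <- lowess(p, data$y, iter = 0)
  mean(abs(approx(sm$x, sm$y, xout = p, ties = mean)$y - p), na.rm = TRUE)
}

set.seed(1)
d   <- data.frame(y = rbinom(300, 1, 0.4), x1 = rnorm(300), x2 = rnorm(300))
fit <- lrm(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

B <- 100
optimism <- replicate(B, {
  b     <- d[sample(nrow(d), replace = TRUE), ]             # bootstrap resample
  fit_b <- lrm(y ~ x1 + x2, data = b, x = TRUE, y = TRUE)   # refit the full procedure
  # optimism = (error on the resample the model was fit to)
  #          - (error on the original data)
  apparent_calibration(fit_b, b) - apparent_calibration(fit_b, d)
})

# Optimism-corrected estimate of calibration error for the original fit.
apparent_calibration(fit, d) - mean(optimism)
```

This refit-on-every-resample step is exactly what makes calibrate expensive for a model that takes days to train, and it is the part that val.prob, which only ever sees the fixed vector of predictions, cannot reproduce.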