If I make a model that predicts probabilities (e.g., logistic regression or a neural network), I would like it to have the property that, when it predicts a probability of $p$, the event happens about $100p\%$ of the time. That is, I would like calibrated probabilities.
If I do my work in Frank Harrell’s rms R package, then I can just pass my fitted probability model to the calibrate function and get a nice graph showing how well the predicted probabilities agree with the observed outcomes.
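For concreteness, a minimal sketch of that workflow; the data frame and variable names (d, x1, x2) are invented for illustration, not taken from any real analysis.

```r
library(rms)

# Made-up data: binary outcome y and two predictors.
set.seed(1)
d <- data.frame(
  y  = rbinom(200, 1, 0.4),
  x1 = rnorm(200),
  x2 = rnorm(200)
)

# x = TRUE, y = TRUE store the design matrix and response,
# which calibrate() needs in order to refit the model on bootstrap resamples.
fit <- lrm(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

# B bootstrap repetitions of the full fitting procedure,
# yielding an optimism-corrected calibration curve.
cal <- calibrate(fit, method = "boot", B = 200)
plot(cal)
```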
However, if I do not build my model with that package, Harrell’s calibrate function need not apply to the model object from another package, and so I cannot run the calibration function on my model. Further, even if I write my own version of the function that applies to the model object I have created, the bootstrap procedure in rms::calibrate refits the model on many resamples, so for a gigantic model that takes hours or days to train it will take longer than I might be able to tolerate.
There is a related function, rms::val.prob, that checks whether claimed probabilities align with observed outcomes. Unlike calibration, probability validation does not consider the model-building procedure, so if I created my probabilities using a neural network that takes days to train, all I have to do is supply the predictions and observed outcomes rather than train a new model many times. This could save a substantial amount of computing time, and even if the computing time were not an issue, I would not have to write (and test…) my own software to operate on a regression object from outside rms.
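A sketch of what that call looks like, with phat and y standing in for predictions and outcomes produced elsewhere (both simulated here just so the example runs):

```r
library(rms)

# phat: predicted probabilities from a model trained elsewhere
# (e.g., a neural network); y: the observed 0/1 outcomes.
# Both are simulated placeholders here.
set.seed(1)
lp   <- rnorm(500)                  # a made-up linear predictor
y    <- rbinom(500, 1, plogis(lp))  # observed binary outcomes
phat <- plogis(lp)                  # "external" predicted probabilities

# No refitting: val.prob takes only predictions and outcomes and
# reports calibration/discrimination indexes plus a calibration plot.
val.prob(p = phat, y = y)
```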
The “correct” approach in this situation seems to be calibration, since the predicted probabilities come from a model. However, what is lost by applying the quicker probability validation instead?
Related Questions of Mine
Walk through probability validation
EDIT
As is discussed in the comments, I think the crux here is the optimism bootstrap in the calibration function and what we lose when we apply probability validation instead of retraining the model on resamples, as calibrate does (a sketch of that resampling loop appears after the quoted comment below).
val.prob doesn't seem to be the recalibration, but rather reporting how poorly calibrated the original model is; but it does optionally return a Platt/logistic and nonparametric/lowess recalibration fit. You should generally not fit a calibrator on the same data as the model was fit on, but you can manage that in rms; is that the question, or even after using a separate dataset are you asking about the benefit of the optimism bootstrap parts of calibrate? – Ben Reiniger Nov 02 '22 at 14:28
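To make concrete what is skipped when only val.prob is run, here is a rough sketch of the optimism-bootstrap logic that calibrate performs. calibrate actually corrects the whole predicted-vs-observed curve; this sketch collapses that to a single calibration-error summary just to show the resampling loop, and the helper apparent_calibration() is my own invention, not an rms internal.

```r
library(rms)

# Hypothetical calibration-error summary: mean absolute gap between the
# predicted probabilities and a lowess-smoothed observed curve.
# This helper is mine, not part of rms.
apparent_calibration <- function(fit, data) {
  p  <- predict(fit, newdata = data, type = "fitted")
  sm <- lowess(p, data$y, iter = 0)
  mean(abs(approx(sm$x, sm$y, xout = p, ties = mean)$y - p), na.rm = TRUE)
}

set.seed(1)
d   <- data.frame(y = rbinom(300, 1, 0.4), x1 = rnorm(300), x2 = rnorm(300))
fit <- lrm(y ~ x1 + x2, data = d, x = TRUE, y = TRUE)

B <- 100
optimism <- replicate(B, {
  b     <- d[sample(nrow(d), replace = TRUE), ]             # bootstrap resample
  fit_b <- lrm(y ~ x1 + x2, data = b, x = TRUE, y = TRUE)   # refit the full procedure
  # optimism = (error on the resample the model was fit to)
  #          - (error on the original data)
  apparent_calibration(fit_b, b) - apparent_calibration(fit_b, d)
})

# Optimism-corrected estimate of calibration error for the original fit.
apparent_calibration(fit, d) - mean(optimism)
```

This refit-on-every-resample step is exactly what makes calibrate expensive for a model that takes days to train, and it is the part that val.prob, which only ever sees the fixed vector of predictions, cannot reproduce.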