The question: Does anyone know of a way to obtain a confidence interval on a probability estimate produced by a model (e.g., a logistic regression model or a neural network) in a binary classification setting (i.e., the model only outputs a predicted probability, which is then used to decide whether the predicted class is positive or negative)?

Some additional information: Conformal prediction provides a way to obtain a set of predictions $\tau(\mathbf{x})$ for a test instance $\mathbf{x}$ in a multi-class classification setting, where the prediction set contains the true class label with probability (at least) $1-\alpha$. The process of obtaining this set involves a so-called calibration set $D = \{(\mathbf{x}_1,y_1),\dots,(\mathbf{x}_n,y_n)\}$ and a conformity measure $S$. Using $D$ and $S$, a threshold $\hat{q}$ is found such that comparing the conformity score of each class $k=1,\dots,K$, obtained from a model $f$ on a test instance $\mathbf{x}$, against $\hat{q}$ produces a prediction set with $1-\alpha$ confidence. I.e., $\tau(\mathbf{x}) = \{k \in \{1,\dots,K\} : S(f(\mathbf{x})_k) \geq \hat{q}\}$.
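To make the procedure above concrete, here is a minimal sketch of split conformal classification. It rests on several assumptions that are not part of the description above: the conformity score $S$ is taken to be the predicted probability itself, $\hat{q}$ is chosen as the $\lfloor \alpha(n+1) \rfloor$-th smallest calibration score (the usual finite-sample rule), and the function names `conformal_threshold` and `prediction_set` are made up for illustration. The "model" is simulated with random probability vectors rather than a fitted classifier.

```python
import numpy as np


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Compute q_hat from calibration data (hypothetical helper).

    Assumes S is the identity, so the conformity score of a calibration
    point is the probability the model assigned to its true class.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    n = cal_labels.shape[0]
    scores = cal_probs[np.arange(n), cal_labels]
    # Finite-sample rule: take the floor(alpha * (n + 1))-th smallest score.
    k = int(np.floor(alpha * (n + 1)))
    if k == 0:
        # Too few calibration points for this alpha: include every class.
        return -np.inf
    return np.sort(scores)[k - 1]


def prediction_set(probs, q_hat):
    """Return tau(x) = {k : S(f(x)_k) >= q_hat} as a list of class indices."""
    probs = np.asarray(probs, dtype=float)
    return [int(k) for k in np.flatnonzero(probs >= q_hat)]


# Simulated stand-in for a model: random probability vectors, with labels
# drawn from those same probabilities (an assumption made for the demo only).
rng = np.random.default_rng(0)
K, n_cal, n_test = 3, 500, 2000
probs = rng.dirichlet(np.ones(K), size=n_cal + n_test)
labels = np.array([rng.choice(K, p=p) for p in probs])

q_hat = conformal_threshold(probs[:n_cal], labels[:n_cal], alpha=0.1)
covered = [
    labels[i] in prediction_set(probs[i], q_hat)
    for i in range(n_cal, n_cal + n_test)
]
print("q_hat:", q_hat)
print("empirical coverage:", np.mean(covered))
```

Note that the guarantee is marginal (averaged over calibration and test draws) and relies on the calibration and test points being exchangeable; it concerns the class label, not the underlying probability.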

The problem: Conformal prediction provides a class prediction set and is most useful in a multi-class classification setting. I'm wondering if there's a way to take the idea of conformal prediction (or some other idea if necessary/applicable) and use it, instead, to obtain a confidence interval on a predicted probability?

Put differently (and this may be why this is "sort of" a weird ask): I want a confidence interval on a probability estimate such that the interval contains the "true probability" with 95% confidence. I know this is a weird question because, in practice, we never observe "true" probabilities, only the realized outcomes; i.e., we only observe the probability becoming 100% when the event occurs, and only observe that it has "not yet occurred" in the case of the negative class samples.

DMML
  • Does the following answer? https://stats.stackexchange.com/questions/354098/calculating-confidence-intervals-for-a-logistic-regression, https://stats.stackexchange.com/questions/539702/logistic-regression-how-to-compute-a-prediction-interval – kjetil b halvorsen Feb 27 '24 at 02:31
  • @kjetilbhalvorsen Thank you very much for the response. Those links are indeed very helpful. This is what I'm looking for. However, I'm also wondering whether this can be done for any arbitrary model that outputs probability estimates? – DMML Feb 29 '24 at 19:47

0 Answers