Suppose I have a custom cross-entropy-like loss function, defined below. How can I prove that this loss function is classification-calibrated?

$$ L=-\frac{1}{n}\sum_{i=1}^n\sum_{j=1}^c \left[ w_j^{(1-p_{ij})} \; y_{ij} \; \log(p_{ij}) - p_{ij}(1-p_{ij}) \right] $$

where $w_j$ is the class weight of class $j$, $y_{ij}$ is the $j^{\text{th}}$ element of the one-hot encoded label of instance $\textbf{x}_i$, and $p_{ij}$ is the predicted probability of class $j$ for instance $i$.
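For concreteness, here is a minimal NumPy sketch of this loss, assuming the bracketing shown above (both terms inside the double sum); the function name, array shapes, and the clipping constant are illustrative choices:

```python
import numpy as np

def custom_loss(y, p, w, eps=1e-12):
    """Custom cross-entropy-like loss from the question.

    y : (n, c) one-hot encoded labels
    p : (n, c) predicted class probabilities
    w : (c,)   per-class weights

    Assumes both terms sit inside the double sum, as bracketed above.
    """
    p = np.clip(p, eps, 1.0)  # guard against log(0)
    inner = w[None, :] ** (1.0 - p) * y * np.log(p) - p * (1.0 - p)
    return -np.mean(np.sum(inner, axis=1))  # -(1/n) * sum_i sum_j [...]
```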


1 Answer


Calibration is about how correct your predictions are on average. As an example, say you want to predict whether it is going to rain tomorrow. If your perfectly calibrated model tells you there is a 70% chance of rain, it means that, of all the days on which you get such a score from your estimator, it will actually rain on 70% of them.

For the particular case of classification, you need to check whether the scores you get from your classifier match the observed classification rates. That is, if you take all samples that received a score around 0.7, the fraction of them that are actually positive should be close to 0.7.
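As a minimal sketch of that check, with synthetic `scores` and `labels` that are well calibrated by construction (the 0.65–0.75 bin is an arbitrary choice):

```python
import numpy as np

# Synthetic predicted scores and true binary labels; labels are drawn
# with P(y=1) = score, so the model is calibrated by construction.
rng = np.random.default_rng(0)
scores = rng.uniform(0, 1, size=10_000)
labels = (rng.uniform(0, 1, size=10_000) < scores).astype(int)

# Take all samples whose score falls in a narrow bin around 0.7 ...
in_bin = (scores >= 0.65) & (scores < 0.75)

# ... and compare the mean predicted score with the observed positive rate.
print("mean predicted score:  ", scores[in_bin].mean())  # ~0.70
print("observed positive rate:", labels[in_bin].mean())  # ~0.70 if calibrated
```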

You can plot those observations as calibration curves (see scikit-learn for an example) and also compute the Brier score (see Wikipedia).
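For instance, a self-contained sketch using scikit-learn's `calibration_curve` and `brier_score_loss` (the synthetic data and the bin count are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Synthetic scores and labels, calibrated by construction as before.
rng = np.random.default_rng(0)
scores = rng.uniform(0, 1, size=10_000)
labels = (rng.uniform(0, 1, size=10_000) < scores).astype(int)

# Observed positive rate vs. mean predicted score in each of 10 bins.
prob_true, prob_pred = calibration_curve(labels, scores, n_bins=10)

plt.plot(prob_pred, prob_true, marker="o", label="model")
plt.plot([0, 1], [0, 1], linestyle="--", label="perfect calibration")
plt.xlabel("Mean predicted probability")
plt.ylabel("Observed positive rate")
plt.legend()
plt.show()

# Brier score: mean squared difference between predicted probability and outcome.
print("Brier score:", brier_score_loss(labels, scores))
```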
