If the AUC score is 100 percent, can the F1 value be 99.94 percent? I would expect it to be 100 percent, too.
-
Does this answer your question? What are the differences between AUC and F1-score? – Marjolein Fokkema May 12 '22 at 17:21
-
@MarjoleinFokkema unfortunately not – Peter May 12 '22 at 17:22
-
I hoped it would be of help because the accepted answer states that AUROC summarizes the quality of a range of different thresholds, while F1 gives the quality of a single threshold on the predicted values. – Marjolein Fokkema May 12 '22 at 18:09
1 Answer
$AUC$ measures the separability of the probability outputs of your model. If the positive group's lowest probability of being positive is greater than the negative group's highest probability of being positive, then you will achieve $AUC=1$.
However, calculating $F_1$ requires you to apply a particular threshold (and only that threshold). Software usually picks that threshold to be a probability of $0.5$. The two groups need not be separable at $0.5$. It could be that every probability value (across both groups) exceeds $0.5$ or is lower than $0.5$.
Consequently, there should not be any expectation that threshold-based metrics be perfect when $AUC=1$.
If $AUC=1$, then any threshold between the highest value for the negative group and the lowest value for the positive group results in classifying everything correctly, so all threshold-based metrics (at such a threshold) should be perfect (on the data set that generated the $AUC=1$, not necessarily in general if you use additional data).
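A minimal sketch of this, using hypothetical probability outputs that are perfectly separable yet all exceed $0.5$ (the class labels and probability values here are made up for illustration):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score

# Perfectly separable: every positive's probability exceeds every
# negative's, so AUC = 1, yet all probabilities are above 0.5.
y_true = np.array([0, 0, 0, 1, 1, 1])
p = np.array([0.55, 0.60, 0.65, 0.70, 0.80, 0.90])

print(roc_auc_score(y_true, p))                       # 1.0

# The usual software default threshold of 0.5 predicts everything
# positive: precision = 3/6, recall = 1, so F1 = 2/3, not 1.
print(f1_score(y_true, (p >= 0.5).astype(int)))

# Any threshold between 0.65 (highest negative) and 0.70 (lowest
# positive) classifies everything correctly, giving F1 = 1.
print(f1_score(y_true, (p >= 0.675).astype(int)))     # 1.0
```

This illustrates both halves of the answer: the default threshold need not fall in the gap between the two groups, but some threshold in that gap yields a perfect F1 on this data set.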
But of course, it should be possible to pick a threshold such that the F1 score is 1, right? – Sextus Empiricus May 12 '22 at 17:26
-
@SextusEmpiricus If $AUC=1$, then any threshold between the highest value for the negative group and the lowest value for the positive group results in classifying everything correctly, so all threshold-based metrics (at such a threshold) should be perfect (on the data set that generated the $AUC=1$). – Dave May 12 '22 at 18:11
-
@Peter To accomplish what? To make a model with a certain $AUC$ score higher on $F_1$ than a competitor that has a lower $AUC?$ Those don't have to move together. They are different metrics for a reason, and they don't even evaluate the same model, since $F_1$ requires a threshold to be chosen. – Dave May 13 '22 at 17:45