As the title suggests, I am running a Random Forest classifier in Scala. Because my classes are highly imbalanced, I evaluated it with BinaryClassificationEvaluator. The area under the PR curve is > 0.5, but when I print the confusion matrix, recall and precision both appear to be 0 (I have 0 TP predictions).

Is this mathematically possible?

Sycorax
Toutsos

1 Answer

The confusion matrix looks at just one threshold, but the PR curve looks at all thresholds. Suppose your classifier assigns a score of 0.48 to every negative and 0.49 to every positive. Those scores rank the two classes perfectly, so the area under the PR curve is high. But under the classification rule "if prediction > 0.5, positive; else negative", nothing is predicted positive, so you'll have 0 TP predictions. Something similar is happening here.
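To make the 0.48 / 0.49 scenario concrete, here is a minimal sketch in plain Python (standard library only). The class sizes (95 negatives, 5 positives) and the helper names `confusion`, `pr_points` and `average_precision` are made up for illustration. The area is computed with a simple step-wise sum, which is an assumption: the evaluator used in the question may interpolate the curve differently and report a somewhat different number.

```python
# Hypothetical data: every negative scores 0.48, every positive scores 0.49.
labels = [0] * 95 + [1] * 5
scores = [0.48] * 95 + [0.49] * 5


def confusion(labels, scores, threshold):
    """Return (TP, FP, FN, TN) for the rule 'score > threshold => positive'."""
    tp = sum(1 for y, s in zip(labels, scores) if s > threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s > threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s <= threshold and y == 1)
    tn = sum(1 for y, s in zip(labels, scores) if s <= threshold and y == 0)
    return tp, fp, fn, tn


def pr_points(labels, scores):
    """(recall, precision) at each distinct score used as a cutoff (score >= cutoff)."""
    n_pos = sum(labels)
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        predicted = sum(1 for s in scores if s >= cutoff)
        points.append((tp / n_pos, tp / predicted))
    return points


def average_precision(points):
    """Step-wise area under the PR curve: sum of precision * recall increment."""
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area


tp, fp, fn, tn = confusion(labels, scores, 0.5)
points = pr_points(labels, scores)
ap = average_precision(points)
print("confusion at 0.5 (TP, FP, FN, TN):", (tp, fp, fn, tn))
print("PR points (recall, precision):", points)
print("area under PR (step-wise):", ap)
```

At the 0.5 threshold no example is predicted positive, so TP is 0, while the step-wise area under the PR curve for these scores is 1.0. Calling `confusion(labels, scores, 0.485)` instead recovers all 5 positives with no false positives, which suggests the threshold, not the ranking, is the problem in a case like this.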

Sycorax