
I am running a binary classification with a random forest via the ranger package in R, and I am using the pROC package to build an ROC curve and calculate the AUC. However, the coords() function returns only three rows, and two of them have thresholds of -Inf or Inf.

> coords(rf_roc)
  threshold specificity sensitivity
1      -Inf       0.000       1.000
2       0.5       0.692       0.806
3       Inf       1.000       0.000

My question is: Is this normal, or does it indicate that there is an issue in my model/data?

For reference, I've included the confusion matrix results:

Confusion Matrix and Statistics
          Reference
Prediction   0   1
         0 211  72
         1  94 300

           Accuracy : 0.755              
             95% CI : (0.721, 0.787)     
No Information Rate : 0.549              
P-Value [Acc > NIR] : <0.0000000000000002

              Kappa : 0.502              

Mcnemar's Test P-Value : 0.103

        Sensitivity : 0.806              
        Specificity : 0.692              
     Pos Pred Value : 0.761              
     Neg Pred Value : 0.746              
         Prevalence : 0.549              
     Detection Rate : 0.443              

Detection Prevalence : 0.582
   Balanced Accuracy : 0.749

   'Positive' Class : 1

Ben Reiniger
  • It is a tad unusual, because realistically, if we are using a probit or logit link function, as most methods implementing probabilistic classifiers do, $\geq 0$ should suffice, but to play it safe using $\infty$ is OK too. ROC is not defined only for classifiers, after all. – usεr11852 Jan 26 '23 at 22:38

1 Answer


It makes sense. The only way to be assured of always getting perfect sensitivity is to classify everything as the positive class, which is equivalent to classifying as positive whenever the predicted value exceeds $-\infty$. Likewise, the only way to be assured of perfect specificity is to classify everything as the negative class, which is equivalent to classifying as negative whenever the predicted value is below $+\infty$. While you might be able to get perfect sensitivity by setting the threshold just below the lowest prediction, or perfect specificity by setting it just above the highest prediction, those thresholds only work for your particular data and cannot, in general, be assured of giving perfect sensitivity or specificity. Thus, it makes sense that the developers would hard-code the $\pm\infty$ thresholds in the coords function, as those are the only thresholds that assure perfect sensitivity or specificity (granted, at the expense of each other).
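
To make that equivalence concrete, here is a minimal sketch with made-up scores and labels (nothing to do with the OP's model): thresholding at $-\infty$ labels every observation positive, and thresholding at $+\infty$ labels every observation negative.

# Minimal sketch with made-up numbers: the -Inf threshold classifies
# everything as positive, the +Inf threshold classifies everything as negative.
y <- c(0, 0, 1, 1, 1)           # true classes
s <- c(0.2, 0.6, 0.4, 0.7, 0.9) # predicted scores

classify <- function(score, threshold) as.integer(score > threshold)

classify(s, -Inf) # 1 1 1 1 1 -> sensitivity 1, specificity 0
classify(s,  Inf) # 0 0 0 0 0 -> sensitivity 0, specificity 1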

I even get the infinite thresholds for some simulated data. The fact that this example also happens to achieve perfect sensitivity and specificity at finite thresholds is addressed below in EDIT 2.

set.seed(2023)
N <- 20
p <- rbeta(N, 1/2, 1/2) # simulated probability predictions
y <- rbinom(N, 1, p)    # simulated true classes
r <- pROC::roc(y, p)    # ROC on the probability scale
pROC::coords(r)         # every threshold with its specificity and sensitivity

################################################################################

> pROC::coords(r)
    threshold specificity sensitivity
1        -Inf         0.0         1.0
2  0.02950050         0.1         1.0
3  0.04154948         0.2         1.0
4  0.07094372         0.3         1.0
5  0.09818741         0.4         1.0
6  0.11496482         0.5         1.0
7  0.13298654         0.6         1.0
8  0.16426112         0.7         1.0
9  0.22426518         0.8         1.0
10 0.41298569         0.8         0.9
11 0.60549653         0.8         0.8
12 0.68211752         0.8         0.7
13 0.75138334         0.8         0.6
14 0.78644297         0.8         0.5
15 0.79864440         0.8         0.4
16 0.81317781         0.8         0.3
17 0.84750176         0.9         0.3
18 0.88146272         1.0         0.3
19 0.92519032         1.0         0.2
20 0.98128671         1.0         0.1
21        Inf         1.0         0.0

EDIT

The comments correctly point out that, in the above example, you can get perfect sensitivity or specificity without going to infinite thresholds. That is because my example simulates probability predictions from a model, which are bounded in $[0, 1]$. However, the roc function can handle any continuous score, such as log-odds, which is unbounded.

set.seed(2023)
N <- 20
p <- rbeta(N, 1/2, 1/2) # simulated probability predictions
y <- rbinom(N, 1, p)    # simulated true classes
r <- pROC::roc(y, log(p/(1-p)))  # ROC on the (unbounded) log-odds scale
pROC::coords(r)

As the probabilities get very small or very large, the log-odds approach $-\infty$ and $+\infty$, respectively.
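
As a small base-R illustration (not part of the original simulation), the qlogis transform makes this explicit:

# The logit (log-odds) transform sends probabilities near 0 and 1 toward
# -Inf and +Inf, so no finite threshold is guaranteed to cover every
# possible prediction on this scale.
p <- c(1e-6, 0.01, 0.5, 0.99, 1 - 1e-6)
qlogis(p)        # same as log(p / (1 - p))
qlogis(c(0, 1))  # -Inf  Inf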

EDIT 2

The reason there are perfect sensitivity and specificity values in the R output above is that the simulation happens not to assign a true class of $1$ to any of the very low p values, nor a true class of $0$ to any of the very high p values. In theory, though, a point with any probability in $(0, 1)$ (or any real value of the log-odds) can have either true category, no matter how remote the chance of that happening, so the $\pm\infty$ thresholds are the only ways of assuring perfect sensitivity (catch all $1$s, no matter how many $0$s are misclassified) or perfect specificity (catch all $0$s, no matter how many $1$s are misclassified).
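
To make that concrete, here is a sketch that extends the simulation above with one hypothetical extra observation: a true positive that happens to receive a very low predicted probability. With that point in the data, no finite threshold in the coords output keeps sensitivity at $1$; only the $-\infty$ row does.

set.seed(2023)
N <- 20
p <- rbeta(N, 1/2, 1/2) # simulated probability predictions
y <- rbinom(N, 1, p)    # simulated true classes

# Hypothetical unlucky case: a very low predicted probability
# that nonetheless belongs to the positive class.
p2 <- c(p, 0.001)
y2 <- c(y, 1)

r2 <- pROC::roc(y2, p2)
head(pROC::coords(r2), 3) # only the -Inf row still has sensitivity 1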

Dave
    "The only way to get perfect sensitivity is to classify everything as the positive class" That isn't true, and your example shows it. – Ben Reiniger Jan 26 '23 at 22:17
  • I think it is worth noting that for most probabilistic classifiers that already use probit/logit link functions, using $[0, 1]$ would be fine too. The issue is that we can define ROC for any continuous metric. – usεr11852 Jan 26 '23 at 22:37
  • @BenReiniger I wonder where exactly you see such a counterexample in the ROC shown in Dave's example. You realize classifying each observation as a positive equates to setting the threshold to $- \infty$ correct? – AdamO Jan 26 '23 at 22:44
  • @BenReiniger I have made some edits in light of your insightful comment. – Dave Jan 27 '23 at 02:07
  • @AdamO e.g. at 0.2 the sensitivity is still 1. Dave has addressed that in the second edit section, plus the addition of the phrase "assured of" at the beginning. – Ben Reiniger Jan 27 '23 at 05:34
  • @BenReiniger ah the classic confusion of "for this particular problem" vs "for any particular problem". – AdamO Jan 27 '23 at 15:10
  • Thank you for your edits. +1 – usεr11852 Jan 27 '23 at 18:02