
I'm curious to see if there are any useful metrics to evaluate classification models using numeric probabilities.

Traditionally, I would train a classification model, generate factor predictions on the test set, and use a confusion matrix or ROC curve to decide on the best model. However, in this instance, I'm interested in evaluating models based on the numeric probabilities themselves.

Update

An example of what I'm talking about is this: I fit multiple models and have each predict classes on the test set. Usually I can create a confusion matrix for each:

Model 1

          Yes  No
     Yes   10   5
     No     2  13

Model 2

          Yes  No
     Yes    3  11
     No     8   8

From the confusion matrices, I can clearly tell that model 1 (23/30 correct) is more accurate than model 2 (11/30 correct).
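For concreteness, accuracy falls straight out of each matrix as the diagonal over the grand total. A minimal Python sketch (the layout here, rows = predicted and columns = actual, is my assumption about the tables above):

    import numpy as np

    # Confusion matrices from above; assumed layout: rows = predicted, cols = actual
    model1 = np.array([[10,  5],    # predicted Yes
                       [ 2, 13]])   # predicted No
    model2 = np.array([[ 3, 11],
                       [ 8,  8]])

    for name, cm in [("Model 1", model1), ("Model 2", model2)]:
        accuracy = np.trace(cm) / cm.sum()   # correct predictions sit on the diagonal
        print(f"{name}: accuracy = {accuracy:.2f}")
    # Model 1: accuracy = 0.77, Model 2: accuracy = 0.37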

How would I evaluate the two models if they gave me numeric probabilities instead? For instance:

Model-1 Preds    Model-2 Preds    Test Set
     .59              .25            No
     .14              .08            No
     ...              ...            ...
     .33              .29            Yes

I have thought about discretizing them, or converting the yes/no labels into 1s and 0s and calculating the residuals. I just wanted to know whether there are more formal best practices for this case.
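On the residual idea: averaging the squared residuals of 0/1-encoded labels against the predicted probabilities is exactly the Brier score, one of the standard proper scoring rules, and log loss is its heavier-tailed cousin; AUC instead scores only how well the probabilities rank positives above negatives. A minimal scikit-learn sketch, assuming Yes = 1 / No = 0; the first two and last entries follow the table above, and the middle entries are made-up fillers for the elided rows:

    import numpy as np
    from sklearn.metrics import brier_score_loss, log_loss, roc_auc_score

    y_true = np.array([0, 0, 1, 0, 1, 1])  # test-set labels, encoded No=0, Yes=1
    model1 = np.array([0.59, 0.14, 0.80, 0.20, 0.70, 0.33])  # P(Yes) from Model 1
    model2 = np.array([0.25, 0.08, 0.55, 0.60, 0.40, 0.29])  # P(Yes) from Model 2

    for name, p in [("Model 1", model1), ("Model 2", model2)]:
        print(name,
              "Brier:", round(brier_score_loss(y_true, p), 3),  # mean squared residual
              "log loss:", round(log_loss(y_true, p), 3),       # punishes confident misses
              "AUC:", round(roc_auc_score(y_true, p), 3))       # pure ranking quality

Lower Brier and log loss are better; higher AUC is better. None of these requires picking a classification threshold first.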

Minh
