Metric to evaluate binary classification model on imbalanced dataset in order to meet percentage limitations

Question

For a university class I'm working on an imbalanced dataset with a 43:1 ratio of Class_0 to Class_1.

Class_1 refers to companies that have declared bankruptcy, based on the feature columns of the dataset; Class_0 refers to companies that haven't. My goal is to create the best possible classification model for identifying the companies that will go bankrupt, but I have the following two limitations:

The model must find with a success rate of at least 62% the companies that will go bankrupt. The model must find with a success rate of at least 70% the companies that won't go bankrupt.

I've implemented my Python code in Colab and run several models (Logistic Regression, Naive Bayes, k-Nearest Neighbors, SVM, Neural Networks, etc.), tuning their hyperparameters, and I have saved the results of some metrics in an .xlsx file. I have also recorded results for the same dataset after undersampling the majority Class_0 in the training set to bring the ratio down to 3:1.

I asked my professor which metric I should use to check the 62% and 70% limitations mentioned above. He told me that it depends on which class is considered positive, and that I should look it up here: Confusion Matrix-Wikipedia.
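To make the "which class is positive" point concrete, here is a minimal sketch of the per-class success rate, i.e. the recall of each class. If Class_1 is treated as positive, the recall of Class_1 is the sensitivity and the recall of Class_0 is the specificity; swapping the positive class swaps the names but not the numbers. The helper `recall_for_class` and the toy labels are hypothetical and only for illustration.

```python
def recall_for_class(y_true, y_pred, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    predictions_for_label = [p for t, p in zip(y_true, y_pred) if t == label]
    if not predictions_for_label:
        raise ValueError(f"no samples with true label {label!r}")
    correct = sum(p == label for p in predictions_for_label)
    return correct / len(predictions_for_label)


# Hypothetical toy labels, NOT taken from the bankruptcy dataset.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

# With Class_1 as the positive class:
bankrupt_rate = recall_for_class(y_true, y_pred, label=1)  # sensitivity
healthy_rate = recall_for_class(y_true, y_pred, label=0)   # specificity

print(f"Class_1 recall: {bankrupt_rate:.3f}")
print(f"Class_0 recall: {healthy_rate:.3f}")
```

These two numbers should correspond to the per-class "recall" column that a classification report typically prints.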

My .xlsx file looks something like this:

Unbalanced
Balanced

Also after Model.fit() I print a classification report to help me understand what my model finds for each class:

My question is: which metric from the link should I use for the evaluation he wants?

You have to meet two performance criteria, not just one. That is, no matter how good some combined criterion is, if you don’t get the required 62 or the required 70, your model is inadequate. // Your task is your task and probably not something you can change, but do note the problems with threshold-based metrics in many cases. Related, note that you are not bound to use $0.5$ as a threshold for making a hard classification. — Dave, Nov 10 '22 at 19:04
@Dave thanks for your answer, it helped me understand some things. — pchi, Nov 10 '22 at 19:23
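The remark about not being bound to a $0.5$ threshold can be sketched as a small threshold sweep. This is only an illustration under assumptions: Class_1 is the positive class, the model outputs a score for Class_1, and the scores, labels, candidate thresholds, and the helper `feasible_thresholds` are all hypothetical.

```python
def feasible_thresholds(y_true, scores, thresholds,
                        min_sensitivity=0.62, min_specificity=0.70):
    """Return the thresholds at which both per-class requirements are met."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    feasible = []
    for thr in thresholds:
        # Predict Class_1 when the score is at least the threshold.
        sensitivity = sum(s >= thr for s in positives) / len(positives)
        specificity = sum(s < thr for s in negatives) / len(negatives)
        if sensitivity >= min_sensitivity and specificity >= min_specificity:
            feasible.append(thr)
    return feasible


# Hypothetical scores for 5 bankrupt (1) and 10 non-bankrupt (0) companies.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.35, 0.1,
          0.45, 0.3, 0.3, 0.2, 0.2, 0.15, 0.1, 0.05, 0.05, 0.02]

candidates = [0.5, 0.4, 0.35, 0.3, 0.2]
print(feasible_thresholds(y_true, scores, candidates))
```

In this toy data the default $0.5$ cutoff misses the sensitivity requirement, while some lower cutoffs satisfy both. In practice the threshold should be chosen on validation data, not on the test set.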

score 0 · Answer 1 · answered Nov 15 '22 at 04:18

You need to meet both criteria, and that’s all there is to it. No matter what any single metric says, if you fall short of either the $62\%$ or $70\%$ requirements, your model is inadequate.

Good $F_1$ score but inadequate sensitivity? FAIL

Good $AUC$ but inadequate specificity? FAIL

You are not in a position where you can consider just one metric.
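As a minimal sketch of that pass/fail logic, assuming Class_1 (bankrupt) is the positive class: the confusion-matrix counts below are invented for illustration (roughly a 43:1 class ratio) and the helper `evaluate` is hypothetical.

```python
MIN_SENSITIVITY = 0.62  # required success rate on companies that go bankrupt
MIN_SPECIFICITY = 0.70  # required success rate on companies that do not


def evaluate(tp, fn, tn, fp):
    """Compute sensitivity, specificity and accuracy, and check both requirements."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    passes = sensitivity >= MIN_SENSITIVITY and specificity >= MIN_SPECIFICITY
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "passes": passes,
    }


# Invented counts: 100 bankrupt and 4300 non-bankrupt companies in each case.
model_a = evaluate(tp=50, fn=50, tn=4250, fp=50)    # high accuracy, low sensitivity
model_b = evaluate(tp=70, fn=30, tn=3200, fp=1100)  # lower accuracy, both criteria met

for name, result in [("model_a", model_a), ("model_b", model_b)]:
    verdict = "PASS" if result["passes"] else "FAIL"
    print(name, verdict, {k: round(v, 3) for k, v in result.items() if k != "passes"})
```

Here the model with the higher accuracy fails because its sensitivity is below the requirement, while the model with the lower accuracy passes both checks.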
