
I've been handed a binary classification model to look after. Models are compared using the F1 score. The challenge is that the F1 score on the test dataset is very high, around 99%, which seems implausible given the business question being asked. What typical issues cause such a suspiciously high figure?

I'm looking into the following:

  1. Whether there is any data leakage
  2. Imputation during data cleansing: mean and mode imputation are applied to numerical fields. I have concerns this is reducing variance, and I am pushing for MICE or similar if possible
  3. Outliers in the dataset that should be removed
  4. The dataset is imbalanced, but I am taking counsel from this page
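On point 4, one thing I'm checking is how much of the 99% is explained by imbalance alone. The sketch below (synthetic data, hypothetical 99:1 class ratio, not my real dataset) shows that a trivial majority-class predictor already scores a near-perfect F1 on the majority class:

```python
# Sketch: under heavy class imbalance, a trivial "always predict the
# majority class" baseline already scores a very high F1 on that class.
# The data below is synthetic and purely illustrative.
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# Hypothetical 99:1 imbalance in favour of the positive class
y = rng.choice([0, 1], size=10_000, p=[0.01, 0.99])
X = rng.normal(size=(len(y), 5))  # features are irrelevant to this baseline

baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = baseline.predict(X)
print(f1_score(y, pred))  # ~0.995: precision ~0.99, recall 1.0
```

If the real model only marginally beats a baseline like this, the 99% figure says more about the class ratio than about the model.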

Are there any other thoughts on what to check here? Normally the challenge is to increase the F1 score, not to explain away a high one.
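For point 1, one quick sanity check I'm running is whether any rows appear verbatim in both the train and test splits, which is a common source of leakage. The column names and toy frames below are hypothetical stand-ins for my real splits:

```python
# Sketch: detect rows duplicated across train and test splits.
# Toy data with hypothetical columns "a" and "b"; substitute real splits.
import pandas as pd

train = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
test = pd.DataFrame({"a": [3, 7], "b": [6, 8]})

# Inner merge on all shared columns keeps only rows present in both splits
overlap = test.merge(train, how="inner")
print(len(overlap))  # 1 leaked row in this toy example: (3, 6)
```

This only catches exact duplicates; leakage via derived features (e.g. a column computed from the target) needs a separate audit.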

DanDanDan
  • Have you tried visualizing your data? Try looking at histograms or density plots of continuous input features grouped by the class label, and contingency tables of categorical input features by class label – jkpate May 05 '22 at 14:59
  • How imbalanced is the dataset? Extreme imbalance could make a 99% F1 score not mean that much. You might try a threshold-independent performance metric such as AUROC and see if you're still seeing extremely high metrics. – chang_trenton May 05 '22 at 19:14
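The threshold-independent check suggested in the second comment can be sketched as follows, comparing F1 at the default 0.5 threshold against AUROC on a held-out split. The synthetic dataset and model here are placeholders for the real pipeline:

```python
# Sketch: compare threshold-dependent F1 against threshold-independent AUROC.
# Synthetic imbalanced data and a simple model stand in for the real setup.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2_000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1_000).fit(X_tr, y_tr)
proba = model.predict_proba(X_te)[:, 1]  # scores for the positive class

print("F1:   ", f1_score(y_te, model.predict(X_te)))
print("AUROC:", roc_auc_score(y_te, proba))
```

If F1 is near 99% but AUROC is unremarkable, the headline number is likely an artifact of the threshold and class ratio rather than genuine discrimination.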
