
I have been looking into the imbalanced learning problem, where a classifier trained on imbalanced data is often said to be unduly biased in favour of the majority class. However, I am having difficulty identifying datasets where class imbalance is genuinely a problem and, furthermore, where it is a problem, showing that it can be fixed by re-sampling or re-weighting the data.

Can anyone give reproducible examples of real-world (not synthetic) datasets where re-sampling or re-weighting can be used to improve the accuracy (or equivalently misclassification error rate) for some particular classifier system (when applied in accordance with best practice)?
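To make concrete the sort of experiment I am asking for, here is a minimal sketch of the protocol (the function names are mine, and the placeholder arrays stand in for a real dataset, which is what I am actually looking for): train the same classifier with and without inverse-class-frequency re-weighting and compare held-out accuracy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: a real (non-synthetic) dataset should go here; this toy
# sample only illustrates the shape of the experiment, not its conclusion.
n_maj, n_min = 900, 100
X = np.vstack([rng.normal(0.0, 1.0, (n_maj, 2)),
               rng.normal(1.5, 1.0, (n_min, 2))])
y = np.concatenate([np.zeros(n_maj), np.ones(n_min)])
perm = rng.permutation(len(y))
X, y = X[perm], y[perm]
X_tr, y_tr, X_te, y_te = X[:700], y[:700], X[700:], y[700:]

def fit_logistic(X, y, sample_weight, lr=0.5, n_iter=2000):
    """Plain gradient-descent logistic regression with per-example weights."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = sample_weight * (p - y)          # weighted log-loss gradient
        w -= lr * (X.T @ g) / sample_weight.sum()
        b -= lr * g.sum() / sample_weight.sum()
    return w, b

def accuracy(w, b, X, y):
    return float(np.mean(((X @ w + b) > 0) == (y == 1)))

# Baseline: every example weighted equally.
w0, b0 = fit_logistic(X_tr, y_tr, np.ones(len(y_tr)))

# Re-weighted: inverse class frequencies, a common imbalance "fix".
freq = np.bincount(y_tr.astype(int)) / len(y_tr)
sw = 1.0 / freq[y_tr.astype(int)]
w1, b1 = fit_logistic(X_tr, y_tr, sw)

print("plain accuracy      :", accuracy(w0, b0, X_te, y_te))
print("re-weighted accuracy:", accuracy(w1, b1, X_te, y_te))
```

An acceptable example would be a real dataset on which the second number is reliably higher than the first under proper cross-validation, not just a single lucky split.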

I am only interested in accuracy as the performance measure. There are some tasks where accuracy is the quantity of interest in the application (see e.g. my answer to a related question), so I would appreciate it if there were no digressions onto the topic of proper scoring rules, or other performance measures.

It is not an example of the class imbalance problem if the operational class frequencies differ from those in the training set, or if the misclassification costs are unequal; cost-sensitive learning is a different issue.
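For the avoidance of doubt: the case where operational priors differ from training priors can already be handled analytically, without any re-sampling, by re-scaling the estimated posterior, which is one reason I treat it as a separate problem. A minimal sketch (the function name is mine):

```python
def adjust_posterior(p_train, prior_train, prior_op):
    """Rescale an estimated P(y=1|x) from training-set class priors to
    operational priors, via p_op(y|x) proportional to
    p_train(y|x) * prior_op(y) / prior_train(y)."""
    num = p_train * prior_op / prior_train
    den = num + (1.0 - p_train) * (1.0 - prior_op) / (1.0 - prior_train)
    return num / den

# A posterior of 0.5 under balanced training priors, moved to an
# operational prior of 0.1 for the positive class:
adjust_posterior(0.5, 0.5, 0.1)  # -> 0.1
```

When training and operational priors are equal the adjustment is the identity, as one would expect.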

Dikran Marsupial
  • Comments are not for extended discussion; this conversation has been moved to chat. – Sycorax Jan 05 '22 at 14:55
  • We should never use AUC as an objective function but rather use full information continuous measures such as deviance. And avoid classification at all costs, by using probability models, unless you are in an ultra-high signal:noise ratio situation such as playing games or simple pattern recognition in ML. – Frank Harrell Jan 05 '22 at 14:55
  • @Sycorax you moved far too much of it to chat. – Frank Harrell Jan 05 '22 at 14:57
  • "It is not an example of the class imbalance problem if the operational class frequencies are different to those in the training set" - What does "operational" mean here? Do you mean you are NOT interested in cases where the population and the training sample have different class frequencies? E.g. in survey sampling we may deliberately oversample minority groups when collecting the dataset, and so we often choose to re-weight our training data (to improve model accuracy among other things). It sounds like this is NOT what you have in mind. But if it is, I'd be happy to post a concrete example. – civilstat Aug 02 '22 at 01:42
  • @civilstat "operational" means "in use" (i.e. the conditions under which the classifier will be used in practice). Yes, where training set and operational conditions differ is a different problem. I am interested in problems where the classifier is supposed to have an undue bias against the minority class under the training-set conditions. Most applications of e.g. SMOTE are justified simply because there is an imbalance. I don't think that is a good justification, so I am looking for an example where these methods are demonstrably beneficial. – Dikran Marsupial Aug 02 '22 at 05:17
  • Followup over at DS.SE: https://datascience.stackexchange.com/q/121003/55122 – Ben Reiniger Apr 21 '23 at 18:15
  • Again with a modest bonus. It is interesting to see how things differ between the two communities; I know I should be more active than I am on the other SE. – Dikran Marsupial Apr 21 '23 at 18:58

0 Answers