I am trying to do classification on multi-class and multi-label data using feed-forward networks. I have mostly been using Keras.

Here is everything I have tried to address the imbalance in my text data, so far to no avail:

  1. using loss functions like binary cross-entropy, cross-entropy, and focal loss;
  2. using class weights.
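For reference, this is roughly how I computed the class weights I passed to Keras (a sketch with made-up label counts; the real data has many more labels):

```python
import numpy as np

# Made-up label counts for a 4-class problem with the kind of
# imbalance I see (roughly 0.27 : 0.12 : 0.19 : 0.009).
counts = np.array([2700, 1200, 1900, 90])

# Inverse-frequency ("balanced") weights: after weighting, every
# class contributes equally to the total loss.
weights = counts.sum() / (len(counts) * counts)
class_weight = {i: float(w) for i, w in enumerate(weights)}

# Passed to Keras as, e.g.:
# model.fit(X_train, y_train, class_weight=class_weight, ...)
```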

I am also looking into trying CNNs instead of feed-forward networks as demonstrated here.

So, apart from the loss functions and class weights, which have mostly not worked, is there something else that can be done to counter the imbalance and get a good, usable classifier on all the classes?

  • How have you diagnosed a problem with class imbalance? (Note that classifying all patterns to the majority class may be the optimal solution.) I have a question about that here: https://stats.stackexchange.com/questions/539638/how-do-you-know-that-your-classifier-is-suffering-from-class-imbalance I may put another bounty on it if there is a good answer. – Dikran Marsupial Jun 14 '22 at 11:52
  • The reason for steps like resampling is that the misclassification costs are unequal. The best thing to do is to work out what the misclassification costs are for your application and then apply them when training (i.e. minimum-risk classification). "Usable" depends on the requirements of the application, and can't really be judged without the details. – Dikran Marsupial Jun 14 '22 at 11:54
  • The way I am diagnosing the class imbalance is the same way it is mentioned in your answer to the question you have linked, that is, "when there are too few examples of the minority class to adequately characterise its statistical distribution". For multi-class and multi-label problems, I usually one-hot encode my classes, and there is sometimes a very high imbalance between the class/label that occurred the most and the one that occurred the least; it can be something like 0.27:0.12:0.19:...:0.009. I have also had problems that dealt with more than 250 labels. – Naveen Reddy Marthala Jun 15 '22 at 05:00
  • @NaveenReddyMathala "when there are too few examples of the minority class to adequately characterise its statistical distribution" means that you may have a problem with the imbalance, not that you do have a problem. In that case it will be difficult to know how much you need to resample, as you don't have enough data to estimate that. You should first work out the misclassification cost matrix and use that in training the model. – Dikran Marsupial Jun 15 '22 at 05:34
  • " it will be diifficult to know how much you need to resample as you don't have enough data to estimate that." by this, do you mean undersampling the majority classes? and by "You should first work out the misclassification cost matrix an use that in training the model." by this, how do i use the classification matrix in training the model? – Naveen Reddy Marthala Jun 15 '22 at 08:17
  • Look up "minimum risk classification" or "cost-sensitive learning" and you should find all the information you need. – Dikran Marsupial Jun 15 '22 at 09:33
  • Isn't "cost-sensitive learning" similar to class weights? – Naveen Reddy Marthala Jun 15 '22 at 10:28
  • Yes, that is normally how it is implemented, but the degree of imbalance is irrelevant to the calculation of the weights, only the misclassification costs. – Dikran Marsupial Jun 15 '22 at 11:15
  • I didn't understand "the degree of imbalance is irrelevant to the calculation of the weights, only the misclassification costs". – Naveen Reddy Marthala Jun 16 '22 at 04:41
  • When using cost-sensitive learning, the amount of imbalance in the dataset does not appear anywhere in the calculation of the optimal value of the weights; they depend only on the misclassification costs. In other words, the imbalance is not the problem; the problem is the unequal misclassification costs. If your misclassification costs are equal, it is likely that the model is already doing the right thing. – Dikran Marsupial Jun 16 '22 at 07:11
  • Thanks @DikranMarsupial – Naveen Reddy Marthala Jun 16 '22 at 10:27
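Summarising the suggestion in the comments as I understand it, here is a minimal sketch of minimum-risk classification: train the network with an ordinary loss, then make the misclassification costs part of the decision rule. The cost matrix below is entirely made up for illustration:

```python
import numpy as np

# Hypothetical cost matrix: cost[i, j] is the cost of predicting
# class j when the true class is i (zero on the diagonal).
cost = np.array([[0.0, 1.0, 1.0],
                 [5.0, 0.0, 5.0],  # missing class 1 is expensive
                 [1.0, 1.0, 0.0]])

def minimum_risk_predict(probs, cost):
    """Pick, per sample, the class with the lowest expected cost.

    probs: (n_samples, n_classes) posterior probabilities, e.g. the
    softmax output of a network trained with ordinary cross-entropy.
    """
    expected_cost = probs @ cost  # (n_samples, n_classes)
    return expected_cost.argmin(axis=1)

probs = np.array([[0.5, 0.3, 0.2]])
# Plain argmax would predict class 0 here; the cost matrix shifts
# the decision towards the expensive-to-miss class 1.
pred = minimum_risk_predict(probs, cost)
```

With equal off-diagonal costs this reduces to the ordinary argmax, which is the point made in the comments: the imbalance itself does not change the optimal decision, the costs do.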

0 Answers