Assume I have 10 classes with 100 samples each, i.e., a perfectly balanced dataset.
I want to add 3 new classes. Which of the following is the best choice for the number of samples in each newly added class?
- 100, 100, 100
- 200, 200, 200
- 1000, 1000, 1000
- 100, 1000, 1000
I am building a data analysis product for my clients, and I have to set a threshold for the minimum (or maximum) number of samples they must provide for each class they add.
I understand that the answer depends on the dataset, but as far as I know, balanced data is almost always preferable to imbalanced data.
However, I am not sure how to set the threshold for the number of samples in a newly added class when I am able to choose it.
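
To make the options concrete, here is a minimal sketch of the kind of check I have in mind. It assumes that the imbalance ratio (largest class count divided by smallest class count) is a reasonable way to measure balance, which may not hold for every dataset. The cutoff `MAX_RATIO = 2.0` is an arbitrary placeholder, not a recommended value, and `imbalance_ratio` and `check_option` are names I made up for illustration.

```python
def imbalance_ratio(class_counts):
    """Largest class size divided by smallest class size (1.0 means perfectly balanced)."""
    return max(class_counts) / min(class_counts)


# 10 existing classes with 100 samples each, as described in the question.
existing = [100] * 10

# The four candidate options for the 3 new classes.
options = {
    "100, 100, 100": [100, 100, 100],
    "200, 200, 200": [200, 200, 200],
    "1000, 1000, 1000": [1000, 1000, 1000],
    "100, 1000, 1000": [100, 1000, 1000],
}

# Hypothetical threshold: an arbitrary placeholder, not a recommended value.
MAX_RATIO = 2.0


def check_option(existing_counts, new_counts, max_ratio=MAX_RATIO):
    """Return (imbalance ratio after adding the new classes, whether it is within max_ratio)."""
    ratio = imbalance_ratio(existing_counts + new_counts)
    return ratio, ratio <= max_ratio


results = {name: check_option(existing, counts) for name, counts in options.items()}

for name, (ratio, ok) in results.items():
    print(f"{name}: imbalance ratio = {ratio:.1f}, within threshold = {ok}")
```

With this measure, the last two options score the same because both contain a class that is 10 times larger than the smallest one. What I am missing is a principled way to choose the cutoff itself.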