I have a dataset with 4 labels. What matters most to me is distinguishing label 1 from all the other labels; I don't care much about distinguishing between labels 2, 3, and 4. The proportions of data with each label are not very different from each other (so there is no evident imbalance).
I thought that merging labels 2, 3, and 4 into a single label would make classification easier, but that does not seem to be the case. The performance (at distinguishing label 1 from all the rest) with xgboost seems to be consistently better when I run multiclass classification than when I run binary classification.
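To make the comparison concrete, here is a minimal sketch of one way to score a multiclass model on the binary "label 1 vs. rest" task. It uses only the standard library, and the probabilities are made-up numbers rather than output from my data or from xgboost; the helper names (`argmax_is_positive`, `threshold_is_positive`, `binary_f1`) are hypothetical. It assumes the multiclass model returns one probability per label, with label 1 in column 0. The sketch shows that the decision rule itself can differ between the two setups: multiclass argmax picks label 1 whenever it is the most likely label, while a binary threshold of 0.5 requires it to be more likely than everything else combined.

```python
from typing import List, Sequence


def argmax_is_positive(proba_rows: Sequence[Sequence[float]],
                       positive_index: int = 0) -> List[bool]:
    """Multiclass rule: predict label 1 when its column has the highest probability."""
    return [max(range(len(row)), key=row.__getitem__) == positive_index
            for row in proba_rows]


def threshold_is_positive(proba_rows: Sequence[Sequence[float]],
                          positive_index: int = 0,
                          threshold: float = 0.5) -> List[bool]:
    """Binary-style rule: predict label 1 when P(label 1) reaches the threshold."""
    return [row[positive_index] >= threshold for row in proba_rows]


def binary_f1(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-label probabilities (columns: label 1, 2, 3, 4).
multiclass_proba = [
    [0.70, 0.10, 0.10, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.20, 0.50, 0.20, 0.10],
    [0.45, 0.05, 0.25, 0.25],
]
labels = [1, 1, 2, 3]  # hypothetical true labels
is_label_1 = [y == 1 for y in labels]

f1_argmax = binary_f1(is_label_1, argmax_is_positive(multiclass_proba))
f1_threshold = binary_f1(is_label_1, threshold_is_positive(multiclass_proba))
print(f"F1 with argmax rule:      {f1_argmax:.3f}")
print(f"F1 with 0.5 threshold:    {f1_threshold:.3f}")
```

On these toy numbers the two rules disagree on rows 2 and 4, so the same probabilities give different scores. This does not explain the gap by itself, but comparing both models with a threshold-free metric on P(label 1) would rule out the decision rule as the cause.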
I can imagine this is due to some special geometry of the data, so I wonder: does this phenomenon have a name in the literature? Is it possible to check directly for specific data features that enable such behavior of the classifier?