I'm currently working on a speech emotion recognition task, and I'm curious whether my dataset is imbalanced.
The value counts per class are roughly uniform, except for one class whose count is about a third of the mode. However, when I compute the duration of each sound file and average it per class, the mean durations come out roughly equal.

So my question is: do longer sound files provide the model with more information, in which case no class imbalance exists, or does the model perceive each instance as carrying the same amount of information as any other, in which case class imbalance does exist?
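For concreteness, here is a minimal sketch of how I compute these statistics. The `metadata.csv` file and the `path`/`label` column names are just placeholders for my actual setup:

```python
import pandas as pd
import soundfile as sf

# Placeholder metadata: one row per clip, with a file path and an emotion label.
df = pd.read_csv("metadata.csv")

# Per-class instance counts: roughly uniform, except one class at ~1/3 of the mode.
print(df["label"].value_counts())

# Clip duration in seconds, read from each file's header (no full decode needed).
df["duration"] = df["path"].map(lambda p: sf.info(p).duration)

# Mean clip duration per class: these come out roughly equal.
print(df.groupby("label")["duration"].mean())

# Total audio per class, in case imbalance is judged by duration rather than count.
print(df.groupby("label")["duration"].sum())
```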