
When is it the case that a classifier's accuracy is better with no data augmentation applied than with augmentation (i.e., trained on the original dataset rather than on the augmented one)? In particular, I'd like to know how the data is distributed in those cases.

Cork
  • How do you define "data augmentation"? – user2974951 Aug 28 '23 at 11:42
  • If you are using proportion "classified" correctly as the accuracy score, something even wilder can happen when there is natural imbalance in outcome category frequencies: you may get better "accuracy" by using no data and just classifying every observation as being in the majority category (the imbalance sketch below the comments illustrates this). It would be a good idea to study proper accuracy scoring rules. – Frank Harrell Aug 28 '23 at 11:56
  • What I meant by "data augmentation" was a set of transformations such as [horizontal flip, color jitter, affine] (a sketch of such a pipeline appears below the comments). – Cork Aug 28 '23 at 12:10
  • What would a "good" accuracy scoring rule look like? (If we cannot define the rule, at least how can we figure it out?) – Cork Aug 28 '23 at 12:17
  • For scoring rules, see: https://stats.stackexchange.com/questions/126965/choosing-among-proper-scoring-rules – kjetil b halvorsen Aug 31 '23 at 03:03
  • It's better when the original data set does not vary in the way assumed by the data augmentation: if the base images have only one orientation, then flipping will only make the classification task more complex (the flip simulation below illustrates this). – seanv507 Aug 31 '23 at 06:57
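
For concreteness, the kind of augmentation described in the comments could look like the following torchvision pipeline. This is only a sketch: the specific parameter values are illustrative assumptions, not taken from the question.

    import torchvision.transforms as T

    # Training-time augmentation: horizontal flip, color jitter, affine,
    # as listed in the comments. Parameter values are illustrative only.
    train_transforms = T.Compose([
        T.RandomHorizontalFlip(p=0.5),
        T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
        T.RandomAffine(degrees=10, translate=(0.1, 0.1)),
        T.ToTensor(),
    ])

    # "No augmentation" baseline: only the tensor conversion is kept.
    plain_transforms = T.Compose([T.ToTensor()])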
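
Here is a minimal flip simulation of the situation seanv507 describes, where the label depends on an orientation that the augmentation destroys. Everything in it (the toy 1-D "images", the class definition, and the use of scikit-learn's LogisticRegression) is a constructed example under that assumption, not taken from the question.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_data(n, width=16):
        # Toy 1-D "images": the class is defined by which half of the array
        # contains the bright pixel (class 0 = left half, class 1 = right half).
        X = rng.normal(0.0, 0.1, size=(n, width))
        y = rng.integers(0, 2, size=n)
        pos = np.where(y == 0,
                       rng.integers(0, width // 2, size=n),
                       rng.integers(width // 2, width, size=n))
        X[np.arange(n), pos] += 1.0
        return X, y

    X_train, y_train = make_data(2000)
    X_test, y_test = make_data(1000)

    # "Horizontal flip" augmentation: mirroring moves the class-defining pixel
    # to the other side, i.e. it introduces variation the real data never has.
    X_aug = np.concatenate([X_train, X_train[:, ::-1]])
    y_aug = np.concatenate([y_train, y_train])

    plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    augmented = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)

    print("accuracy without augmentation:", plain.score(X_test, y_test))       # near 1.0
    print("accuracy with flip augmentation:", augmented.score(X_test, y_test))  # near 0.5 (chance)

In this toy setup the augmented model does no better than chance while the unaugmented one is nearly perfect, matching the intuition above: augmentation helps only when the invariances it encodes actually hold in the data-generating distribution.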
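
The imbalance point from Frank Harrell's comment can also be seen with a tiny sketch; the 95/5 class split below is an assumed, illustrative value.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic binary outcome: roughly 95% of observations in class 0, 5% in class 1.
    y = rng.random(10_000) < 0.05

    # A "classifier" that ignores every feature and always predicts the majority class.
    always_majority = np.zeros_like(y, dtype=bool)

    # Accuracy is about 0.95 even though no data was used at all, which is why
    # proportion-classified-correctly is not a proper scoring rule to optimize.
    print((always_majority == y).mean())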

0 Answers