Questions tagged [class-imbalance]

Questions about classifiers or classification problems where some of the classes in the data are under-represented.

556 questions
3 votes, 2 answers

Predicting positive/negative experience with very few labels and labels from only one class

I have video viewing data (length of session, number of videos, etc.), as well as whether the user clicked on the like button. We can use the like button as confirmation that the user had a positive viewing experience; however, only 0.1% of users click on…
VincFort
2 votes, 1 answer

Why does prediction calibration on a resampling model not meet expectations?

I am doing a small project to predict the write-off probability of our defaulted customers. In the original population, the write-off rate is about 0.515. Now, for some reason I have to undersample the population and populate a new data set in which…
simohayha
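One common way to reason about the calibration question above is a prior-shift correction: if undersampling only changes the class proportions, predicted probabilities from the model trained on the resampled data can be mapped back to the original base rate. The following is a minimal sketch under that assumption, not the asker's method. The function name `correct_for_resampling` and the resampled rate of 0.3 are hypothetical; only the original rate of 0.515 comes from the question.

```python
def correct_for_resampling(p_resampled, original_rate, resampled_rate):
    """Map a probability predicted under the resampled class prior
    back to the original class prior.

    Assumes resampling was random within each class, so only the class
    proportions changed and not the feature distributions per class.
    """
    # Reweight the positive and negative evidence by the ratio of priors.
    pos = p_resampled * original_rate / resampled_rate
    neg = (1.0 - p_resampled) * (1.0 - original_rate) / (1.0 - resampled_rate)
    return pos / (pos + neg)


original_rate = 0.515   # write-off rate stated in the question
resampled_rate = 0.3    # hypothetical rate after undersampling (not in the source)

# A prediction equal to the resampled base rate maps back to the original base rate.
print(correct_for_resampling(0.3, original_rate, resampled_rate))
```

If the two rates are equal, the function returns its input unchanged, which is a quick sanity check that the correction does nothing when no resampling took place.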
0 votes, 2 answers

Imbalanced dataset - Undersampling & multiple classifiers

Let's suppose that my dataset in a classification problem looks like this:
class A: 50000 observations
class B: 2000 observations
class C: 800 observations
class D: 200 observations
These are some of the ways I have considered to deal with this…
Outcast
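To make the undersampling idea in the question above concrete, here is a minimal sketch of random undersampling on the class counts it lists. It assumes every class is cut down to the size of the smallest one, which is only one of several possible strategies and may not be what the asker considered. The helper name `undersample` is hypothetical, and only the standard library is used.

```python
import random
from collections import Counter, defaultdict


def undersample(labels, target_per_class=None, seed=0):
    """Return sorted indices of a random subset in which no class
    has more than `target_per_class` observations.

    By default the target is the size of the smallest class.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    if target_per_class is None:
        target_per_class = min(len(idxs) for idxs in by_class.values())
    kept = []
    for idxs in by_class.values():
        # Sample without replacement; small classes are kept whole.
        kept.extend(rng.sample(idxs, min(target_per_class, len(idxs))))
    return sorted(kept)


# Class counts taken from the question.
labels = ["A"] * 50000 + ["B"] * 2000 + ["C"] * 800 + ["D"] * 200
kept = undersample(labels)
print(Counter(labels[i] for i in kept))
```

Balancing to the smallest class discards most of class A here (49,800 of 50,000 rows), so a larger `target_per_class` can be passed to trade balance against data loss.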
0 votes, 3 answers

Classifying on imbalanced dataset

I have data on incidents vs. normal operation of my working environment. It is a skewed dataset. My prediction accuracy is 95%. Questions: 1. Is it common practice among data scientists to accept this prediction? 2. Do I have to rework it by resampling and balancing…
ii2
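Whether 95% accuracy is acceptable depends on the class ratio, which the question above does not state. As an illustration only, the sketch below assumes a hypothetical split of 5 incidents to 95 normal cases and shows that a model which always predicts "normal" reaches the same 95% accuracy while detecting no incidents at all.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the model actually detected."""
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return hits / sum(t == positive for t in y_true)


# Hypothetical data: 1 = incident, 0 = normal operation (ratio is an assumption).
y_true = [1] * 5 + [0] * 95
# Trivial baseline that always predicts the majority class.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

Comparing a model against this majority-class baseline, and reporting recall or precision for the rare class, gives a clearer picture than accuracy alone.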
0 votes, 0 answers

Is there a way to artificially manipulate a dataset in order to replace it with one that gives good results?

I'm trying to artificially create a dataset for purely educational reasons, but I want it to be based on one particular dataset. The problem is that this original dataset doesn't yield good predictions even with the most powerful methods (with the…