Questions about classifiers or classification problems where some of the classes in the data are under-represented.
Questions tagged [class-imbalance]
556 questions
3 votes, 2 answers
Predicting positive/negative experience with very few labels and labels from only one class
I have video viewing data (session length, number of videos, etc.), as well as whether the user clicked the like button. We can use the like button as confirmation that the user had a positive viewing experience; however, only 0.1% of users click on…
VincFort
2 votes, 1 answer
Why does prediction calibration on a resampling mode not meet the expectation?
I am doing a small project to predict the write-off probability of our defaulted customers.
In the original population, the write-off rate is about 0.515. Now, for some reason I have to undersample the population and populate a new data set in which…
simohayha
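For the calibration question above, one common reason scores from a model trained on resampled data look miscalibrated is that the class prior has changed. The sketch below applies the standard prior-shift correction to a predicted probability. It assumes the resampling was random within each class. The original rate of 0.515 comes from the excerpt; the resampled rate of 0.2 and the function name `correct_prior_shift` are hypothetical, since the excerpt is truncated before stating the new rate.

```python
def correct_prior_shift(p_resampled, rate_resampled, rate_original):
    """Map a probability predicted under the resampled class prior
    back to the original class prior.

    Assumes resampling was random within each class, so only the
    class prior changed, not the class-conditional distributions.
    """
    # Reweight the positive and negative parts of the prediction by
    # the ratio of original to resampled class priors.
    pos = p_resampled * (rate_original / rate_resampled)
    neg = (1.0 - p_resampled) * ((1.0 - rate_original) / (1.0 - rate_resampled))
    return pos / (pos + neg)


RATE_ORIGINAL = 0.515   # write-off rate stated in the question
RATE_RESAMPLED = 0.2    # hypothetical rate after undersampling (not from the source)

for p in (0.1, 0.2, 0.5, 0.9):
    corrected = correct_prior_shift(p, RATE_RESAMPLED, RATE_ORIGINAL)
    print(f"resampled-model score {p:.2f} -> corrected {corrected:.3f}")
```

A quick sanity check of the sketch: a score equal to the resampled base rate should map back to the original base rate.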
0 votes, 2 answers
Imbalanced dataset - Undersampling & multiple classifiers
Let's suppose that my dataset in a classification problem looks like this:
class A: 50000 observations
class B: 2000 observations
class C: 800 observations
class D: 200 observations
These are some ways which I considered to deal with this…
Outcast
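As an alternative to the undersampling the question above asks about, class weighting is sometimes used instead. The sketch below computes "balanced" weights for the four class counts quoted in the question, using the heuristic n_samples / (n_classes * class_count), which is the same rule scikit-learn documents for `class_weight="balanced"`. The function name `balanced_class_weights` is hypothetical, and this is not presented as what the asker or the answers propose.

```python
# Class counts quoted in the question.
class_counts = {"A": 50000, "B": 2000, "C": 800, "D": 200}


def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count),
    so that every class contributes the same total weight."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


weights = balanced_class_weights(class_counts)
for label, weight in weights.items():
    print(f"class {label}: {class_counts[label]:>6} observations, weight {weight:.4f}")
```

With these weights each class carries the same total weight (count times weight), so the rare class D is not drowned out by class A during training.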
0 votes, 3 answers
Classifying on imbalanced dataset
I have incidents vs. normal operation data from my working environment. It is a skewed dataset. My prediction accuracy is 95%.
Question:
1. Is it common practice among data scientists to accept this prediction?
2. Do I have to rework by resample and balance…
ii2
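To illustrate why 95% accuracy alone says little on a skewed dataset like the one in the question above, the sketch below scores a trivial baseline that always predicts "normal". The 5% incident rate and the 1,000-sample size are assumptions chosen for illustration; the question does not state its class ratio.

```python
# Hypothetical labels: 1 = incident, 0 = normal operation.
# The 50 / 950 split is an assumption, not taken from the question.
y_true = [1] * 50 + [0] * 950

# Trivial baseline: always predict the majority class (normal).
y_pred = [0] * len(y_true)

# Confusion-matrix counts.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)            # share of incidents that were caught
specificity = tn / (tn + fp)       # share of normal cases correctly passed
balanced_accuracy = (recall + specificity) / 2

print(f"accuracy:          {accuracy:.2f}")
print(f"incident recall:   {recall:.2f}")
print(f"balanced accuracy: {balanced_accuracy:.2f}")
```

Under these assumptions the baseline reaches 95% accuracy while detecting no incidents at all, which is why per-class metrics such as recall or balanced accuracy are usually reported alongside accuracy.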
0 votes, 0 answers
Is there a way to artificially manipulate a dataset in order to replace it with one that gives good results?
I'm trying to artificially create a dataset for purely educational reasons, but I want it to be based on one particular dataset. The problem is that this original dataset doesn't yield good predictions even with the most powerful methods (with the…