
Assume the original data contains 1000 goods and 1 bad. I build a logistic regression on it and use the model to score the bad, and I get probability = 0.00001.

Then I use oversampling/undersampling to increase/decrease the original data, so with oversampling I now have 1000 goods and 1000 bads. I build a logistic model on the resampled data and apply it to the original data, and for that same bad I get probability = 0.5.

However, this probability needs to be adjusted to reflect the original data, and after doing some math the adjusted probability is much lower than 0.5 (for example 0.00001). So what is the point of oversampling/undersampling if you are required to adjust the probability afterwards?
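To make "doing some math" concrete, here is a minimal sketch of the adjustment I have in mind, assuming it is the usual prior correction that rescales the resampled model's odds by the ratio of original to resampled class rates. The function name `adjust_probability` and its parameter names are made up for illustration, and the probabilities quoted above are illustrative rather than computed.

```python
def adjust_probability(p_sampled, original_bad_rate, sampled_bad_rate):
    """Map a probability from a model fit on resampled data back to the
    original class balance (prior correction; assumed, not confirmed)."""
    # Reweight each class by how much resampling changed its share of the data.
    bad = p_sampled * original_bad_rate / sampled_bad_rate
    good = (1.0 - p_sampled) * (1.0 - original_bad_rate) / (1.0 - sampled_bad_rate)
    return bad / (bad + good)


original_bad_rate = 1 / 1001     # 1 bad among 1000 goods + 1 bad
sampled_bad_rate = 1000 / 2000   # 1000 bads among 2000 rows after oversampling

adjusted = adjust_probability(0.5, original_bad_rate, sampled_bad_rate)
print(adjusted)  # roughly 0.001, i.e. the original bad rate
```

With these counts, a score of 0.5 from the oversampled model maps back to about 1/1001, which is the situation I am asking about.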

gyambqt
  • 61

0 Answers