I'm in charge of a scoring project and I'm getting average results with my binary classification model.
Many things could explain this, but I think one of them is the heterogeneity of the population in my training dataset. I think that giving up this global model in favour of more specific sub-models could help improve the results.
For example: if we were a retail company trying to find the best advertising channel, we could try to build one model for people shopping in stores and another one for people shopping online.
My question is: how would you justify such a sub-model approach? I have a panel of 4 or 5 variables that are regularly used for business strategy and that potentially have good discriminatory power. My idea is to use them to show that, for instance, clients who shop only online are so different from the others that they should have their own model.
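To make the idea concrete, here is a minimal sketch of the kind of comparison I have in mind: fit one global model and one model per segment, then compare them on the same held-out data. Everything in it is an assumption for illustration: the data is synthetic, `online_only` is a hypothetical segment flag, and logistic regression from scikit-learn stands in for whatever classifier is actually used.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000

# Synthetic stand-in for the real data; `online_only` is a hypothetical segment flag.
online_only = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 3))

# Assumption for illustration: the first feature acts in opposite directions
# in the two segments, which is the kind of heterogeneity a single linear
# model cannot capture.
sign = np.where(online_only == 1, 1.0, -1.0)
logits = 2.0 * sign * X[:, 0] + 0.5 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te, seg_tr, seg_te = train_test_split(
    X, y, online_only, test_size=0.3, random_state=0, stratify=online_only
)

# Global model: one classifier for everyone, with the segment flag as a feature.
global_model = LogisticRegression().fit(np.column_stack([X_tr, seg_tr]), y_tr)
global_scores = global_model.predict_proba(np.column_stack([X_te, seg_te]))[:, 1]

# Sub-models: one classifier per segment, scored on that segment's test rows.
segment_scores = np.empty(len(y_te))
for s in (0, 1):
    sub_model = LogisticRegression().fit(X_tr[seg_tr == s], y_tr[seg_tr == s])
    segment_scores[seg_te == s] = sub_model.predict_proba(X_te[seg_te == s])[:, 1]

# Compare both approaches on the same held-out rows.
auc_global = roc_auc_score(y_te, global_scores)
auc_segmented = roc_auc_score(y_te, segment_scores)
print(f"global AUC:    {auc_global:.3f}")
print(f"segmented AUC: {auc_segmented:.3f}")
```

On real data the gap would have to be checked with cross-validation and per segment, since sub-models are trained on fewer rows and a more flexible global model (for example one with interaction terms) might close the gap on its own.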
I have read about techniques such as t-SNE and the Mapper algorithm (I know the more traditional ones, of course). These techniques are well documented online, but I can hardly find implementations that fit my use case.
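As a rough sketch of how t-SNE might be applied here, assuming scikit-learn and synthetic data (the `segment` labels and the shift between the two groups are made up for illustration): embed the clients in two dimensions, then check how well the candidate segment separates in the embedding.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_per_segment = 150

# Synthetic stand-in: 5 numeric variables, with the second segment shifted.
# On real data these would be the 4 or 5 business variables.
X = np.vstack([
    rng.normal(loc=0.0, size=(n_per_segment, 5)),
    rng.normal(loc=2.0, size=(n_per_segment, 5)),
])
segment = np.repeat([0, 1], n_per_segment)  # hypothetical segment labels

# Scale first so that no single variable dominates the distances.
X_scaled = StandardScaler().fit_transform(X)

# 2-D embedding; perplexity must be smaller than the number of samples.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_scaled)

# Silhouette of the segment labels: values near 1 mean well-separated groups,
# values near 0 mean the groups overlap.
sil_embedding = silhouette_score(embedding, segment)
sil_original = silhouette_score(X_scaled, segment)
print(f"silhouette in t-SNE space:    {sil_embedding:.3f}")
print(f"silhouette in original space: {sil_original:.3f}")
```

Because t-SNE distorts distances, the score in the embedded space is only indicative, which is why the sketch also reports the silhouette in the original scaled space. A scatter plot of `embedding` coloured by `segment` would be the usual visual check.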
I also thought about
Thank you for your help!