I'm working on a dataset with some outliers in the response variable. These are genuine natural results, not measurement errors. I want to calibrate a model that can then be used to predict on populations outside the training dataset. To assess its performance, I split my dataset into training and test sets at a 0.85/0.15 ratio.
My question is whether the outliers should be removed before splitting the data, or only afterwards, from the training set alone. I want to remove them because they degrade my model's performance on what I would call common individuals.
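
To make the second option concrete, here is a minimal sketch of what "after splitting, in the training set only" could look like. The 1.5×IQR rule, the function name `split_then_filter`, the reading of 0.85 as the training fraction, and the synthetic data are all illustrative assumptions, not my actual pipeline.

```python
import random
import statistics

def split_then_filter(y, train_frac=0.85, k=1.5, seed=0):
    """Split indices first, then drop response outliers from the training part only."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    rng.shuffle(idx)
    n_train = int(round(train_frac * len(y)))
    train_idx, test_idx = idx[:n_train], idx[n_train:]

    # Outlier fences are computed from the training responses only,
    # so nothing about the test set influences the filtering.
    # (The IQR rule is an assumed, hypothetical outlier definition.)
    q1, _, q3 = statistics.quantiles([y[i] for i in train_idx], n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)

    kept_train_idx = [i for i in train_idx if lo <= y[i] <= hi]
    return kept_train_idx, test_idx

# Synthetic response: 95 "common" values plus 5 extreme ones.
data_rng = random.Random(1)
y = [data_rng.gauss(10.0, 1.0) for _ in range(95)] + [50.0, 60.0, 70.0, 80.0, 90.0]

train_idx, test_idx = split_then_filter(y)
```

In this version the test set keeps whatever outliers it happened to receive. The first option would instead apply the filter to `y` as a whole before shuffling, so the test set would contain no outliers either.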
