I have data that are non-normal and contain some outliers. I tried fitting a linear regression after removing the outliers, and a robust regression without removing them. I want to check which approach is better. How can I validate this in R?

Anthony
  • By using cross-validation? – amoeba Feb 11 '15 at 12:52
  • Please add more information to your question. What made you think some values were outliers? How did you assess normality? Is normality known not to be a feature of this kind of data? What kind of data is it? What are you expecting to find and is there a theoretical motivation for the regression model? (don't just address this in comments, add it to your question) – John Feb 11 '15 at 13:59

1 Answer

Normality is something you worry about for the residuals, not for the raw data. Search this site and you'll find several questions on that. You first need to fit a conventional model and check its residuals.
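As a rough illustration of "fit a conventional model and check the residuals", here is a minimal sketch in R. The data are simulated purely for the example (the variables `x` and `y`, the sample size and the injected outliers are all assumptions, not your data), and the checks shown are only a few common ones, not a complete diagnostic workflow.

```r
# Simulated data for illustration only; replace with your own variables.
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n)
y[1:5] <- y[1:5] + 15          # inject a few gross outliers (assumption)

fit <- lm(y ~ x)               # conventional least-squares fit
res <- residuals(fit)

# Graphical checks: residuals vs fitted values, and a normal Q-Q plot
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)

# A formal test applied to the residuals, not to y itself
sw <- shapiro.test(res)
print(sw)
```

The plots are usually more informative than the test: they show where the residuals depart from normality and which points drive it.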

The problem as you've posed it isn't solvable, because you're trying to compare fits to different sets of data. Once you remove the values you believe are outliers, you've changed the data set, and a fit to it cannot be compared with a fit to the original data set.
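To make the point concrete, here is a small sketch on simulated data (again an assumption, as is the "drop anything with |standardized residual| > 3" rule, which is just a stand-in for whatever outlier rule you used). It uses `rlm()` from the MASS package as one possible robust fit. The two models end up being fitted to different numbers of observations, so their scale estimates summarise different data.

```r
library(MASS)  # provides rlm(); MASS is distributed with R as a recommended package

# Simulated data for illustration only
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n)
y[1:5] <- y[1:5] + 15                      # a few gross outliers (assumption)
dat <- data.frame(x = x, y = y)

# Hypothetical outlier rule: drop points with |standardized residual| > 3
ols_full <- lm(y ~ x, data = dat)
keep     <- abs(rstandard(ols_full)) <= 3
ols_trim <- lm(y ~ x, data = dat[keep, ])  # OLS on the trimmed data

rob_full <- rlm(y ~ x, data = dat)         # robust fit on all the data

# The two fits are based on different sets of observations
n_trim <- length(residuals(ols_trim))
n_full <- length(residuals(rob_full))
cat("Observations used: trimmed OLS =", n_trim, ", robust =", n_full, "\n")

# Scale estimates, each computed on its own data set
cat("Residual SE (trimmed OLS):", summary(ols_trim)$sigma, "\n")
cat("Robust scale (rlm):       ", rob_full$s, "\n")
```

A smaller residual standard error for the trimmed fit says little about which model is better, since the hardest points were removed before it was computed.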

You should also search this site for "removing outliers". You'll find useful information that will guide you in whatever you're doing.

That said, there is a problem here that you can get advice on: what kind of model to fit to your data. That requires much more information than you've provided.

John