I'm rebutting a paper in which the author has done linear regression in two steps:
- Regression against a single predictor
- Regression of the residuals from the first model against a second predictor
The problem with this approach is that if the two predictors are correlated (they are), the variance they share is attributed entirely to whichever predictor is fitted first, and the second predictor only gets what is left over. The results thus depend arbitrarily on the order of the two steps.
In this case, if the steps were reversed, or if a proper multiple regression (both predictors in a single model) were performed, the author would have reached the opposite conclusion.
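To make the order dependence concrete, here is a minimal sketch on simulated data (not the paper's data). Everything in it is an assumption chosen for illustration: the names `ols` and `two_step`, the sample size, the correlation of about 0.8 between `x1` and `x2`, and the true coefficients of 1.0 on each predictor.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y ~ X, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def two_step(first, second, y):
    """Regress y on `first`, then regress the residuals on `second`.

    Returns the slope from each step, in that order.
    """
    b_first = ols(first, y)
    resid = y - (b_first[0] + b_first[1] * first)
    b_second = ols(second, resid)
    return b_first[1], b_second[1]

# Hypothetical data: two correlated predictors with equal true effects.
rng = np.random.default_rng(0)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)      # corr(x1, x2) is about 0.8
y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)

x1_first, x2_second = two_step(x1, x2, y)     # x1 fitted first
x2_first, x1_second = two_step(x2, x1, y)     # order reversed
joint = ols(np.column_stack([x1, x2]), y)     # both predictors at once

print("x1 first:  x1 =", x1_first, " x2 =", x2_second)
print("x2 first:  x1 =", x1_second, " x2 =", x2_first)
print("joint fit: x1 =", joint[1], " x2 =", joint[2])
```

Under these assumed settings the population values are 1.8 for whichever predictor is fitted first and 0.36 for the one fitted second, while the joint fit recovers 1.0 for both; the sample estimates will scatter around those figures. Swapping the order swaps which predictor looks dominant, even though the two have identical true effects.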
I've demonstrated all of this, but it would be nice to have a good citation for the problem too - it must be well known. Thanks in advance for your suggestions.