
Could you please shed some light on how to interpret linear regression results (two-stage vs. one-stage)?

For example, I have the following:

lmStage1 <- lm(y~x1)
lmStage2 <- lm(residuals(lmStage1)~x2)
summary(lmStage2)

vs.

lmAll <- lm(y~x1+x2)
summary(lmAll)

How do I interpret and compare the coefficients/t-stats, etc. of the above two models?

And how do I compare the two approaches, and what observations/diagnostics/conclusions should I draw from the two models?

In general, I feel that I am quite weak at drawing observations and building intuition from regression studies... are there books that focus on this kind of interpretation and intuition?

Thanks a lot!

Rob Hyndman
Luna
  • And moreover, if my intent was to study the correlation between x1 and x2 and its impact on y... how shall I construct my experiment? – Luna May 30 '12 at 14:43

1 Answer


Your example doesn't really match what multiple regression does. Multiple regression relates the part of a given x variable that is not associated with the other predictors in the model to the part of y that is not explained by those same other predictors. Here is the comparison, from an example I posted here...

Learning statistical concepts through data analysis exercises

# Create a reproducible model as an example.
set.seed(20120529)
x1 <- rnorm(100, 0, 1)
x2 <- rnorm(100, 0, 1)
y  <- 0.5 * x1 + 2 * x2 + rnorm(100, 0, 1)

# Regress y and x1 both on x2 and fit a model with the residuals of both
lmStage1 <- lm(y  ~ x2)
lmStage2 <- lm(x1 ~ x2)
lmStage3 <- lm(residuals(lmStage1) ~ residuals(lmStage2))

# Regress y and x2 both on x1 and fit a model with the residuals of both    
lmStage4 <- lm(y  ~ x1)
lmStage5 <- lm(x2 ~ x1)
lmStage6 <- lm(residuals(lmStage4) ~ residuals(lmStage5))

# Take a look at the coefficients from the residual model    
rbind(coefficients(lmStage3),coefficients(lmStage6) )

>       (Intercept) residuals(lmStage2)
> [1,] 1.084382e-16           0.5581899
> [2,] 4.598543e-17           2.1419997

# Compare this to the full multiple regression model
lmAll <- lm(y~x1+x2) 
coefficients(lmAll)

> (Intercept)          x1          x2 
> -0.01497859  0.55818991  2.14199970 
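The residual-on-residual coefficients match the full model because residualizing on the other predictor is exactly what multiple regression does. By contrast, the two-stage approach in the question regresses the residuals of y ~ x1 on the *raw* x2, which only agrees with the full model when x1 and x2 are uncorrelated. A quick sketch of the difference (the correlated-predictor setup here is my own illustration, not from the question):

```r
# Sketch: with correlated predictors, the question's two-stage approach
# (raw x2 against residuals of y ~ x1) does NOT recover the multiple-
# regression coefficient for x2, while the residual-on-residual version does.
set.seed(20120530)
x1 <- rnorm(100, 0, 1)
x2 <- 0.7 * x1 + rnorm(100, 0, 1)   # x2 is correlated with x1
y  <- 0.5 * x1 + 2 * x2 + rnorm(100, 0, 1)

# The question's two-stage approach: residualize y on x1, regress on raw x2
twoStage <- lm(residuals(lm(y ~ x1)) ~ x2)

# The full multiple regression
full <- lm(y ~ x1 + x2)

coefficients(twoStage)["x2"]   # generally differs from the partial effect
coefficients(full)["x2"]       # the partial effect of x2 given x1
```

With uncorrelated predictors (as in the simulation above) the two coefficients would coincide up to sampling noise; correlation between x1 and x2 is what drives them apart.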
Brett
  • Cool! I upvoted you. However, after reading your explanation, it looks like I would actually like to compare the following two: (1) lm(residuals(lm(y~x1))~x2) vs. (2) lm(residuals(lm(y~x1))~residuals(lm(y~x2))) ... Can I draw any meaningful observations from this comparison? Thank you! – Luna May 30 '12 at 14:40