
I was asked this during an interview, and I'm curious if my thinking is correct.

Fit a simple linear regression separately on each of two features, $x_1$ and $x_2$. You get two coefficients $\beta_1$ and $\beta_2$, both greater than $1$. Now fit a linear regression on both features at the same time. Can either coefficient be negative?

My intuition is that yes, the coefficient sign can flip, if $x_1$ and $x_2$ are collinear. OLS parameter estimates are unstable here since the normal equation requires inverting the Gram matrix $\mathbf{X}^{\top} \mathbf{X}$, which has the same rank as $\mathbf{X}$. (1) Am I correct and (2) if so, is my analysis thorough? Not sure if there's anything else I should consider here or a better way to explain why the coefficients can flip signs.


1 Answer


Yes, the coefficients can flip sign if the features are correlated. Arguing this mathematically is possible, but we can simply demonstrate that it happens with a simulation.
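One way to make the argument mathematical: with standardized variables, the two-predictor OLS coefficients have a closed form in terms of the pairwise correlations (writing $r_{y1} = \operatorname{cor}(y, x_1)$, $r_{y2} = \operatorname{cor}(y, x_2)$, and $r_{12} = \operatorname{cor}(x_1, x_2)$):

$$\hat\beta_2 = \frac{r_{y2} - r_{12}\,r_{y1}}{1 - r_{12}^2}.$$

So $\hat\beta_2 < 0$ exactly when $r_{12}\,r_{y1} > r_{y2}$, which is easy to arrange when $r_{12}$ is large, even though the marginal coefficient on $x_2$ alone has the sign of $r_{y2} > 0$.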


```r
set.seed(0)

# Generate correlated covariates
X = MASS::mvrnorm(100, c(0, 0), matrix(c(1, 0.99, 0.99, 1), nrow = 2))

# Use them to generate observations. Only the first column affects y
y = X %*% c(2, 0) + rnorm(100, 0, 0.4)

# Estimate 3 models: two with one variable each, one with both
m1 = lm(y ~ X[, 1])
coef(m1)
#> (Intercept)      X[, 1]
#>  0.02606534  2.03186570

m2 = lm(y ~ X[, 2])
coef(m2)
#> (Intercept)      X[, 2]
#>  0.04038971  1.96816682

m = lm(y ~ X)
coef(m)
#> (Intercept)          X1          X2
#>     0.02581     2.07047    -0.03831
```
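To connect the simulation back to the correlations, you can compute the standardized two-predictor slope for $x_2$ by hand (a quick check using the variables from the simulation above; the names `r_y1`, `r_y2`, `r_12`, and `b2_std` are ones I'm introducing here):

```r
# Pairwise correlations among y, x1, x2
r_y1 = cor(y, X[, 1])
r_y2 = cor(y, X[, 2])
r_12 = cor(X[, 1], X[, 2])

# Standardized two-predictor coefficient for x2:
# negative whenever r_12 * r_y1 > r_y2
b2_std = (r_y2 - r_12 * r_y1) / (1 - r_12^2)
b2_std
```

With $r_{12} \approx 0.99$ and the two marginal correlations nearly equal, the numerator hovers around zero and can easily go negative, which is exactly the instability you described.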