Why do the coefficients I get from `statsmodels` appear to violate the Frisch-Waugh-Lovell theorem?

Question

As I understand it, the Frisch-Waugh-Lovell theorem implies that if you regress y on x1 and x2, the coefficient on x2 is the same as the one you get by first regressing y on x1 and then regressing the residuals from that regression on x2.

But when I run the code below, the two coefficients differ. Why is this?

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Seed NumPy's generator (random.seed does not affect np.random)
np.random.seed(10)
target = np.random.normal(0, 1, 1000)
x1 = np.random.normal(0, 1, 1000) + 0.1 * target
x2 = np.random.normal(0, 1, 1000) - 0.1 * target + 0.1 * x1
df_ = pd.DataFrame({'const': 1, 'x1': x1, 'x2': x2})

# Full model: target on const, x1 and x2
full_model = sm.OLS(target, df_).fit()
print('x2 coefficient in full model')
print(full_model.params['x2'])

# Partial model: residuals of target on (const, x1), regressed on x2
resid = sm.OLS(target, df_[['const', 'x1']]).fit().resid
partial_model = sm.OLS(resid, df_[['x2']]).fit()
print('x2 coefficient in partial model')
print(partial_model.params['x2'])

This has been thoroughly explored elsewhere here on CV. See, for instance, https://stats.stackexchange.com/a/113207/919 (theoretical/geometrical) and https://stats.stackexchange.com/a/46508/919 (your approach), https://stats.stackexchange.com/questions/572623 (similar), https://stats.stackexchange.com/a/32237/919 (another example), etc. — whuber, Feb 01 '23 at 19:11
