Difference in means vs OLS regression coefficients

Question

Suppose I have a data set where each row represents a test subject. There's a dependent variable (y) and two binary columns (x1, x2).

y	x1	x2
10	0	0
12	0	1
9	1	0
13	1	1

There are four groups of people (the four possible combinations of $x_1$ and $x_2$). I want to calculate the average treatment effect of $x_1$ at each value of $x_2$. That is:

$$d_1 = E(Y|x_1=1, x_2=0) - E(Y|x_1=0, x_2=0)$$ $$d_2 = E(Y|x_1=1, x_2=1) - E(Y|x_1=0, x_2=1)$$

How does this approach compare to the following regression model? $$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \varepsilon_i$$

Is it true that $d_1 = \hat{\beta}_1$ and $d_2 = \hat{\beta}_1 + \hat{\beta}_3$?

I am asking because I tried both approaches and the equalities fail by a relatively large margin.

Although the estimates will be the same, the standard errors of estimate may differ, depending on exactly how you model "this approach." — whuber, Apr 28 '22 at 15:40

score 3 · Accepted Answer · answered Apr 28 '22 at 14:15

Remember that linear regression models a conditional expectation. Under the usual assumption that $E(\varepsilon|x_1, x_2) = 0$, it is easy to show that

$$E(Y|x_1=A, x_2=B)=\beta_0 + \beta_1 A + \beta_{2} B + \beta_3 AB$$

Then:

$$d_1 = E(Y|x_1=1, x_2=0) - E(Y|x_1=0, x_2=0) = (\beta_0 + \beta_1) - (\beta_0)=\beta_1$$ $$d_2 = E(Y|x_1=1,x_2=1) - E(Y|x_1=0, x_2=1) = (\beta_0 + \beta_1 + \beta_{2} + \beta_3)- (\beta_0 + \beta_{2})=\beta_1+\beta_3$$
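As a quick check, plug in the four rows from the question, treating each $y$ as its cell mean (an assumption for illustration, since the table shows only one row per cell):

$$\beta_0 = 10,\quad \beta_1 = 9 - 10 = -1,\quad \beta_2 = 12 - 10 = 2,\quad \beta_3 = 13 - 9 - 12 + 10 = 2$$ $$d_1 = 9 - 10 = -1 = \beta_1,\qquad d_2 = 13 - 12 = 1 = \beta_1 + \beta_3$$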

You said you tried both approaches, and they didn't match. It's hard to answer why that happened, but I'll offer a numerical example below:

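# Simulate two binary regressors and an outcome that is unrelated to them
x1 = sample(c(0,1), 100, replace = TRUE)
x2 = sample(c(0,1), 100, replace = TRUE)
y = rnorm(100)
Y_x = data.frame(y,x1,x2)
# y ~ .*. expands to y ~ x1 + x2 + x1:x2, i.e. the model with the interaction
fit = lm(y ~.*., data = Y_x)
# d1 next to the coefficient on x1
c(mean(y[x1 == 1 & x2 == 0]) - mean(y[x1 == 0 & x2 == 0]),coef(fit)[2])
#                   x1
#-0.3181978 -0.3181978 
# d2 next to the sum of the x1 and x1:x2 coefficients
c(mean(y[x1 == 1 & x2 == 1]) - mean(y[x1 == 0 & x2 == 1]),coef(fit)[2]+coef(fit)[4])
#                   x1
#0.03063305 0.03063305
# No seed is set, so the numbers change from run to run; the two entries in each pair still agree up to floating-point error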
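One possible reason for a mismatch is fitting the model without the interaction term. This is only a guess, since the question does not show the code. Here is a minimal sketch that uses the four rows from the question as the whole data set (the names `dat`, `b_full` and `b_add` are just for illustration):

```r
# The four rows from the question, treated as the full data set.
dat <- data.frame(
  y  = c(10, 12, 9, 13),
  x1 = c(0, 0, 1, 1),
  x2 = c(0, 1, 0, 1)
)

# Differences in cell means, as defined in the question.
d1 <- with(dat, mean(y[x1 == 1 & x2 == 0]) - mean(y[x1 == 0 & x2 == 0]))
d2 <- with(dat, mean(y[x1 == 1 & x2 == 1]) - mean(y[x1 == 0 & x2 == 1]))

# Model with the interaction: x1 + x2 + x1:x2.
b_full <- coef(lm(y ~ x1 * x2, data = dat))

# Model without the interaction (the hypothetical mistake).
b_add <- coef(lm(y ~ x1 + x2, data = dat))

print(c(d1 = d1, d2 = d2))
print(b_full)
print(b_add)
```

With the interaction, the `x1` coefficient equals d1 and `x1` plus `x1:x2` equals d2. Without it, the `x1` coefficient is a single pooled effect (0 for these four rows), which matches neither d1 = -1 nor d2 = 1.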
