Suppose I have a data set where each row represents a test subject. There's a dependent variable (y) and two binary columns (x1, x2).

y x1 x2
10 0 0
12 0 1
9 1 0
13 1 1

There are four groups of people (4 possible combinations of x1 and x2). I want to calculate the average treatment effect of $x_1$ for each type of $x_2$. That is:

$$d_1 = E(Y|x_1=1, x_2=0) - E(Y|x_1=0, x_2=0)$$
$$d_2 = E(Y|x_1=1, x_2=1) - E(Y|x_1=0, x_2=1)$$

How does this approach compare to the following regression model? $$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \varepsilon_i$$

Is it true that $d_1 = \hat{\beta}_1$ and $d_2 = \hat{\beta}_1 + \hat{\beta}_3$?

I ask because I tried both approaches, and the two sets of estimates differ by a relatively large margin.

    Although the estimates will be the same, the standard errors of estimate may differ, depending on exactly how you model "this approach." – whuber Apr 28 '22 at 15:40

1 Answer

Remember that linear regression models a conditional expectation. Under the assumption that the errors have zero conditional mean, $E(\varepsilon_i \mid x_{1i}, x_{2i}) = 0$, it follows directly that

$$E(Y|x_1=A, x_2=B)=\beta_0 + \beta_1 A + \beta_{2} B + \beta_3 AB$$

Then:

$$d_1 = E(Y|x_1=1, x_2=0) - E(Y|x_1=0, x_2=0) = (\beta_0 + \beta_1) - (\beta_0)=\beta_1$$
$$d_2 = E(Y|x_1=1,x_2=1) - E(Y|x_1=0, x_2=1) = (\beta_0 + \beta_1 + \beta_{2} + \beta_3)- (\beta_0 + \beta_{2})=\beta_1+\beta_3$$
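
The same identities carry over from the population parameters to the OLS estimates. With two binary regressors and their interaction, the model is saturated (four coefficients for four cells), so the fitted value in each cell is that cell's sample mean. As a sketch, write $\bar{y}_{ab}$ for the sample mean of $y$ among rows with $x_1 = a$ and $x_2 = b$ (this notation is introduced here, not taken from the question), and assume every cell contains at least one observation:

$$\hat{\beta}_0 = \bar{y}_{00}, \qquad \hat{\beta}_1 = \bar{y}_{10} - \bar{y}_{00}, \qquad \hat{\beta}_2 = \bar{y}_{01} - \bar{y}_{00}, \qquad \hat{\beta}_3 = \bar{y}_{11} - \bar{y}_{10} - \bar{y}_{01} + \bar{y}_{00}$$

$$\hat{\beta}_1 + \hat{\beta}_3 = (\bar{y}_{10} - \bar{y}_{00}) + (\bar{y}_{11} - \bar{y}_{10} - \bar{y}_{01} + \bar{y}_{00}) = \bar{y}_{11} - \bar{y}_{01}$$

So the group-mean differences and the regression coefficients should agree to floating-point precision whenever the interaction term is included.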

You said you tried both approaches and they didn't match. It's hard to say why that happened, but here is a numerical example in which both approaches agree:

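# Simulate two binary regressors and an unrelated response.
# No seed is set, so the numbers printed below will differ from run to run.
x1 = sample(c(0,1), 100, replace = TRUE)
x2 = sample(c(0,1), 100, replace = TRUE)
y = rnorm(100)

Y_x = data.frame(y,x1,x2)

# x1 * x2 expands to x1 + x2 + x1:x2 (the same model as y ~ .*. on this data frame)
fit = lm(y ~ x1 * x2, data = Y_x)

# d1: difference in group means at x2 == 0, next to the x1 coefficient
c(mean(y[x1 == 1 & x2 == 0]) - mean(y[x1 == 0 & x2 == 0]), coef(fit)["x1"])
#                    x1
#-0.3181978 -0.3181978

# d2: difference in group means at x2 == 1, next to x1 plus the interaction coefficient
c(mean(y[x1 == 1 & x2 == 1]) - mean(y[x1 == 0 & x2 == 1]), coef(fit)["x1"] + coef(fit)["x1:x2"])
#                    x1
#0.03063305 0.03063305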
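
The same check can be run on the four-row table from the question. This is only a minimal sketch: the names `dat`, `fit_int` and `fit_add` are made up for illustration. The last two lines show one possible cause of a mismatch, leaving out the interaction term. That is an assumption about what might have gone wrong, not something the question confirms.

```r
# The four-row table from the question (one observation per cell).
dat <- data.frame(
  y  = c(10, 12, 9, 13),
  x1 = c(0, 0, 1, 1),
  x2 = c(0, 1, 0, 1)
)

# Group-mean differences d1 and d2 as defined in the question.
d1 <- mean(dat$y[dat$x1 == 1 & dat$x2 == 0]) - mean(dat$y[dat$x1 == 0 & dat$x2 == 0])  # 9 - 10
d2 <- mean(dat$y[dat$x1 == 1 & dat$x2 == 1]) - mean(dat$y[dat$x1 == 0 & dat$x2 == 1])  # 13 - 12

# Model with the interaction term: x1 * x2 expands to x1 + x2 + x1:x2.
fit_int <- lm(y ~ x1 * x2, data = dat)
b <- coef(fit_int)

# Compare up to floating-point tolerance; both should print TRUE.
print(isTRUE(all.equal(d1, unname(b["x1"]))))
print(isTRUE(all.equal(d2, unname(b["x1"] + b["x1:x2"]))))

# Hypothetical cause of a mismatch: an additive model without x1:x2.
# Its x1 coefficient is a single pooled effect and need not equal d1 or d2.
fit_add <- lm(y ~ x1 + x2, data = dat)
print(coef(fit_add)["x1"])
```

With only one observation per cell, the interaction model fits the data exactly, so there are no residual degrees of freedom and `summary(fit_int)` cannot report standard errors. The coefficient comparison itself is unaffected.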

Firebug