Suppose I have a data set where each row represents a test subject. There's a dependent variable (y) and two binary columns (x1, x2).
| y | x1 | x2 |
|---|---|---|
| 10 | 0 | 0 |
| 12 | 0 | 1 |
| 9 | 1 | 0 |
| 13 | 1 | 1 |
There are four groups of people (4 possible combinations of x1 and x2). I want to calculate the average treatment effect of $x_1$ for each type of $x_2$. That is:
$$d_1 = E(Y \mid x_1=1, x_2=0) - E(Y \mid x_1=0, x_2=0)$$
$$d_2 = E(Y \mid x_1=1, x_2=1) - E(Y \mid x_1=0, x_2=1)$$
How does this approach compare to the following regression model? $$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \varepsilon_i$$
Is it true that $d_1 = \hat{\beta}_1$ and $d_2 = \hat{\beta}_1 + \hat{\beta}_3$?
I am asking because I tried both approaches on my data, and the two sets of estimates differ by a relatively large margin.
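To make the comparison concrete, here is a minimal sketch of the two calculations in Python with NumPy. It uses only the four toy rows from the table above as stand-in data (not my real data set), and the names `cell_mean`, `d1`, `d2`, and `beta` are just illustrative.

```python
import numpy as np

# Toy data: the four rows of the example table (illustrative only).
y = np.array([10.0, 12.0, 9.0, 13.0])
x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])


def cell_mean(a, b):
    """Sample mean of y among rows with x1 == a and x2 == b."""
    return y[(x1 == a) & (x2 == b)].mean()


# Approach 1: differences in group means within each level of x2.
d1 = cell_mean(1, 0) - cell_mean(0, 0)
d2 = cell_mean(1, 1) - cell_mean(0, 1)

# Approach 2: OLS on intercept, x1, x2, and the interaction x1 * x2.
X = np.column_stack([np.ones_like(y), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("d1 =", d1, " beta1 =", beta[1])
print("d2 =", d2, " beta1 + beta3 =", beta[1] + beta[3])
```

Note that this toy table has one observation per cell, so the four-parameter model fits the four points exactly. The sketch therefore only shows how I set up the two calculations; it does not reproduce the discrepancy I see.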