Is program evaluation (DiD, RD) a structural estimation?

Question

Consider program evaluation methods such as IV, Diff-in-Diff, and RD.

According to Haile (2021):

"Typically program evaluation requires more than descriptive analysis: one must counterfactually hold all else equal to learn the true effect(s) of D on Y, given X. This means treating F(Y, D, X) (or a functional of F like LATE) as the counterfactual quantity of interest and using appropriate econometric techniques (IV, diff-in-diff, RD, ...) to estimate it."

I think of "structural estimation" as yielding estimates of "deep parameters", which program evaluation studies (via DiD or RD) usually do not provide.

Do you agree with Haile's argument above that impact evaluation studies are "structural estimation" methods?

easyliving · Accepted Answer · 2022-08-16T04:08:24.897

I skimmed through the slides you linked.

I think Professor Haile is using these slides to introduce the concepts of "structural" and "reduced-form" models in a very broad sense.

However, over the past five or six decades of development in the econometrics literature, the exact meaning of the words "structural" and "reduced-form" has changed slightly, depending on which models or ways of modelling you are talking about.

And if these terms are used in an unqualified way, confusion is bound to arise.

Ok. To really understand these terms, let's first go back to the simultaneous equations models (SEM).

Of course, the idea of SEM dates back at least to Marshall. The econometric study of SEM is also very old, and dates back at least to Tinbergen. (Don't quote me on this. I am not a historian of economic thought.)

For example, price and quantity in a market can be analyzed with two equations, $P = F_P(Q, Z, e_P)$ and $Q = F_Q(P, Z, e_Q)$. Here, $Z$ collects exogenous variables determined outside the model, and $e_P$ and $e_Q$ are unobserved variables that appear in the price and quantity equations, respectively.

These two equations are called simultaneous because together they determine the equilibrium price and quantity of the particular market given $(Z, e_P, e_Q)$.

In vector notation, we write $F = (F_P, F_Q)'$ and $e = (e_P, e_Q)'$ so that we can write the SEM as $$ \left(\begin{matrix} P \\ Q \end{matrix}\right) = F(P, Q, Z, e). $$

This equation is roughly the same as the equation $$F(Y, D, X)=0$$ which is used in these slides.

To see this, let $Y=P$, $D=Q$, and $X=(Z, e)$. Strictly speaking, though, the equation $F(Y, D, X)=0$ is still more general than $F(Y, D, X)= (Y, D)'$.

Back to the price-quantity example, suppose $F_P$ and $F_Q$ are linear, that is, $P = \alpha_1 Q + \beta_1' Z + e_P $ and $Q = \alpha_2 P + \beta_2' Z + e_Q$.

In matrix notation, we have $$ \left(\begin{matrix} P \\ Q \end{matrix}\right) = \left(\begin{matrix} 0 & \alpha_1 \\ \alpha_2 & 0 \end{matrix}\right) \left(\begin{matrix} P \\ Q \end{matrix}\right) + \left(\begin{matrix} \beta_1' \\ \beta_2' \end{matrix}\right) Z + \left(\begin{matrix} e_P \\ e_Q \end{matrix}\right). $$ Write this more compactly as $$ \left(\begin{matrix} P \\ Q \end{matrix}\right) = A \left(\begin{matrix} P \\ Q \end{matrix}\right) + B Z + e, $$ where $A$ is the matrix of $\alpha$ coefficients, $B$ stacks $\beta_1'$ and $\beta_2'$, and $e = (e_P, e_Q)'$.

Suppose $I - A$ is non-singular. Then the solution to the SEM is $$ \left(\begin{matrix} P \\ Q \end{matrix}\right) = (I - A) ^ {-1} B Z + (I - A) ^ {-1} e. $$

This last equation is what is called a "reduced-form" equation in the SEM literature.

The key idea here is that the "reduced form" is the algebraic solution of the "structural form" in the context of SEM.
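To make the "algebraic solution" point concrete, here is a minimal numerical sketch. It assumes a scalar $Z$ and made-up parameter values (none of these numbers come from the slides), solves the linear SEM through $(I - A)^{-1}$, and then checks that the solution satisfies both structural equations.

```python
# Hypothetical parameter values, chosen only for illustration.
ALPHA1, ALPHA2 = 0.5, -0.8   # structural slopes; need ALPHA1 * ALPHA2 != 1
BETA1, BETA2 = 2.0, 1.0      # coefficients on a scalar Z (an assumption)


def reduced_form(z, e_p, e_q):
    """Solve the linear SEM for (P, Q) via (I - A)^{-1} (B z + e)."""
    det = 1.0 - ALPHA1 * ALPHA2          # determinant of I - A
    if det == 0.0:
        raise ValueError("I - A is singular; no unique solution")
    # For this 2x2 system, (I - A)^{-1} = (1 / det) * [[1, ALPHA1], [ALPHA2, 1]].
    u_p = BETA1 * z + e_p
    u_q = BETA2 * z + e_q
    p = (u_p + ALPHA1 * u_q) / det
    q = (ALPHA2 * u_p + u_q) / det
    return p, q


def structural_residuals(p, q, z, e_p, e_q):
    """Return how far (p, q) is from satisfying each structural equation."""
    r_p = p - (ALPHA1 * q + BETA1 * z + e_p)
    r_q = q - (ALPHA2 * p + BETA2 * z + e_q)
    return r_p, r_q


p, q = reduced_form(z=1.0, e_p=0.3, e_q=-0.2)
r_p, r_q = structural_residuals(p, q, z=1.0, e_p=0.3, e_q=-0.2)
print(p, q, r_p, r_q)  # residuals should be zero up to floating-point error
```

The reduced form expresses $(P, Q)$ purely in terms of $(Z, e)$, while the structural equations still have endogenous variables on the right-hand side.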

Applying this idea to the general case of $F(Y, D, X)=0$ requires that there exist an implicit function $G$ (whose existence may be guaranteed, at least locally, by an implicit function theorem) such that $$ \left(\begin{matrix} Y \\ D \end{matrix}\right) = G(X) = G(Z, e). $$

So $F$ here can be thought of as the "structural" equation, and $G$ the "reduced-form" equation.
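As a sanity check, the linear example fits this template. The following is only a sketch: the Jacobian condition is the usual hypothesis of the implicit function theorem, not something stated in the slides. Moving everything to one side gives $$ F(P, Q, Z, e) = (I - A) \left(\begin{matrix} P \\ Q \end{matrix}\right) - B Z - e = 0, \qquad \frac{\partial F}{\partial (P, Q)} = I - A, $$ so non-singularity of $I - A$ is exactly the condition under which $G$ exists, and in this case $$ G(Z, e) = (I - A)^{-1} (B Z + e) $$ holds globally rather than just locally.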

Fast forward to the past 30 years: researchers have wanted to give a causal interpretation to these previously purely statistical models.

And, to further complicate things, there are different ways to do this.

In a Rubin Causal Model, we write $Y = Y(1) D + Y(0) (1 - D) $ for a binary treatment variable $D \in \{ 0, 1 \}$. Here $Y(0)$ and $Y(1)$ are counterfactual outcomes: the treatment reveals one of them, and the other remains unobserved. The average treatment effect is defined in terms of these counterfactuals as $$ ATE = E(Y(1)) - E(Y(0)). $$

In Pearl's language (Pearl 2000, 2009), which reflects many statisticians' view that "no causation without manipulation" (see Holland 1986), we can define the average treatment effect by $$ ATE = E(Y|do(D=1)) - E(Y|do(D=0)).$$ Here the "do" notation emphasizes that the experimenter can freely assign treatments to individuals.
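A small simulation may help fix ideas. This is a sketch under assumptions that are mine, not the slides': treatment is randomly assigned independently of the potential outcomes, and the true effect size `TRUE_ATE` is a made-up number. Each unit has both $Y(0)$ and $Y(1)$, but only one of them is observed.

```python
import random

random.seed(0)
N = 100_000
TRUE_ATE = 2.0  # hypothetical effect size, an assumption of this sketch

treated, control = [], []
for _ in range(N):
    y0 = random.gauss(0.0, 1.0)                   # potential outcome without treatment
    y1 = y0 + TRUE_ATE + random.gauss(0.0, 1.0)   # potential outcome with treatment
    d = random.random() < 0.5                     # random assignment, independent of (y0, y1)
    y = y1 if d else y0                           # observed: Y = Y(1) D + Y(0) (1 - D)
    (treated if d else control).append(y)

# Under random assignment, the difference in observed means estimates the ATE.
ate_hat = sum(treated) / len(treated) - sum(control) / len(control)
print(ate_hat)
```

Without random assignment (for example, if units with high $Y(1)$ select into treatment), the same difference in means would generally not recover the ATE, which is why a model of how $D$ is determined matters.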

How do we blend these causal frameworks into the structural form and/or reduced form above?

Maybe one can find a variable $Z_1$ in $Z$ that appears in the structural equation for $D$ but not in the structural equation for $Y$ (this is said to be a triangular system). In this case, one can pick the value of $Z_1$ to create variation in the treatment $D$ and then observe the effect that $D$ has on the outcome $Y$.

Or we can abandon the ATE and instead focus on the LATE, as in Imbens and Angrist (1994).
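Here is a rough sketch of how such an excluded variable is used. It assumes a linear system in which $Z_1$ enters the equation for $D$ and is excluded from the equation for $Y$, with an unobserved confounder `c` that makes $D$ endogenous. All parameter values (`TAU`, `GAMMA`) are hypothetical.

```python
import random

random.seed(1)
N = 200_000
TAU = 1.5     # hypothetical causal effect of D on Y
GAMMA = 1.0   # hypothetical effect of Z1 on D


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


z1, d, y = [], [], []
for _ in range(N):
    z = random.gauss(0.0, 1.0)       # instrument, independent of the errors
    c = random.gauss(0.0, 1.0)       # unobserved confounder shared by D and Y
    v = c + random.gauss(0.0, 1.0)   # error in the D equation
    u = c + random.gauss(0.0, 1.0)   # error in the Y equation
    di = GAMMA * z + v               # Z1 enters the equation for D ...
    yi = TAU * di + u                # ... and is excluded from the equation for Y
    z1.append(z)
    d.append(di)
    y.append(yi)

ols = cov(d, y) / cov(d, d)    # biased, because D is correlated with u
iv = cov(z1, y) / cov(z1, d)   # uses only the variation in D induced by Z1
print(ols, iv)
```

The IV ratio is consistent for `TAU` here only because the exclusion restriction and the independence of $Z_1$ from the errors hold by construction. Both are modelling assumptions.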
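To see what is given up, here is a sketch of the Wald estimator with a binary instrument. The setup is assumed for illustration only: three latent types (compliers, always-takers, never-takers) with made-up shares and effects, a randomly assigned instrument, and no defiers (the monotonicity condition in Imbens and Angrist 1994).

```python
import random

random.seed(2)
N = 200_000
# Hypothetical type shares and type-specific treatment effects.
SHARES = {"complier": 0.5, "always": 0.25, "never": 0.25}
EFFECT = {"complier": 1.0, "always": 3.0, "never": 0.0}


def draw_type():
    """Draw a latent compliance type according to SHARES."""
    r = random.random()
    if r < SHARES["complier"]:
        return "complier"
    if r < SHARES["complier"] + SHARES["always"]:
        return "always"
    return "never"


# For each instrument value z: [sum of Y, sum of D, count].
totals = {0: [0.0, 0, 0], 1: [0.0, 0, 0]}
for _ in range(N):
    t = draw_type()
    z = 1 if random.random() < 0.5 else 0                 # randomly assigned instrument
    d = 1 if t == "always" or (t == "complier" and z == 1) else 0
    y = random.gauss(0.0, 1.0) + EFFECT[t] * d            # Z affects Y only through D
    totals[z][0] += y
    totals[z][1] += d
    totals[z][2] += 1

mean_y = {z: s[0] / s[2] for z, s in totals.items()}
mean_d = {z: s[1] / s[2] for z, s in totals.items()}
wald = (mean_y[1] - mean_y[0]) / (mean_d[1] - mean_d[0])  # estimates the complier effect
population_ate = sum(SHARES[t] * EFFECT[t] for t in SHARES)
print(wald, population_ate)
```

In this design the Wald ratio targets the average effect among compliers, which differs from the population ATE whenever effects vary across types. Which of the two is the object of interest is itself a statement about the underlying model.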

Chuck, this is an excellent post and also a nice review of where the terminology stems from. Your last three sentences are also quite important to answering my original question. For example, I am looking at an explicit treatment of what Haile (2021) is saying about the program evaluation studies. In his slides, I understand in the most abstract form what $F(Y,D,X)$ means and why this is a structural form. But the disconnect, at least to me, is this: how does it establish that, for example, difference-in-differences is a structural estimation method? - Frank Swanton, Aug 14 '22 at 20:25
@RichardHardy Thanks for pointing this out. I've corrected the typo. - easyliving, Aug 16 '22 at 03:46
@FrankSwanton Oops. Seems like I didn't really answer your question. ; ) I probably wrote too much and forgot the point I was trying to make. I read the "program evaluation" part in the slides again. Oh man, there is some interesting stuff there! Here I quote: "Program evaluation (indeed, any type of “causal inference”) is always a form of structural estimation. ... TT, ATE, LATE, QTE, etc. are all precisely defined only under a well specified model of how the data are being generated. Any suggestion that these objects are “model free” is nonsense." - easyliving, Aug 16 '22 at 03:56
@FrankSwanton ... Quote again from the slides: “I literally died laughing when I heard what Josh Angrist said about empirical IO!” This is gold, man! LOL I guess what is happening here is that, as an econometrician, Professor Haile is trying to point out that all inference (whether causal or not) is model-based in the ultimate analysis. And he doesn't seem convinced by the labor guys' touting of the superiority of their so-called "reduced-form methods" in pinning down causal effects. - easyliving, Aug 16 '22 at 04:05
