Suppose that $S_i$ is continuously distributed, not necessarily non-negative. The conditional expectation function of interest, $h(t):=E[Y_i|S_i=t]$, has derivative $h'(t)$.

Equation 3.3.8 of Mostly Harmless Econometrics is:

$$\frac{E[Y_i(S_i- E[S_i])]}{E[S_i(S_i-E[S_i])]} = \frac{\int h'(t)\mu_t dt}{\int \mu_t dt} $$ where $\mu_t :=[E[S_i|S_i\ge t]-E[S_i |S_i<t]][P(S_i\ge t)[1-P(S_i\ge t)]]$ and the integrals run over the support of $S_i$.

That equation is not obviously true to me and I am looking for a proof.

Michael Gmeiner
  • You can find a discrete version of the derivation in Appendix A2 of Angrist and Krueger (1999), "Empirical Strategies in Labor Economics", in the Handbook of Labor Economics (Ashenfelter and Card), volume 3. – tdm Apr 21 '22 at 12:33
  • The "+" sign should be "-" in the inner integral – laozhang Oct 25 '23 at 03:58

1 Answer

The appendix to that section in Mostly Harmless, section 3.5, has a derivation.

Start with the numerator. By the law of iterated expectations,

$$Cov(Y_i,S_i) = E[Y_i(S_i-E[S_i])] = E[h(S_i)(S_i-E[S_i])]$$

Let $k_{-\infty}=\lim_{t\rightarrow -\infty} h(t)$. By the fundamental theorem of calculus,

$$h(S_i)=k_{-\infty} +\int_{-\infty}^{S_i} h'(t)dt $$

Since $E[S_i-E[S_i]]=0$, the constant $k_{-\infty}$ drops out. Thus, $$ E[h(S_i)(S_i-E[S_i])] = \int_{-\infty}^\infty\int_{-\infty}^{s} h'(t) (s-E[S_i])g(s)dtds$$

where $g(s)$ is the density of $S_i$ at $s$. Apply Fubini's theorem to switch the order of integration: $$ E[h(S_i)(S_i-E[S_i])] = \int_{-\infty}^\infty h'(t)\int_{t}^{\infty} (s-E[S_i])g(s)dsdt$$

The inner integral is $E[S_i|S_i \ge t]Pr(S_i\ge t)-E[S_i]Pr(S_i \ge t) $

$= E[S_i|S_i \ge t]Pr(S_i\ge t)-(E[S_i|S_i \ge t]Pr(S_i \ge t) + E[S_i|S_i < t]Pr(S_i < t))Pr(S_i \ge t) $

$=(E[S_i |S_i \ge t]-E[S_i |S_i <t])Pr(S_i \ge t)(1-Pr(S_i \ge t)) =: \mu_t$

where the second line expands $E[S_i]$ by the law of total expectation and the last line uses $Pr(S_i<t)=1-Pr(S_i\ge t)$.
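As a sanity check on the weight, here is a worked case that is not in the book: assume, purely for illustration, that $S_i$ is uniform on $[0,1]$. Then

$$E[S_i|S_i\ge t]=\frac{1+t}{2},\qquad E[S_i|S_i<t]=\frac{t}{2},\qquad Pr(S_i\ge t)=1-t,$$

so

$$\mu_t=\left(\frac{1+t}{2}-\frac{t}{2}\right)(1-t)\,t=\frac{t(1-t)}{2},\qquad \int_0^1\mu_t dt=\frac{1}{12}=Var(S_i).$$

Computing the inner integral directly gives the same thing: $\int_t^1 (s-\tfrac12)ds=\tfrac{t(1-t)}{2}$. In this example the weight is non-negative, vanishes at both ends of the support, and peaks at the median $t=1/2$.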

Then, setting $Y_i = S_i$, so that $h(t)=t$ and $h'(t)=1$, the same steps give the denominator: $E[S_i(S_i-E[S_i])]=\int \mu_t dt$.
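If you want to convince yourself numerically, here is a rough simulation sketch. Everything specific in it is my own assumption for illustration: $S_i$ standard normal, $h(t)=\tanh(t)$, additive noise with standard deviation 0.5, and the helper names `mu_weights` and `trapezoid`. It estimates $\mu_t$ from the sample on a grid and compares the regression slope with the $\mu_t$-weighted average of $h'(t)$.

```python
import numpy as np

def mu_weights(s, grid):
    """Empirical mu_t = (E[S|S>=t] - E[S|S<t]) * P(S>=t) * (1 - P(S>=t))."""
    s_sorted = np.sort(s)
    n = s_sorted.size
    csum = np.concatenate(([0.0], np.cumsum(s_sorted)))
    below = np.searchsorted(s_sorted, grid, side="left")  # number of S < t
    above = n - below                                     # number of S >= t
    mu = np.zeros_like(grid, dtype=float)
    ok = (below > 0) & (above > 0)  # both conditional means are defined
    mean_below = csum[below[ok]] / below[ok]
    mean_above = (csum[n] - csum[below[ok]]) / above[ok]
    p = above[ok] / n
    mu[ok] = (mean_above - mean_below) * p * (1.0 - p)
    return mu

def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

rng = np.random.default_rng(0)
n = 200_000
s = rng.normal(size=n)                       # assumed distribution of S_i
h = np.tanh                                  # assumed CEF h(t)
h_prime = lambda t: 1.0 - np.tanh(t) ** 2    # its derivative h'(t)
y = h(s) + rng.normal(scale=0.5, size=n)     # Y_i = h(S_i) + noise

# Left-hand side: the regression slope Cov(Y, S) / Var(S).
slope = np.mean(y * (s - s.mean())) / np.mean(s * (s - s.mean()))

# Right-hand side: mu_t-weighted average of h'(t) over the sample support.
grid = np.linspace(s.min(), s.max(), 4001)
mu = mu_weights(s, grid)
weighted = trapezoid(h_prime(grid) * mu, grid) / trapezoid(mu, grid)

print("regression slope:       ", slope)
print("weighted avg derivative:", weighted)
print("integral of mu_t vs Var:", trapezoid(mu, grid), s.var())
```

The first two printed numbers should agree up to simulation noise, and the integral of $\mu_t$ should be close to the sample variance of $S_i$, which is the denominator result.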

Michael Gmeiner