I'm reading Counterfactuals and Causal Inference by Morgan and Winship. In chapter 6, they discuss OLS as a means of estimating the average treatment effect (ATE) of a binary exposure $D$, assuming the conditions required for this are met. They describe a scenario in which the population can be perfectly stratified by a variable $S \in \left\{ 1, 2, 3 \right\}$, and in which $S$ is sufficient to block all backdoor paths. Were one to regress the outcome on $D$ and a dummy-coded $S$, using $S=1$ as the reference group, the OLS coefficient on $D$ would be equal to

$$ \delta_{OLS} = \dfrac{1}{c}\sum_s \operatorname{Var}[d_i\mid s_i=s] \operatorname{Pr}[s_i=s] \left\{ E[y_i \mid d_i=1, s_i=s] - E[y_i \mid d_i=0, s_i=s] \right\} $$

where $c$ is a scaling constant equal to the sum of the weights, $c = \sum_s \operatorname{Var}[d_i\mid s_i=s] \operatorname{Pr}[s_i=s]$, so that the weights sum to one.

This is fairly surprising to me. I was under the impression that the OLS coefficient would estimate the ATE by averaging over the distribution of $S$ (i.e. the weights would simply be $\operatorname{Pr}[s_i=s]$), but it seems that OLS gives more weight to strata in which the propensity score is closer to 0.5. In the words of the authors, "[OLS] can yield estimates that are far from the true ATE even in an infinite sample".
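To make the discrepancy concrete, here is a minimal numerical sketch, not taken from the book. The stratum sizes, treated counts, baselines `alpha`, and stratum-specific effects `tau` are all made-up values, and the outcome is noiseless so that the cell means are exact. It compares the OLS coefficient on $D$ with the variance-weighted formula above and with the $\operatorname{Pr}[s_i=s]$-weighted ATE.

```python
import numpy as np

# Hypothetical population (all numbers are illustrative assumptions).
n_s = np.array([400, 400, 200])        # units per stratum S = 1, 2, 3
n_treated = np.array([40, 200, 180])   # treated units per stratum
alpha = np.array([0.0, 1.0, 2.0])      # baseline outcome per stratum
tau = np.array([1.0, 2.0, 5.0])        # heterogeneous treatment effects

# Build unit-level data; strata are indexed 0, 1, 2 in the code.
s = np.repeat(np.arange(3), n_s)
d = np.concatenate(
    [np.r_[np.ones(k), np.zeros(n - k)] for n, k in zip(n_s, n_treated)]
)
y = alpha[s] + tau[s] * d              # noiseless, so cell means are exact

# OLS of y on an intercept, D, and dummies for S = 2 and S = 3.
X = np.column_stack([np.ones(len(y)), d, s == 1, s == 2]).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_ols = beta[1]

# Conditional-variance-weighted average of the stratum effects.
pr_s = n_s / n_s.sum()                 # Pr[s_i = s]
p = n_treated / n_s                    # Pr[d_i = 1 | s_i = s]
w = p * (1 - p) * pr_s                 # Var[d_i | s_i = s] * Pr[s_i = s]
delta_formula = np.sum(w * tau) / np.sum(w)

# ATE: stratum effects weighted by Pr[s_i = s] only.
ate = np.sum(pr_s * tau)

print(delta_ols, delta_formula, ate)
```

In this setup the first two printed numbers agree up to floating-point error, while the third differs because the stratum with the largest effect has a propensity score far from 0.5 and is therefore down-weighted by OLS.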

Why does OLS perform conditional variance weighting like this? Can someone demonstrate why this is a consequence of the typical OLS setup?

Noah
  • I should mention estimates can differ only if there is treatment effect heterogeneity – Demetri Pananos Feb 25 '24 at 17:27
  • I highly recommend Chattopadhyay & Zubizarreta (2023). It doesn't directly answer the question, but it provides the precise covariate profile the OLS estimator generalizes to and how to get OLS to target the sample distribution of the covariates. – Noah Feb 25 '24 at 18:31
  • This recent answer derives OLS as a precision-weighted stratified estimator, which might also be helpful: https://stats.stackexchange.com/questions/638058/example-showing-equivalence-between-stratification-and-regression/638072#638072 – Thomas Lumley Feb 26 '24 at 00:43
