I am testing the movement of a whole sample after an event by using generalized difference-in-differences (DiD) with staggered event dates. The whole sample satisfies the parallel trends assumption.

I want to split the whole sample into two parts to see how the movement differs between them (still using DiD for each part). Must I test for parallel trends within each subgroup?

Thomas Bilach
  • Technically, yes. What do your “subsamples” represent? Is it the early-/late-adopters? Is it small/big firms? Is it European countries/non-European countries? – Thomas Bilach Sep 07 '21 at 17:20
  • @Thomas, I am working on an international sample, so I subsample by continent – Phil Nguyen Sep 07 '21 at 21:20

1 Answer

It's acceptable to run separate difference-in-differences (DiD) equations on subgroups of your data and obtain the Average Treatment Effect on the Treated (ATT) within each subgroup. I imagine you want to estimate separate ATTs using the 'partitioned' data and make causal statements about the effect of your international law/policy. If so, and you're using DiD methods, then a subgroup analysis does not exempt you from demonstrating parallel paths before the event within each subgroup.

Evaluators may partition their data into subgroups for many reasons. In a DiD context, we may suspect unit-specific treatment effects. The effect of some international policy may vary widely by, say, geography. A single DiD coefficient that averages across all unit-specific treatment effects may obscure a strong effect endemic to a particular region (e.g., state/province/continent). Estimating separate models, by region, does not do away with the identifying assumptions of a traditional DiD analysis.
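To make this concrete, here is a minimal sketch in Python (standard library only) of a pooled sample that passes a placebo pre-trend check while each region fails it. Everything in it is an assumption for illustration: the data are simulated, the names `did_2x2`, `placebo_did`, and `simulate_region` are hypothetical, and the placebo check (pretending the event happened one period early, using only pre-event data) is just one simple way to probe pre-trends.

```python
import random
from statistics import fmean

def did_2x2(rows, post_key="post"):
    """(treated post - treated pre) - (control post - control pre), from cell means."""
    def cell_mean(t, p):
        return fmean(r["y"] for r in rows if r["treated"] == t and r[post_key] == p)
    return (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

def placebo_did(rows, event_period):
    """Pre-trend check: keep only pre-event periods and pretend the event
    happened in the last of them. Near zero if pre-trends are parallel."""
    pre_rows = [dict(r, fake_post=int(r["period"] == event_period - 1))
                for r in rows if r["period"] < event_period]
    return did_2x2(pre_rows, post_key="fake_post")

def simulate_region(extra_treated_slope, effect=2.0, n_units=200, seed=0):
    """Simulated region: periods 0..5, event at period 3 (all values assumed).
    extra_treated_slope != 0 violates parallel trends inside the region."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        treated = int(unit < n_units // 2)
        for period in range(6):
            post = int(period >= 3)
            y = (1.0 * period                              # common trend
                 + extra_treated_slope * treated * period  # region-specific drift
                 + effect * treated * post                 # true effect
                 + rng.gauss(0.0, 0.1))
            rows.append({"y": y, "treated": treated, "period": period, "post": post})
    return rows

# Opposite drifts in two equally sized regions cancel in the pooled sample.
region_a = simulate_region(+0.5, seed=1)
region_b = simulate_region(-0.5, seed=2)
samples = {"pooled": region_a + region_b, "region_a": region_a, "region_b": region_b}

results = {}
for name, rows in samples.items():
    results[name] = {"placebo": placebo_did(rows, event_period=3),
                     "did": did_2x2(rows)}
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

In this contrived setup the true effect is 2 in both regions. The pooled placebo estimate is close to zero and the pooled DiD is close to 2, but each region's placebo estimate is far from zero and its DiD is biased, which is why the check has to be repeated within each subgroup.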

Another reason to work with subgroups is the timing of treatment. You may want to partition your data into different 'timing groups' reflecting the observed onset of treatment. Think of this as estimating separate ATTs across cohorts treated at different points in time. This is one way to avoid the 'negative weighting' problem that arises when a policy is rolled out in waves. For example, say you observe early-adopter, late-adopter, and never-adopter countries. You could estimate separate ATTs using early- versus never-adopter (or not-yet-adopter) countries and late- versus never-adopter countries, ignoring the comparisons where "already treated" countries serve as counterfactuals for the "soon-to-be-treated" countries. One reason to ignore that last comparison is that the "already treated" countries (i.e., early adopters) may respond dynamically to a policy change, potentially breaking their parallel path with a cohort treated downstream. Note that the 'generalized' DiD estimator assumes a parallel outcome trajectory across all the cohort-specific comparisons. Callaway and Sant'Anna (2020) do this sort of thing: they aggregate the different parameters of interest to form summary measures of causal effects. I encourage you to review this working paper for more on this topic.
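The cohort-by-cohort comparisons can be sketched as follows, again in standard-library Python on simulated data. This is not Callaway and Sant'Anna's estimator, only a simplified illustration of the "clean comparison" idea. The cohort names, event periods, effect sizes, and the helpers `simulate_cohort` and `cohort_did` are all assumptions made up for the example.

```python
import random
from statistics import fmean

def simulate_cohort(name, event_period, effect_at, rng, n_units=100, n_periods=6):
    """Simulated cohort; event_period=None means never treated.
    effect_at(k) is the assumed treatment effect k periods after the event."""
    rows = []
    for _ in range(n_units):
        for period in range(n_periods):
            on = event_period is not None and period >= event_period
            effect = effect_at(period - event_period) if on else 0.0
            y = 1.0 * period + effect + rng.gauss(0.0, 0.1)  # common trend + effect
            rows.append({"cohort": name, "period": period, "y": y})
    return rows

def cohort_did(rows, treated, control, event_period, first_period=0):
    """DiD of cell means for one treated cohort against one comparison cohort,
    using periods from first_period onward."""
    def change(cohort):
        pre = fmean(r["y"] for r in rows if r["cohort"] == cohort
                    and first_period <= r["period"] < event_period)
        post = fmean(r["y"] for r in rows if r["cohort"] == cohort
                     and r["period"] >= event_period)
        return post - pre
    return change(treated) - change(control)

rng = random.Random(0)
panel = (simulate_cohort("early", 2, lambda k: 1.0 + 0.5 * k, rng)  # effect grows over time
         + simulate_cohort("late", 4, lambda k: 3.0, rng)           # constant effect
         + simulate_cohort("never", None, lambda k: 0.0, rng))

# Clean comparisons: each treated cohort against never-adopters.
att_early = cohort_did(panel, "early", "never", event_period=2)
att_late = cohort_did(panel, "late", "never", event_period=4)
# Cohorts are equally sized here, so a size-weighted summary is a simple average.
att_summary = (att_early + att_late) / 2

# The comparison to ignore: already-treated early adopters as controls
# for late adopters, over periods 2..5.
att_late_vs_early = cohort_did(panel, "late", "early", event_period=4, first_period=2)

print(round(att_early, 2), round(att_late, 2),
      round(att_summary, 2), round(att_late_vs_early, 2))
```

With these assumed effects, the late-versus-never comparison recovers an effect near 3, while using early adopters as the control group pulls the estimate down toward 2, because the early adopters' own effect is still growing during the comparison window.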

In short, homing in on a particular region (e.g., continent) or timing group (i.e., early-/late-adopter cohort) doesn't relieve you of the parallel trends assumption or of the need to check pre-trends within that group. There's no free lunch.

Thomas Bilach