How does the interpretation of the coefficient on a dummy variable change in settings where the indicator is switching 'on' and 'off' repeatedly over time? Suppose I have complaint data on hotels $i$ over days $t$. Some of the hotels employ guards, while others do not. Interestingly, security is only present in and around the treated facilities two days per week. The other hotels have no coverage at all.

The model takes the following form:

$$ \log(y_{it}) = \beta d_{it} + \gamma_i + \lambda_t + u_{it}, $$

where $d_{it}$ equals 1 only for hotels that are assigned guards, and only on the days when the guards are actually surveilling the location; it equals 0 otherwise. The parameters $\gamma_i$ and $\lambda_t$ represent hotel and day fixed effects, respectively. This framework mirrors the more general difference-in-differences estimator.

Here is what I know about the interpretation of a dummy variable when the outcome is log-transformed:

  • If $d_{it}$ switches from 0 to 1, the % impact of $d_{it}$ on the outcome is: $100 \times (\exp(\hat{\beta})-1)$
  • If $d_{it}$ switches from 1 to 0, the % impact of $d_{it}$ on the outcome is: $100 \times (\exp(-\hat{\beta})-1)$
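As a short sketch of where these two expressions come from (assuming the fixed effects and the error term are held fixed while only $d_{it}$ changes, and writing $y_{it}(1)$ and $y_{it}(0)$ for the outcome with and without coverage, notation introduced here only for illustration):

$$ \log y_{it}(1) - \log y_{it}(0) = \beta \quad\Longrightarrow\quad \frac{y_{it}(1)}{y_{it}(0)} = \exp(\beta), \qquad \frac{y_{it}(0)}{y_{it}(1)} = \exp(-\beta). $$

The two ratios multiply to 1, so the 'on' and 'off' percentages describe the same log difference measured from different baselines.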

But the intervention dummy is switching back and forth somewhat arbitrarily. Hotels receive this 'on' and 'off' coverage over the entire year. If this helps, see the abridged data frame below (only hotel 2 receives intermittent coverage):

$$ \begin{array}{ccc} hotel & day & d_{it} \\ \hline 1 & 1 & 0 \\ 1 & 2 & 0 \\ 1 & 3 & 0 \\ 1 & 4 & 0 \\ 1 & 5 & 0 \\ 1 & 6 & 0 \\ 1 & 7 & 0 \\ \hline 2 & 1 & 0 \\ 2 & 2 & 0 \\ 2 & 3 & 0 \\ 2 & 4 & 1 \\ 2 & 5 & 0 \\ 2 & 6 & 0 \\ 2 & 7 & 1 \\ \end{array} $$
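To make the setup concrete, here is a minimal sketch in plain Python of the two-way fixed effects model on the toy table above. The values of `beta`, `gamma`, and `lam` are hypothetical and chosen only for illustration, and the outcomes are generated without noise, so this only shows the mechanics of the within transformation on a balanced panel, not how the estimator behaves on real complaint data.

```python
import math

# Treatment indicator from the abridged table: rows are hotels, columns are days 1..7.
d = [
    [0, 0, 0, 0, 0, 0, 0],  # hotel 1: never guarded
    [0, 0, 0, 1, 0, 0, 1],  # hotel 2: guarded on days 4 and 7
]

# Hypothetical parameters (assumptions, not estimates from any data).
beta = -0.25                       # effect of coverage on log complaints
gamma = [1.0, 1.5]                 # hotel fixed effects
lam = [0.1 * t for t in range(7)]  # day fixed effects

# Noise-free log outcomes generated from the model in the question.
log_y = [[beta * d[i][t] + gamma[i] + lam[t] for t in range(7)] for i in range(2)]


def double_demean(x):
    """Subtract row and column means and add back the grand mean (balanced panel)."""
    n, T = len(x), len(x[0])
    row = [sum(r) / T for r in x]
    col = [sum(x[i][t] for i in range(n)) / n for t in range(T)]
    grand = sum(row) / n
    return [[x[i][t] - row[i] - col[t] + grand for t in range(T)] for i in range(n)]


d_w = double_demean(d)
y_w = double_demean(log_y)

# OLS slope of demeaned log(y) on demeaned d.
num = sum(a * b for rd, ry in zip(d_w, y_w) for a, b in zip(rd, ry))
den = sum(a * a for rd in d_w for a in rd)
beta_hat = num / den

print(beta_hat)                          # recovers the hypothetical beta
print(100 * (math.exp(beta_hat) - 1))    # implied percent change when d goes 0 -> 1
```

Because there is no noise, both 'on' days contribute the same effect here; with real data the coefficient pools every switch, which is what motivates the question below.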

This estimator is often described as averaging all the two-by-two difference-in-differences estimators. It is my understanding that the estimate of $\beta$ is the average of all the sub-estimators when there is a change from 0 to 1.

Question: Does the "percentage" interpretation need to account for hotels moving in and out of the treated condition repeatedly? As shown, hotels 'switch out' of the treated condition multiple times.

Thomas Bilach

1 Answer

This is a complicated question. I will side-step how well DID works in a setting with both staggered/differential timing and transient/spikey/pulsing/non-absorbing/reversible treatments. There is a nice summary of this growing literature here. The short summary is that the weights that you mention can get really weird and even turn negative.

Another relevant paper is Imai, Kim, and Wang (2018), where democracy and war are the treatments.

Now for your main question. I think the reason you have asymmetry here is that percent changes are not symmetric. Say $\hat \beta_D = 0.6443$. That means when D goes from 0 to 1 you would expect $$\exp(0.6443)-1= .9047,$$ which is about $+90\%$. So if the baseline was 50 incidents per month, you would now expect about 95. Now suppose D goes from 1 to 0. Then $$95 \cdot (\exp(-0.6443)-1)=95 \cdot (-.47497) \approx -45,$$ so that gets us back down to 50 from 95. The percent change differs in magnitude from before because the baseline is now higher.
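The round trip can be checked numerically. This is just a sketch of the arithmetic above, using the same coefficient and the same baseline of 50; the variable names are only for illustration.

```python
import math

beta_hat = 0.6443   # coefficient from the example above
baseline = 50.0     # incidents when D = 0

pct_on = 100 * (math.exp(beta_hat) - 1)    # percent change when D goes 0 -> 1
pct_off = 100 * (math.exp(-beta_hat) - 1)  # percent change when D goes 1 -> 0

treated = baseline * (1 + pct_on / 100)    # level after switching on
restored = treated * (1 + pct_off / 100)   # level after switching back off

print(round(pct_on, 1), round(pct_off, 1))
print(round(treated, 1), round(restored, 1))
```

The two percentages differ in size, yet applying them in sequence returns the original level, since $\exp(\hat\beta_D)\cdot\exp(-\hat\beta_D)=1$.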

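You can also reduce small sample bias by calculating $$\exp (\hat \beta_D - \frac{1}{2}\hat \sigma^2_{\hat \beta_D})-1,$$ where $\hat \sigma^2_{\hat \beta_D}$ is the estimated variance of $\hat \beta_D$ (its squared standard error).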
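A minimal sketch of that adjustment, assuming a hypothetical standard error of 0.20 for $\hat \beta_D$ (the example above does not report one) and a helper name, `bias_adjusted_pct`, made up for this illustration:

```python
import math


def bias_adjusted_pct(beta_hat: float, se: float) -> float:
    """Percent effect using exp(beta_hat - 0.5 * se^2) - 1."""
    return 100 * (math.exp(beta_hat - 0.5 * se ** 2) - 1)


beta_hat = 0.6443  # coefficient from the example above
se = 0.20          # hypothetical standard error, for illustration only

naive = 100 * (math.exp(beta_hat) - 1)
adjusted = bias_adjusted_pct(beta_hat, se)

print(round(naive, 1), round(adjusted, 1))
```

The adjusted figure is always at or below the naive one, and the gap shrinks as the standard error gets smaller.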

dimitriy
  • The paper by Imai and colleagues (2018) is a great read and actually inspired this question. So is it enough to interpret $\hat{\beta}$ as returning the average effect of guardianship on complaints? This is assuming I’m only interested in the days when treatment ‘switches on’ over time, which happens repeatedly. – Thomas Bilach May 07 '21 at 03:45
  • Or, should the $d_{it}$ column be something like “guarded” versus “unguarded” and then I could adjust the reference category? Is this a way to get one estimate of switching into treatment and another for switching out of treatment? – Thomas Bilach May 07 '21 at 05:33
  • It’s a weighted average, I think, with possibly very odd weights. The other suggestion sounds like assuming an absorbing treatment with dynamic effects, which is a reasonable way to model, but still has the same issues. – dimitriy May 07 '21 at 05:37
  • Thank you. And lastly, in Section E on page 49 of Imai's work, they explain the coding of what I believe are two policy dummies. One estimates the effect of democratization while the other estimates the effect of authoritarian reversal. Is it, in fact, two policy dummies? And if so, how would they be coded? Using the makeshift data frame I provided would be helpful as I don't think the paper explained the coding of the dummies very well. I know this requires a thorough read of the paper so if you can't provide an adequate response here I understand. – Thomas Bilach May 13 '21 at 02:48
  • This is explained better in the Appendix of the Acemoglu et al paper that Imai et al are replicating. See pages A9-A10 and eq (2). There are two variables, democratization and reversal. Both start at zero, and then increment by one each time there is an event of that type. In your case, hotel 1 would have zeros for both in every period. For hotel 2, D={0,0,0,1,1,1,1} and R={0,0,0,0,0,0,0,1}. – dimitriy May 14 '21 at 04:32
  • The example helps a lot. Shouldn't R (i.e., reversal) 'turn on' in the off periods as well? In my fake example, isn't a reversal the removal of guardianship (i.e., equal to 1 in days 5 and 6)? I actually coded it in the same way you did, but I don't understand why the reversal is coded 1 by the second event, when it is, technically, the second time the intervention is in place. Is my intuition valid? – Thomas Bilach May 15 '21 at 01:02
  • You are absolutely right: R={0,0,0,0,0,1,1,1}. I think I did it as dR instead of R for some inexplicable reason. – dimitriy May 15 '21 at 02:00