I want to estimate a staggered difference-in-difference (DD) with continuous treatment. The data looks something like this:

$$ \begin{array}{ccc} Individual & year & CT_{i,t} \\ \hline 1 & 2000 & 0 \\ 1 & 2001 & 0 \\ 1 & 2002 & 0 \\ 1 & 2003 & 0 \\ 1 & 2004 & 0.3 \\ 1 & 2005 & 0.4 \\ 1 & 2006 & 0.42 \\ 1 & 2007 & 0.2 \\ 1 & 2008 & 0 \\ 1 & 2009 & 0 \\ \hline 2 & 2000 & 0 \\ 2 & 2001 & 0 \\ 2 & 2002 & 0 \\ 2 & 2003 & 0 \\ 2 & 2004 & 0 \\ 2 & 2005 & 0 \\ 2 & 2006 & 0 \\ 2 & 2007 & 0 \\ 2 & 2008 & 0 \\ 2 & 2009 & 0 \\ \hline 3 & 2000 & 0.1 \\ 3 & 2001 & 0.1 \\ 3 & 2002 & 0.1 \\ 3 & 2003 & 0.5 \\ 3 & 2004 & 0.6 \\ 3 & 2005 & 0.4 \\ 3 & 2006 & 0.2 \\ 3 & 2007 & 0.1 \\ 3 & 2008 & 0.3 \\ 3 & 2009 & 0.1 \\ \hline 4 & 2000 & 0.3 \\ 4 & 2001 & 0.2 \\ 4 & 2002 & 0.4 \\ 4 & 2003 & 0.2 \\ 4 & 2004 & 0.3 \\ 4 & 2005 & 0.5 \\ 4 & 2006 & 0.1 \\ 4 & 2007 & 0.12 \\ 4 & 2008 & 0.13 \\ 4 & 2009 & 0.14 \\ \hline \end{array} $$

The generalized DD equation can be specified as follows:

$$ y_{i,t} = \gamma_i + \lambda_t + \delta CT_{i,t} + \epsilon_{i,t} \tag{1} $$

where $i$ indexes individuals and $t$ indexes years. $\gamma_i$ are individual fixed effects, and $\lambda_t$ are year fixed effects. $CT_{i,t}$ is a continuous treatment variable that measures individual $i$'s exposure to some "shock" in year $t$. Each individual experiences at most one treatment: individual 1 in 2004, individual 2 never, individual 3 in 2003, and individual 4 in 2006. For example:

Consider individual 1. His exposure to the shock is 0 until treatment occurs in 2004, when his exposure has an intensity of 0.3. In the years after treatment, his exposure becomes 0.4, then 0.42, then 0.2, and it dies out in 2008.

Individual 2 is "never treated".

Individual 3 has a constant exposure until he becomes treated in 2003, where his exposure jumps to 0.5. As a result of this treatment, his exposure then fluctuates around until the end of the sample period 2009.

Individual 4 has a fluctuating exposure until he becomes treated in 2006, where his exposure falls to 0.1, then fluctuates until the end of the sample period 2009.

My first question is whether Equation (1) is an appropriate generalized DD equation that I can use to estimate the "treatment" effect?

My second question is how can I estimate a dynamic period by period coefficient version of Equation (1)? For example, in the usual case where treatment is staggered but binary (and where the treatment variable is 0 in the "pre treatment" period), one can easily estimate a dynamic version by using period by period dummy variables such as shown here. However, how can I do that here? The continuous treatment variable (CT) is not always 0 in the pre-treatment period, nor does it take on a constant value post-treatment either.

EDIT: Some more information on the "treatment". Each treatment is the introduction of a new regulation. Each individual is treated based on how much he spends in a given year: if he spends more, his intensity of treatment (i.e., exposure to the regulatory shock) is higher; if he spends less, it is lower. CT is bounded between 0 and 1. The first regulation took effect in 1992, before the start of the sample period, and affected ALL individuals at the same time. After 1992, a "newer" version of the regulation came into effect for each individual, but its introduction is staggered across individuals. The difference between the "newer" regulation and the initial one in 1992 is that the amount of money one spends translates into a different amount of treatment intensity. For example, if someone spends \$1 under the 1992 regulation, then the treatment intensity takes a value of, say, 0.1, but under the newer regulation, \$1 may translate into only 0.01 (these are just hypothetical values I made up to illustrate the difference in the regulation). Let me explain in detail how CT varies for each individual:

For individual 1, under the 1992 regulation, he spends nothing in 2000, 2001, 2002, and 2003, so his CT is 0. For him, the new regulation is enforced in 2004; he happens to spend some money in 2004, a different amount in 2005, and so on. That's why his CT fluctuates from 2004 to 2007. He spends nothing in 2008 and 2009, so his CT is 0.

Individual 2 never spends anything throughout the entire sample period, so his CT is always 0.

Individual 3 spends a constant amount of money in each of the years 2000, 2001, and 2002, so under the 1992 regulation, his CT is always 0.1. It does not fluctuate because he spends the same amount in each of these three years. But the newer regulation comes into effect for him in 2003. He spends varying amounts of money until the end of the sample period, which is why his CT fluctuates after 2003.

Individual 4 spends a varying amount of money each year from 2000 to 2005. Under the 1992 regulation, his CT fluctuates around. But the newer regulation for him comes into effect in 2006. Again he spends a varying amount of money until the end of the sample period, so his CT fluctuates until 2009.

Thomas Bilach
TeTs
  • Is individual 4 experiencing some exposure to the treatment given the fluctuation in values pre-2006? Why do some have absolutely 0 exposure pre-event, while others jump around? I suppose with a little more knowledge of this “shock” I can offer better guidance. – Thomas Bilach Mar 24 '23 at 13:06
  • Thanks @ThomasBilach! I edited my original post to include more information on this treatment. I would love to hear your opinion on this given your expertise in diff in diff! – TeTs Mar 25 '23 at 07:05
  • What is your effect window? Since you want to investigate some anticipatory or delayed response to your treatment, how many of these period-specific effects do you want? – Thomas Bilach Apr 03 '23 at 22:53
  • I would like a window of 4 years before to 4 years after the treatment. – TeTs Apr 05 '23 at 03:53

1 Answer

To be clear, my response assumes you have a single event with a known and varying treatment intensity. In some applications, say a series of minimum wage hikes or exposure to different concentrations of particulate matter in the air, we can treat the increases/decreases over time as multiple events, especially when we're dealing with multiple level changes over time. Your data doesn't appear to fit this pattern entirely, but there are some similarities. You know the event years and the precise exposure post-event. Thus, you're exploiting variation in treatment timing and intensity within and across individuals.

> My first question is whether Equation (1) is an appropriate generalized DD equation that I can use to estimate the "treatment" effect?

Yes.

The variable $CT_{it}$ is a policy variable and represents a treatment status change. Include it as you would any other variable.
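
To make the mechanics of Equation (1) concrete, here is a minimal sketch that recovers $\delta$ by two-way demeaning. The $CT_{it}$ values are copied from the question's table. The outcome $y_{it}$, the fixed effects, and the "true" $\delta = 2$ are all made up for illustration (the question does not provide an outcome), and noise is omitted so the estimate can be checked by eye. Two-way demeaning reproduces the dummy-variable estimate only because this toy panel is balanced.

```python
import numpy as np

# CT values from the question's table: rows = individuals 1-4, columns = years 2000-2009.
ct = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.3, 0.4, 0.42, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.1, 0.1, 0.5, 0.6, 0.4, 0.2, 0.1, 0.3, 0.1],
    [0.3, 0.2, 0.4, 0.2, 0.3, 0.5, 0.1, 0.12, 0.13, 0.14],
])

# HYPOTHETICAL outcome, not from the question: invented individual effects,
# invented year effects, and an invented true delta of 2.0, with no noise.
true_delta = 2.0
gamma = np.array([1.0, -0.5, 0.3, 2.0])[:, None]   # individual fixed effects
lam = np.linspace(0.0, 0.9, 10)[None, :]           # year fixed effects
y = gamma + lam + true_delta * ct


def two_way_demean(x):
    """Remove individual means and year means, then add back the grand mean."""
    return (x
            - x.mean(axis=1, keepdims=True)
            - x.mean(axis=0, keepdims=True)
            + x.mean())


# After demeaning, delta is the slope of a regression through the origin.
ct_dm = two_way_demean(ct)
y_dm = two_way_demean(y)
delta_hat = (ct_dm * y_dm).sum() / (ct_dm ** 2).sum()

print(f"estimated delta: {delta_hat:.6f}")
```

This sketch gives a point estimate only. With real data you would also need standard errors, typically clustered at the individual level, which it does not compute.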

> My second question is how can I estimate a dynamic period by period coefficient version of Equation (1)?

Assuming you have a lot of observations with treatment histories as defined above, then individuals $i$ may experience no exposure, constant exposure, and/or fluctuating exposure over time, even before the event starts. For some units, the exposure isn't even permanent; it appears the intensity "dies out" as you suggest with unit 1 so long as spending goes to zero. In the extreme case as with unit 4, the variable is a constantly changing continuous variable pre- and post-event.

To achieve "time-varying" effects, you need to substitute different time configurations into the model. When we lead and/or lag in a setting like this, we lose a period for each lead or lag. Depending upon the context, we can get away with replacing a missing value with 0, but this is only justified in settings where we know a priori that individuals do not have any treatment intensity before the policy (or after it ends); they are essentially untreated pre-event.

In a setting with a continuous policy variable (i.e., constantly changing numeric variable), here is one way to go about estimating time-varying effects:

$$ \begin{array}{ccccccccc}
i & t & CT_{it} & start & CT_{i,t+2} & CT_{i,t+1} & CT_{it} & CT_{i,t-1} & CT_{i,t-2} \\ \hline
1 & 2000 & 0 & 2004 & 0 & 0 & 0 & \text{NA} & \text{NA} \\
1 & 2001 & 0 & 2004 & 0 & 0 & 0 & 0 & \text{NA} \\
1 & 2002 & 0 & 2004 & 0.3 & 0 & 0 & 0 & 0 \\
1 & 2003 & 0 & 2004 & 0.4 & 0.3 & 0 & 0 & 0 \\
1 & 2004 & 0.3 & 2004 & 0.42 & 0.4 & 0.3 & 0 & 0 \\
1 & 2005 & 0.4 & 2004 & 0.2 & 0.42 & 0.4 & 0.3 & 0 \\
1 & 2006 & 0.42 & 2004 & 0 & 0.2 & 0.42 & 0.4 & 0.3 \\
1 & 2007 & 0.2 & 2004 & 0 & 0 & 0.2 & 0.42 & 0.4 \\
1 & 2008 & 0 & 2004 & \text{NA} & 0 & 0 & 0.2 & 0.42 \\
1 & 2009 & 0 & 2004 & \text{NA} & \text{NA} & 0 & 0 & 0.2 \\ \hline
2 & 2000 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2001 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2002 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2003 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2004 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2005 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2006 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2007 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2008 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\
2 & 2009 & 0 & \text{Inf} & 0 & 0 & 0 & 0 & 0 \\ \hline
3 & 2000 & 0.1 & 2003 & 0.1 & 0.1 & 0.1 & \text{NA} & \text{NA} \\
3 & 2001 & 0.1 & 2003 & 0.5 & 0.1 & 0.1 & 0.1 & \text{NA} \\
3 & 2002 & 0.1 & 2003 & 0.6 & 0.5 & 0.1 & 0.1 & 0.1 \\
3 & 2003 & 0.5 & 2003 & 0.4 & 0.6 & 0.5 & 0.1 & 0.1 \\
3 & 2004 & 0.6 & 2003 & 0.2 & 0.4 & 0.6 & 0.5 & 0.1 \\
3 & 2005 & 0.4 & 2003 & 0.1 & 0.2 & 0.4 & 0.6 & 0.5 \\
3 & 2006 & 0.2 & 2003 & 0.3 & 0.1 & 0.2 & 0.4 & 0.6 \\
3 & 2007 & 0.1 & 2003 & 0.1 & 0.3 & 0.1 & 0.2 & 0.4 \\
3 & 2008 & 0.3 & 2003 & \text{NA} & 0.1 & 0.3 & 0.1 & 0.2 \\
3 & 2009 & 0.1 & 2003 & \text{NA} & \text{NA} & 0.1 & 0.3 & 0.1 \\ \hline
4 & 2000 & 0.3 & 2006 & 0.4 & 0.2 & 0.3 & \text{NA} & \text{NA} \\
4 & 2001 & 0.2 & 2006 & 0.2 & 0.4 & 0.2 & 0.3 & \text{NA} \\
4 & 2002 & 0.4 & 2006 & 0.3 & 0.2 & 0.4 & 0.2 & 0.3 \\
4 & 2003 & 0.2 & 2006 & 0.5 & 0.3 & 0.2 & 0.4 & 0.2 \\
4 & 2004 & 0.3 & 2006 & 0.1 & 0.5 & 0.3 & 0.2 & 0.4 \\
4 & 2005 & 0.5 & 2006 & 0.12 & 0.1 & 0.5 & 0.3 & 0.2 \\
4 & 2006 & 0.1 & 2006 & 0.13 & 0.12 & 0.1 & 0.5 & 0.3 \\
4 & 2007 & 0.12 & 2006 & 0.14 & 0.13 & 0.12 & 0.1 & 0.5 \\
4 & 2008 & 0.13 & 2006 & \text{NA} & 0.14 & 0.13 & 0.12 & 0.1 \\
4 & 2009 & 0.14 & 2006 & \text{NA} & \text{NA} & 0.14 & 0.13 & 0.12 \\ \hline
\end{array} $$
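
The lead and lag columns in the table above can be built mechanically with a grouped shift. The following is a rough sketch using pandas. The column names (`CT_lead2`, `CT_lag1`, and so on) are my own labels, not anything standard. It leaves the endpoints missing for units 1, 3, and 4, and fills them with 0 only for the never-treated unit 2, which is how the table treats that unit.

```python
import pandas as pd

# CT values from the question's table, one list per individual, years 2000-2009.
ct_by_unit = {
    1: [0, 0, 0, 0, 0.3, 0.4, 0.42, 0.2, 0, 0],
    2: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    3: [0.1, 0.1, 0.1, 0.5, 0.6, 0.4, 0.2, 0.1, 0.3, 0.1],
    4: [0.3, 0.2, 0.4, 0.2, 0.3, 0.5, 0.1, 0.12, 0.13, 0.14],
}
rows = [(i, 2000 + k, float(v))
        for i, values in ct_by_unit.items()
        for k, v in enumerate(values)]
df = pd.DataFrame(rows, columns=["i", "t", "CT"]).sort_values(["i", "t"])

# Shift within each individual so values never leak across units.
# shift(-k) pulls a future value back (a lead); shift(k) pushes a past value forward (a lag).
by_unit = df.groupby("i")["CT"]
shifted_cols = []
for k in (1, 2):
    df[f"CT_lead{k}"] = by_unit.shift(-k)
    df[f"CT_lag{k}"] = by_unit.shift(k)
    shifted_cols += [f"CT_lead{k}", f"CT_lag{k}"]

# Unit 2 is never treated; as in the table, its endpoint values are set to 0.
# This fill is only defensible when zero exposure outside the sample is known a priori.
never_treated = df["i"] == 2
df.loc[never_treated, shifted_cols] = df.loc[never_treated, shifted_cols].fillna(0.0)

print(df[df["i"] == 1].to_string(index=False))
```

Regressing the outcome on these columns plus the individual and year fixed effects then yields one coefficient per lead or lag. Rows with a missing lead or lag drop out of that regression unless you fill or bin them.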

Please note the endpoints. If you do not know individual spending beyond these limits, then they should be treated as missing (i.e., $\text{NA}$ = "Not Available"). With more and more leads and/or lags, a researcher will either restrict the effect window by binning at the last estimated lead/lag or drop unit-time observations beyond the effect window. By "binning" we assume constant treatment effects beyond the effect window in one, or both, directions. For example, in the case with a binary treatment variable, a final binned lag just changes from 0 to 1 in that period then stays equal to 1 for the remainder of the panel. In the continuous case, a binned lag is forward cumulated (e.g., $CT_{it} = CT_{it} + \Delta CT_{i, t-1}$). I rarely see researchers explain how they treat the endpoints in their papers, especially in cases with continuous policy variables, so I can't even direct you to a good resource.
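
For the binary case only, the binning described above can be sketched as follows. The function name `event_dummies` and the two-period window are illustrative choices of mine; the start years come from the `start` column of the table. The binned lead is written as the mirror image of the binned lag, which is an assumption about how binning "in both directions" would be coded. The continuous analogue is not attempted here.

```python
import math

# Event years from the "start" column of the table; unit 2 is never treated.
start = {1: 2004, 2: math.inf, 3: 2003, 4: 2006}


def event_dummies(t, s, window=2):
    """Binary event-time dummies for year t and event year s.

    Keys are relative periods -window..window. Interior periods switch on
    only in that exact relative year. The last lag (key = window) switches
    on at s + window and stays on; the first lead (key = -window) is on for
    every year at or before s - window. Never-treated units get all zeros.
    """
    dummies = {}
    for k in range(-window, window + 1):
        if math.isinf(s):
            dummies[k] = 0
        elif k == -window:
            dummies[k] = int(t - s <= k)   # binned lead
        elif k == window:
            dummies[k] = int(t - s >= k)   # binned lag
        else:
            dummies[k] = int(t - s == k)
    return dummies


# Unit 1 (event in 2004): the binned lag turns on in 2006 and stays on through 2009.
for year in range(2000, 2010):
    print(year, event_dummies(year, start[1]))
```

In an actual regression one of these dummies is usually omitted as the reference period, as discussed in the comments below.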

In most event study applications, it's not uncommon to simply ignore the estimated effects beyond the effect window. If you're working with a panel that is wider than it is long, then I would consider a shorter effect window. On the other hand, if you're working with a much longer time series, then losing a few periods at either endpoint isn't going to matter much.
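
As a back-of-the-envelope check on that trade-off, the sketch below counts how many years per individual keep a complete set of leads and lags when the endpoints are left missing. The helper name `complete_periods` is made up for this example; the 10-year panel length matches the 2000 to 2009 sample in the question.

```python
def complete_periods(n_years: int, n_leads: int, n_lags: int) -> int:
    """Years per unit with no missing lead or lag, when endpoints are left as NA.

    Each lead removes one year from the end of the panel and each lag
    removes one year from the start.
    """
    return max(n_years - n_leads - n_lags, 0)


# The question's sample runs 2000-2009, i.e. 10 years per individual.
for window in (1, 2, 4):
    usable = complete_periods(10, window, window)
    print(f"{window} leads and {window} lags: {usable} complete years per unit")
```

With two leads and two lags this gives six complete years per unit, matching 2002 to 2007 for unit 1 in the table. The four-year window requested in the comments would leave only two.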

Thomas Bilach
  • Thank you so much for this insightful answer, you have no idea how helpful it is. I have two follow up questions: (1) For simplicity, say I just look at a 2 year window (ignore binning), and I want to plot the coefficients of the time-varying effects for each of the two years before and after relative to the event, i.e., the $\delta$ coefficients given in your answer here: https://stats.stackexchange.com/questions/526787/how-to-plot-the-graph-or-perform-a-formal-test-of-parallel-trends-for-generalize?rq=1. Is the effect for the two year prior to the event (i.e., the $\delta_{-2}$) ... (cont.) – TeTs Apr 18 '23 at 23:54
  • given by the coefficient on $CT_{i, t+2}$ or $CT_{i, t-2}$? Maybe I am confused, but the $CT_{i, t+2}$ you provided here seems like the continuous generalization of the $d_{kt}^{-2}$ binary variable given in your previous answer. Is this just notational differences? If I understand correctly, the coefficient for the effect for period -2 should be given by $CT_{i, t+2}$, and the effect for the period -1 should be given by $CT_{i, t+1}$, right? Also, usually in the binary case, we leave out period -1 as the "baseline omitted case", but here do we omit $CT_{i, t+1}$? We shouldn't right? – TeTs Apr 19 '23 at 00:02
  • question (2): you mentioned "exposure to different concentrations of particulate matter in the air, we can treat the increases/decreases over time as multiple events". Let's assume now there is no "single" event and that my data represents an individual exposure to PM concentrations for a given year. Since there is no "precise" event date, because a person's exposure to PM just fluctuates yearly, how would I estimate a time-varying effects specification now? What would be the analogous table you provided above for this "multiple" events version? Or, do I need to define something... (cont) – TeTs Apr 19 '23 at 00:08
  • as the event first? For example, if the absolute value between two consecutive years PM exposure is above a certain threshold, that is considered "treatment", then I construct a similar table as you did above using these defined events? – TeTs Apr 19 '23 at 00:09
  • The leads/lags assess anticipatory/delayed effects. Yes, these are notational differences. I don't presume you're going to saturate the model with leads and lags. With a limited effect window, you don't have to drop the first lead. – Thomas Bilach Apr 19 '23 at 00:10
  • In a setting with a constantly changing continuous policy variable where each increase/decrease represents multiple shocks/events, then you should proceed as recommended. On the other hand, sometimes you're faced with a single event and the continuous policy variable varies only across units $i$. In my previous answer, the time-varying effects (event study variables) for, say, Belgium would equal the intensity experienced in that year. For example, say for individual $i$ a single shock of intensity 0.5 hits Belgium. In that case, replace all the 1's with 0.5, 0 otherwise. – Thomas Bilach Apr 19 '23 at 00:24
  • Note how I assume the changes represent multiple events. As a solution to real world applications where treatments occur repeatedly and are of different intensities across units and/or time, you could dichotomize the treatment variable and use a dummy that "switches on" for only very large policy changes. It's a bit arbitrary in my estimation, but it can work. – Thomas Bilach Apr 19 '23 at 00:29
  • Thank you. When you say "saturate the model with leads and lags", isn't estimating the time-varying version with $CT_{i,t+2}$, $CT_{i,t+1}$, etc already saturating the model with leads and lags? Do you mean something else? Could you show me? – TeTs Apr 19 '23 at 00:39
  • Saturating a model means including all possible leads and lags of the policy variable. In a case with a dichotomous treatment variable, you can't include all years; we need to omit one year as a reference. – Thomas Bilach Apr 19 '23 at 00:43
  • Interestingly, check out Section 3.2 of this discussion paper. It closely resembles your setting. Try to focus on the event study matrices. – Thomas Bilach Apr 19 '23 at 00:46
  • Thank you for the reference! For a binary treatment variable, the fully saturated model is just like your previous answer. But in my case here, fully saturating the model is impossible right? (because of the constantly changing continuous treatment). – TeTs Apr 19 '23 at 00:48
  • Take unit 1, for example. Now let's say we don't know individual 1's spending patterns before 2000. How many missing values would appear by the fifth lag? Similarly, how many missing values would appear by the fourth lead? In short, you wouldn't have enough data to estimate your treatment effect. – Thomas Bilach Apr 19 '23 at 02:20
  • @ThomasBilach, saw this answer of yours after you commented on my recent question. You mentioned the example of minimum wage hikes over the years at the beginning. Does this also belong to the realm of DiD with continuous treatment? – funcard Aug 17 '23 at 04:22
  • @funcard Yes, it is appropriate to model the minimum wage changes over time with a continuous policy variable. – Thomas Bilach Aug 17 '23 at 07:23
  • @ThomasBilach, thanks. Trying to figure out how to deal with treatments that coincide across the years. For example, some units will have repeated minimum wage changes across the years. So, it's not just that units have different levels of treatment, but might have the treatment of same or different magnitude year after year. Could you shed some more light on how to have credible identification when treatments coincide? Not sure if I should ask a separate question instead of commenting here... – funcard Aug 17 '23 at 20:05