1

Recently, I ran a generalised DID regression following this paper:

$Y_{it} = \alpha + \beta (\text{Leniency Law})_{kt} + \delta X_{ikt} + \theta_t + \gamma_i + \epsilon_{it}$ (1)

I accidentally ran the regression without the intercept ($\alpha$), and my senior friend told me that running such an equation without an intercept is really dangerous unless there is good backing in the literature. Why is it so dangerous in this case?

Update: I am adding the regression output, since @chan1142 suspected collinearity.

[Image: Stata regression output]

Phil Nguyen
  • 3
    Did you use xtreg, fe? Or, what was your command (stata or R)? Please explain exactly how you ran the regression without $\alpha$. – chan1142 Jun 08 '21 at 07:43
  • 1
    @chan1142 yes, it should be xtreg y x1 x2 i.year, fe or areg y x1 x2 i.year, a(type) where type is firm identification – Phil Nguyen Jun 08 '21 at 07:47
  • 2
    If you ran xtreg, fe, it's not an issue. You've done it correctly. (In your model, $\alpha$ and $\gamma_i$ are not separately identified and Stata reports $\hat\alpha$ as the sample mean of the estimates of $\alpha+\gamma_i$. Stata does everything for you correctly.) – chan1142 Jun 08 '21 at 07:52
  • I wonder whether that is the case: when I run xtreg y x1 x2 i.year, fe, the output does not show me the intercept, which is why I am so hesitant – Phil Nguyen Jun 08 '21 at 07:57
  • 1
    Then, I would first worry about collinearity among the explanatory variables. There can be some strange things involved. I wonder if you are willing to show the regression outputs (Stata outputs). – chan1142 Jun 08 '21 at 08:02
  • @chan1142 I have added the result. Thank you so much for having a look at it, much appreciated – Phil Nguyen Jun 08 '21 at 08:05
  • 1
    Thanks. I see the _cons row. That's the intercept; the dummy variable for the reference group (probably TYPE2 = 1) is omitted in order to avoid the dummy variable trap. That said, I see that pri_ove_ern is almost omitted. I would look into that variable. – chan1142 Jun 08 '21 at 08:09
  • @chan1142 thank you so much for spotting that. Can I ask why you say pri_ove_ern is almost omitted? Is it because of the very small coefficient? Much appreciated – Phil Nguyen Jun 08 '21 at 08:11
  • 1
    Yes. The estimate and the standard error are very small. If they are small because the scale is huge (you can see it by reading its standard deviation), that's OK, but I would divide the variable by, for example 10^12 or similar so that the reported numbers are nice. – chan1142 Jun 08 '21 at 08:16
  • @chan1142 what a hands-on suggestion. So, normally we look at the coefficient and the standard error to decide whether to suspect multicollinearity, don't we? And I am sorry that I did not catch the idea of "divide the variable by, for example 10^12". Could you please clarify it a bit more? – Phil Nguyen Jun 08 '21 at 08:20
  • 2
    You change the unit of measurement, by which you change the scale of the coefficient. It’s only a matter of cosmetics. – chan1142 Jun 08 '21 at 08:27
  • I see, thank you so much for your help and suggestion so far – Phil Nguyen Jun 08 '21 at 08:27
  • 3
    @chan1142 you could make your comments into an answer. – Jesper Hybel Jun 08 '21 at 08:47
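
To illustrate the rescaling suggestion from the comments above, here is a minimal Python sketch (not Stata, and not the asker's data). The numbers and the helper `simple_ols` are made up for illustration only. It fits a one-regressor OLS with an intercept before and after dividing the regressor by $10^{12}$, and compares the slope, its standard error, and the $t$ statistic.

```python
import math

def simple_ols(x, y):
    """OLS of y on a constant and x; returns (slope, std. error, t statistic)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # classical (homoskedastic) standard error
    return slope, se, slope / se

# Hypothetical regressor measured in very small units, so its values are huge.
x_raw = [1e12, 2e12, 3e12, 4e12, 5e12, 6e12]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

# Same regressor after changing the unit of measurement.
x_scaled = [xi / 1e12 for xi in x_raw]

b_raw, se_raw, t_raw = simple_ols(x_raw, y)
b_scaled, se_scaled, t_scaled = simple_ols(x_scaled, y)

print("raw:   ", b_raw, se_raw, t_raw)
print("scaled:", b_scaled, se_scaled, t_scaled)
```

The slope and the standard error are both multiplied by $10^{12}$ after rescaling, while the $t$ statistic stays the same, which is why the change is only cosmetic.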

1 Answer

4
  1. The Stata outputs say that you did not omit the intercept. No worries about the intercept. The _cons row is for the intercept.

  2. As you know, Stata's areg and xtreg ..., fe will give you identical coefficient estimates except for the intercept. The constant terms differ between areg and xtreg. That's because areg imposes the constraint that $\gamma_1 = 0$ and xtreg, fe the constraint that $n^{-1} \sum_{i=1}^n \gamma_i = 0$, which is to say that areg's $\alpha$ is $\alpha_1$ and xtreg, fe's $\alpha$ is $n^{-1} \sum_{i=1}^n \alpha_i$, where $\alpha_i = \alpha + \gamma_i$.

  3. The statistics for the pri_ove_ern variable are not displayed in a pleasant way; $1.04\times 10^{-12}$ and $5.17\times 10^{-12}$ are hard to read. You can divide the variable by, e.g., $10^{12}$ (which means the unit is scaled up by a factor of one trillion; does that make sense?). Then the reported estimate will be 1.04XXXX and the reported standard error will be 5.17XXXX, which look nicer. Though it's only a matter of cosmetics, I believe it is important too. BTW, at first I thought that there might be collinearity issues involved, but now I believe it's only about the unit of measurement (because the $t$ value is reasonable); still, it would be a good idea to look into the variable in your data set.

  4. (On excluding the intercept) In general, it is not a good idea to exclude the intercept by using Stata's nocons option unless you know what you are doing (see 6 below for an example). For the model $y=\beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$ (with all good assumptions), for example, if you run reg y x1 x2, nocons, you are imposing the constraint that $\beta_0=0$, which implies that $E(y)=\beta_1 E(x_1) + \beta_2 E(x_2)$. If everyone agrees that $E(y)=\beta_1 E(x_1) + \beta_2 E(x_2)$ (e.g., you know that $E(y)= E(x_1)= E(x_2)=0$), this constraint is acceptable, but otherwise your regression will generally yield an inconsistent slope estimator. Even if $\beta_0$ is really 0, the gain from imposing the constraint (i.e., from excluding the intercept) is usually trivial.

  5. Things can be quite complicated for models with many dummy variables (and interaction terms). In your model, you have the period dummies and the individual fixed effects. If some of the $X$ variables show little variability across $i$ or over $t$, then strange things can happen. We should always watch out for collinearity, though it is often alright anyway.

  6. There are some cases in which you should exclude the intercept. For example, for the panel model $y_{it} = \alpha_i + \beta x_{it} + u_{it}$ (without period dummies), the first difference (FD) regression is OLS of the model $\Delta y_{it} = \beta \Delta x_{it} + \Delta u_{it}$, where $\Delta y_{it} = y_{it} - y_{it-1}$, etc., and $\Delta \alpha_i = 0$. The differenced equation has no intercept, and thus you should run reg d.y d.x, nocons vce(cluster TYPE2) for the FD regression. If you omit the nocons option, it means your original model contains a linear time trend (yes, a linear trend, not time dummies), because a term $\lambda t$ in levels becomes the constant $\lambda$ after differencing. This is also related to Anderson and Hsiao's instrumental variable estimation of dynamic panel data models.
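
As a rough illustration of point 4, the following Python sketch (an illustration only, with made-up noiseless data and a single regressor instead of $x_1$ and $x_2$) compares OLS with an intercept, OLS forced through the origin, and OLS through the origin after centering both variables so that their sample means are zero. The function names are hypothetical helpers, not Stata commands.

```python
def slope_with_intercept(x, y):
    """Slope from OLS of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def slope_through_origin(x, y):
    """Slope from OLS of y on x with the intercept constrained to zero."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Made-up data with a true intercept of 3 and a true slope of 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0 + 2.0 * xi for xi in x]

b_with = slope_with_intercept(x, y)
b_origin = slope_through_origin(x, y)

# Center both variables so that their sample means are zero.
mx, my = sum(x) / len(x), sum(y) / len(y)
x_centered = [xi - mx for xi in x]
y_centered = [yi - my for yi in y]
b_centered = slope_through_origin(x_centered, y_centered)

print("with intercept:          ", b_with)
print("through origin:          ", b_origin)
print("through origin, centered:", b_centered)
```

With a nonzero true intercept, the through-origin slope moves away from 2 because it has to absorb the missing constant. Once both variables have mean zero, dropping the intercept is harmless and the slope of 2 is recovered.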

chan1142
  • Thank you so much, @chan1142. Can I ask a couple of questions? (1). What are $\gamma_1$ and $\alpha_1$ in your second bullet then? – Phil Nguyen Jun 11 '21 at 02:49
  • 1
    In your model, the individual effects are $\alpha + \gamma_i$ (your notations). $\gamma_1$ is for $i=1$, and $\alpha_i$ is defined as $\alpha + \gamma_i$ so $\alpha_1 = \alpha + \gamma_1$. Here I presumed that the reference case is $i=1$. – chan1142 Jun 11 '21 at 07:10
  • @chan1142, in your experience, will including nocons change the coefficients of the other independent variables? – Phil Nguyen Jun 17 '21 at 22:28
  • 1
    It depends on how other dummy variables are handled by the nocons option. Without nocons, i.yr includes dummy variables except for the reference year. With nocons, if all year dummies are included (including one for the reference year), then $\beta$ won't change. But if the intercept is excluded and still the dummy for the base year is excluded, then the results can change. In your model there are also individual fixed effects (a(TYPE2)) and it all depends on how Stata handles the nocons option. You can find it out by running the command. – chan1142 Jun 18 '21 at 10:52
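
The last comment can be checked on a toy example. The Python sketch below is only an illustration under stated assumptions: the data are made up, there are three years and no individual fixed effects, and `ols` is a hypothetical helper that solves the normal equations directly. It does not reproduce how Stata itself handles nocons. It compares the slope on x under three designs: (a) intercept plus dummies for years 2 and 3, (b) no intercept but all three year dummies, and (c) no intercept and the base-year dummy still dropped.

```python
def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry of column c to row c.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Made-up data: three years, three observations per year.
year = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 1.0, 3.0, 5.0]
y = [7.1, 8.8, 11.1, 9.9, 12.2, 13.9, 6.2, 9.9, 13.9]

def dummy(t):
    """Indicator column for year t."""
    return [1.0 if yr == t else 0.0 for yr in year]

d1, d2, d3 = dummy(1), dummy(2), dummy(3)

# (a) intercept + x + dummies for years 2 and 3 (year 1 is the base).
X_a = [[1.0, xi, a2, a3] for xi, a2, a3 in zip(x, d2, d3)]
# (b) no intercept, x + all three year dummies.
X_b = [[xi, a1, a2, a3] for xi, a1, a2, a3 in zip(x, d1, d2, d3)]
# (c) no intercept, x + dummies for years 2 and 3 only.
X_c = [[xi, a2, a3] for xi, a2, a3 in zip(x, d2, d3)]

b_with_cons = ols(X_a, y)[1]
b_all_dummies = ols(X_b, y)[0]
b_base_dropped = ols(X_c, y)[0]

print("(a) intercept, base dummy dropped:   ", b_with_cons)
print("(b) no intercept, all dummies:       ", b_all_dummies)
print("(c) no intercept, base dummy dropped:", b_base_dropped)
```

Designs (a) and (b) span the same column space, so the slope on x is identical. Design (c) forces the base-year observations through the origin, so its slope differs.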