3

When you want to use the IV (instrumental variable) estimator, you typically first test if you have a strong instrument.

You do so by regressing the (endogenous) predictor on the instrument. From that regression, you can calculate the F statistic for the instrument.

My question is: do you have to include an intercept in this regression model? In all the formulas I find, there is no intercept included.

But wouldn't a regression without an intercept be meaningless? Below is the regression of my endogenous variable on the instrument with no intercept. Doesn't this lead to a wrong conclusion about the regression coefficient?

Thanks in advance.

[Image: regression output of the endogenous variable on the instrument, without an intercept]

Kasper
  • 3,399

1 Answer

5

It is rarely advisable to exclude the intercept unless you have very strong theoretical reasons for doing so. Suppose your first-stage regression of the endogenous variable $y_{2}$ on the instrument vector $Z$ is $$y_{2} = \alpha + Z' \beta + \epsilon$$ and you omit the constant $\alpha$. Stacking the observations, so that $Z$ denotes the matrix of instruments and $\iota$ a vector of ones, the expected value of your coefficient estimate for the instruments will be $$ \begin{align} E[\hat\beta] &= E[(Z'Z)^{-1}Z'y_{2}] \\ &= E[(Z'Z)^{-1}Z'(\iota\alpha + Z\beta + \epsilon)] \\ &= E[(Z'Z)^{-1}Z'\iota\alpha + (Z'Z)^{-1}Z'Z\beta + (Z'Z)^{-1}Z'\epsilon] \\ &= E[(Z'Z)^{-1}Z'\iota]\,\alpha + \beta \end{align} $$ where the term $(Z'Z)^{-1}Z'\epsilon$ vanishes in expectation because the instruments are assumed to be exogenous, $E[\epsilon \mid Z]=0$. Hence $\hat\beta$ will be biased if $\alpha \neq 0$, and this holds even if the estimated $\alpha$ is not significantly different from zero. In that case the F-test on the excluded instruments will likewise be invalid.
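To make the bias concrete, here is a minimal simulation sketch. The parameter values (`alpha = 2.0`, `beta = 0.5`), the sample size, and the instrument's distribution are all made up for illustration. The instrument is given a nonzero mean on purpose so that the bias term is nonzero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
alpha, beta = 2.0, 0.5  # hypothetical true first-stage parameters

# Instrument with nonzero mean and the endogenous regressor it drives
z = rng.normal(loc=1.0, scale=1.0, size=n)
y2 = alpha + beta * z + rng.normal(size=n)

# First stage WITHOUT an intercept: regress y2 on z only
Z_no_const = z.reshape(-1, 1)
beta_no_const = np.linalg.lstsq(Z_no_const, y2, rcond=None)[0][0]

# First stage WITH an intercept: regress y2 on [1, z]
Z_const = np.column_stack([np.ones(n), z])
alpha_hat, beta_const = np.linalg.lstsq(Z_const, y2, rcond=None)[0]

print(f"no intercept:   beta_hat = {beta_no_const:.3f}")
print(f"with intercept: beta_hat = {beta_const:.3f}, alpha_hat = {alpha_hat:.3f}")
```

Under these assumed values, the no-intercept estimate should settle near $\beta + \alpha\,E[z]/E[z^2] = 0.5 + 2 \cdot 1/2 = 1.5$, whereas the estimate with an intercept should be close to the true $0.5$.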

In books and articles, a first stage is usually specified as $$y_2 = X'\beta + Z'\pi + \nu$$ where it is implicitly assumed that the vector of covariates $X$ includes a constant, i.e. $X = (1, x_1, x_2, \ldots, x_k)$. So, long story short: there should be an intercept in both the first and second stage. The covariates $X$ should also be the same in both stages; I omitted them from the bias derivation above only for simplicity.
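As a rough sketch of the F-test on the excluded instruments, the statistic can be computed by comparing the first stage with and without $Z$, keeping the constant in $X$ in both fits. The helper name `first_stage_f` and the simulated data are hypothetical, and the formula below is the classical homoskedastic F statistic, not a robust variant.

```python
import numpy as np

def first_stage_f(y2, X, Z):
    """Homoskedastic F statistic for H0: pi = 0 in y2 = X b + Z pi + v.

    X must be a 2-D array that already contains the constant column.
    """
    n = len(y2)
    XZ = np.column_stack([X, Z])

    def rss(A):
        # Residual sum of squares from an OLS fit of y2 on A
        coef = np.linalg.lstsq(A, y2, rcond=None)[0]
        resid = y2 - A @ coef
        return resid @ resid

    rss_restricted = rss(X)     # instruments excluded
    rss_unrestricted = rss(XZ)  # instruments included
    q = XZ.shape[1] - X.shape[1]  # number of excluded instruments
    return ((rss_restricted - rss_unrestricted) / q) / (
        rss_unrestricted / (n - XZ.shape[1])
    )

# Simulated example with made-up coefficients
rng = np.random.default_rng(1)
n = 1_000
x1 = rng.normal(size=n)
z = rng.normal(size=n)
y2 = 1.0 + 0.3 * x1 + 0.5 * z + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1])  # covariates including the constant
F = first_stage_f(y2, X, z)
print(f"first-stage F on the excluded instrument: {F:.1f}")
```

Because the constant sits in `X`, it appears in both the restricted and the unrestricted regression, so the F statistic measures only the additional explanatory power of the instrument.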

Andy
  • 19,098