
in a simple linear regression model $$y=\beta _1 + \beta _2 x + \epsilon$$

We define $x$ to be exogenous if $$E(\epsilon|x)=0$$

I am a bit puzzled as to why this term is called "exogenous", which I intuitively understand to mean something like "not causally influenced by the other variables in the model, including $\epsilon$", since the condition clearly does not mean that $\epsilon$ and $x$ are independent. For example, the variance of $\epsilon$ can be a function of $x$, or $\epsilon$ can have a t-distribution with $x$ degrees of freedom, or something weird like that.
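To make this concrete, here is a quick toy simulation of my own (the choice $\epsilon = x \cdot z$ with standard normal $z$ is just an illustration): the conditional mean of $\epsilon$ is zero at every value of $x$, yet $\epsilon$ is clearly not independent of $x$, since its conditional variance grows with $x$.

```python
import random

random.seed(0)

# For each x in {1, 2, 3}, draw eps = x * z with z ~ N(0, 1):
# E[eps | x] = 0 for every x (mean independence holds),
# but Var(eps | x) = x^2, so eps and x are NOT independent.
n = 200_000
draws = {x: [x * random.gauss(0.0, 1.0) for _ in range(n)] for x in (1, 2, 3)}

for x, eps in draws.items():
    mean = sum(eps) / n
    var = sum(e * e for e in eps) / n
    print(f"x={x}: E[eps|x] is approx {mean:.3f}, Var(eps|x) is approx {var:.2f}")
```

The conditional means all come out near zero, while the conditional variances come out near 1, 4, and 9.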

So what is the intuitive reason that $x$ is called exogenous if that condition applies, rather than independence?

user56834

1 Answer


You are right, and you are touching the Achilles heel of most econometrics textbooks: the conflation of causal and statistical concepts. By itself, $E[\epsilon|x] = 0$ is formally just mean independence. And in fact, any regression error will be mean independent of the covariates: any variable $Y$ can be decomposed as $Y = E[Y|X] + \epsilon$ where $E[\epsilon|X] = 0$.
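To see that this decomposition always works, here is a small sketch (the finite "population" of $(x, y)$ pairs is an arbitrary example of my own): define $\epsilon := y - E[y|x]$, and the conditional mean of $\epsilon$ comes out exactly zero at every value of $x$, whatever the joint distribution looks like.

```python
from collections import defaultdict

# An arbitrary finite "population" of (x, y) pairs; no model assumed.
population = [(0, 1.0), (0, 3.0), (1, 2.0), (1, 10.0), (2, -1.0), (2, 5.0), (2, 8.0)]

# The conditional expectation E[Y | X = x] over this population:
groups = defaultdict(list)
for x, y in population:
    groups[x].append(y)
cond_mean = {x: sum(ys) / len(ys) for x, ys in groups.items()}

# Decompose each y as E[Y|X] + eps, i.e. eps := y - E[Y | X = x].
eps_by_x = defaultdict(list)
for x, y in population:
    eps_by_x[x].append(y - cond_mean[x])

# By construction, E[eps | X = x] = 0 for every x.
for x, eps in sorted(eps_by_x.items()):
    print(f"x={x}: E[eps|x] = {sum(eps) / len(eps):.10f}")
```

So mean independence of the regression error carries no empirical content on its own; it holds by construction.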

Thus, the concept of exogeneity has to go beyond associational ideas, such as mean independence --- it makes sense in a structural framework, where the parameters you want to estimate mean something more than merely a description of the observed distribution. And your intuition about the causal content is on the right track.

The general definition of exogeneity of $X$ is that $X$ is independent of all other unobserved factors that cause $Y$, except those mediated by $X$ itself --- $\epsilon$ is just a shortcut to represent all those unmodeled factors. In a linear structural equation setting, however, this is usually relaxed to requiring only that the mean of $\epsilon$ be independent of $X$, which is weaker but enough to identify the structural coefficient $\beta$ in that setting.
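As a sketch of why the structural reading matters (a hypothetical simulation of my own; the confounder $c$, the true coefficient $\beta = 2$, and the noise scales are all assumptions, not from the question): when an unobserved common cause drives both $x$ and $y$, the population regression slope differs from the structural $\beta$, even though the regression error $y - E[y|x]$ would still be mean independent of $x$ by construction.

```python
import random

random.seed(1)
n = 100_000
beta = 2.0  # true structural effect of x on y

xs, ys = [], []
for _ in range(n):
    c = random.gauss(0, 1)          # unobserved confounder
    x = c + random.gauss(0, 1)      # c causes x, so x is endogenous
    y = beta * x + c + random.gauss(0, 1)  # c also causes y directly
    xs.append(x)
    ys.append(y)

# Population OLS slope: Cov(x, y) / Var(x)
mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var = sum((x - mx) ** 2 for x in xs) / n
slope = cov / var
print(f"true beta = {beta}, OLS slope is approx {slope:.2f}")
```

Here the slope converges to $\text{Cov}(x, y)/\text{Var}(x) = 5/2 = 2.5$ rather than the structural value $2$, so the regression coefficient describes the observed distribution but not the causal effect.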

  • The assumption/definition $\mathbb{E}(\epsilon|x)=0$ is about the random variables $\epsilon$ and $x$, not the residuals $\hat\epsilon$ and sample values of $x$. Conflating the two might confuse readers. – Richard Hardy Nov 06 '18 at 08:39
  • @RichardHardy I’m not talking about samples, we can always define the random variable $\epsilon := y - E[y|x]$, with the population values (not sample values), and $\epsilon$ will always be mean independent of $x$. That’s why for that assumption to be meaningful it needs to make reference to something beyond the observed joint probability distribution, such as $\epsilon := y- E[y|do(x)]$. – Carlos Cinelli Nov 06 '18 at 16:12
  • Thank you for the elaboration. What I am trying to say is that residuals are sample realizations of errors/disturbances/innovations/shocks/..., so it makes sense to talk about dependence between $\epsilon$ and $x$ but not $\hat\epsilon$ and $x$. Hence, I have a problem understanding what exactly you mean by any regression residual will be mean independent of the covariates. Other than that I find your points enlightening. – Richard Hardy Nov 06 '18 at 16:41
  • @RichardHardy ok, I will change it to regression error instead of residual to make it clear it is a population quantity! – Carlos Cinelli Nov 06 '18 at 16:42