So this has really been bothering me and I was hoping for a (simple!) explanation if possible.
Suppose I've specified a linear regression model: $$ Y = \beta_0 + \beta_1 X + \epsilon $$ And an alternative: $$ Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon $$ And I'm trying to estimate the $\beta$s, say through OLS (I don't think the exact method is relevant).
My question is: what is the exact interpretation of the $\beta$s I am trying to estimate?
The confusion arises from the fact that the population values of $\beta_1$ in the two specifications are presumably different, and this doesn't square with my understanding of the population coefficients.
I had always interpreted each $\beta$ as the partial derivative of $Y$ with respect to $X$ 'in reality'. That is, $\beta_1$ is the change in the expected value of $Y$ if you were to change $X$ while holding the other regressors constant. By providing a better and better specified model, one ensured that the estimate of $\beta_1$ became more accurate (by separating out correlated variables from the error term).
This was important to my understanding: $\beta_1$ was not contingent on the specification of my model. It remained an invariant feature of the population, and it was only the estimator we had for $\beta_1$ ($b_1$) that changed and became more or less accurate depending on the model.
All well and good, but this interpretation doesn't quite work in the example above. Suppose that the relationship between $X$ and $Y$ is curvilinear. If you were restricted to including only $X$ and no higher-order polynomial terms, then presumably the $\beta_1$ that best describes the change in $E[Y]$ given a change in $X$ would be different from the one you would get if you allowed for higher-order terms (as in specification 2).
So say, for argument's sake, the DGP was $$ E[Y] = 1 + 10 X - 2 X^2 $$ where $0<X<2$, so that the quadratic term doesn't dominate. In this case, should the true value of $\beta_1$ in specification 1 be 10? Or, since $X^2$ is not included and the line has to fit that DGP as well as it can, should it be ~6?
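To make the two candidate values concrete, here is a minimal simulation sketch of this example. Several things in it are my own assumptions rather than part of the setup above: $X$ is drawn uniformly on $(0, 2)$, the error is standard normal, and names such as `b_spec1` and `b_spec2` are purely illustrative. Under those assumptions the slope of the best-fitting line is $\mathrm{Cov}(X, Y)/\mathrm{Var}(X) = 10 - 2 \cdot 2 = 6$, which is where the ~6 comes from; a different distribution of $X$ would give a different number.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

n = 200_000
x = rng.uniform(0.0, 2.0, size=n)        # ASSUMPTION: X ~ Uniform(0, 2)
eps = rng.normal(0.0, 1.0, size=n)       # ASSUMPTION: standard normal error
y = 1.0 + 10.0 * x - 2.0 * x**2 + eps    # the DGP from the example

# Specification 1: regress Y on a constant and X only.
X1 = np.column_stack([np.ones(n), x])
b_spec1, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Specification 2: regress Y on a constant, X and X^2.
X2 = np.column_stack([np.ones(n), x, x**2])
b_spec2, *_ = np.linalg.lstsq(X2, y, rcond=None)

# Sample analogue of Cov(X, Y) / Var(X), the slope of the best linear fit.
slope_projection = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print("spec 1 (const, X):      ", b_spec1)
print("spec 2 (const, X, X^2): ", b_spec2)
print("Cov(X, Y) / Var(X):     ", slope_projection)
```

With a sample this large, the coefficient on $X$ should come out close to 6 in specification 1 and close to 10 in specification 2, so OLS seems to be estimating a different population quantity in each case.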
If the answer is ~6, then my understanding that the population coefficients do not depend on the specification goes up in smoke! Please help!
