I'm doing some research with panel data I have on firm output and investment.

I ran two equations.

  1. $$y=\beta_0+\beta_1x+\beta_2x^2+\mu$$
  2. $$\Delta y=\alpha_0+\alpha_1\Delta x+\epsilon$$

In R, these equations were estimated using the following commands:

   1) lm(output~investment+I(investment^2))
   2) lm(diff(output)~diff(investment))

The first equation was statistically significant at the 1% level. With the first-differenced data, however, significance was lost, and I ended up with rather high p-values, as if there were no relationship between the two variables.

How should I interpret these results (i.e., is there a relationship between investment and output in this data set), and which regression should I use in my research?

EconJohn
  • Have you tested for cointegration? The first relation might be spurious if there is no cointegration. Also, where is the investment squared in the second equation? – luchonacho Sep 13 '17 at 06:13
  • @luchonacho I didn't add a quadratic term in the second equation because I saw that there was no relationship with the differenced investment variable. I did not test for co-integration. – EconJohn Sep 13 '17 at 15:50
  • Including an intercept in the second equation corresponds to including a linear time trend in the first equation. Now you have one but not the other. – Richard Hardy Sep 13 '17 at 18:37
  • This seems very similar to your previous question here https://economics.stackexchange.com/questions/17695/interpretation-of-a-differenced-regression?rq=1. – Adam Bailey Sep 13 '17 at 20:51
  • @AdamBailey that one is not using panel data. – EconJohn Sep 13 '17 at 20:58

1 Answer

It's hard to say. Presumably you think you have some fixed (unobservable) omitted variables, which is why you ran fixed effects (FE) / first differences. So I see one of two things going on here:

  1. Fixed omitted variables drove the significance in the first regression, in which case there is no significant effect once they are differenced out. Do you have fixed omitted variables? If not, go with the first regression instead.

  2. A quadratic model fits the data much better, and hence you have significance in the first model but not the second. In that case, you probably do have a significant effect. Is that true though? Why did you choose a quadratic specification? Does theory or the scatter plot suggest that it is a good idea? If not, go with the second regression.

My suggestion would be to try fixed effects/first differences with the quadratic terms if you have good reason for both. Otherwise, use the specification you think has the most merit.
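To make the combined specification concrete, here is a rough sketch of what first-differencing the quadratic model would look like. The firm index $i$, the time index $t$, the fixed effect $a_i$ and the error $u_{it}$ are my own notation, not something from your question, and I am assuming the omitted variables really are constant over time within each firm:

$$y_{it}=\beta_0+\beta_1x_{it}+\beta_2x_{it}^2+a_i+u_{it}$$

Subtracting the same equation at $t-1$ removes both $\beta_0$ and $a_i$:

$$\Delta y_{it}=\beta_1\Delta x_{it}+\beta_2\Delta(x_{it}^2)+\Delta u_{it}$$

where $\Delta(x_{it}^2)=x_{it}^2-x_{i,t-1}^2$. Two things follow under these assumptions. First, the quadratic regressor is the difference of the squares, which is generally not the same as the square of the difference $(\Delta x_{it})^2$. Second, the differences have to be taken within each firm, so that the last observation of one firm is never subtracted from the first observation of the next.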

Unfortunately, I am not familiar with R and cannot comment on the code.

BB King