
Suppose that the time series data $(y_1, y_2, \ldots, y_n)$ can be explained through a regression model with $k$ explanatory variables:

(1) $y_t = b_0 + b_1 x_{1t} + b_2 x_{2t} + \cdots + b_k x_{kt} + \epsilon_t,\quad t=1,2,\ldots,n$

where $(\epsilon_1, \epsilon_2, \ldots, \epsilon_n) \sim N(0,\ \Sigma)$. When serial correlation exists in the residual time series $\epsilon$, we can estimate model (1) by generalized least squares. For simplicity, let us assume the correlation structure of the residuals $\epsilon$ is AR(1).
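To make the setup concrete, here is a minimal sketch of feasible GLS under an AR(1) error structure, using statsmodels' `GLSAR` on simulated data (the regressors, coefficient values, and $\phi = 0.7$ are illustrative assumptions, not part of the question):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200

# Two illustrative regressors and AR(1) errors with phi = 0.7.
x = rng.normal(size=(n, 2))
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.7 * eps[t - 1] + rng.normal()
y = 1.0 + x @ np.array([2.0, -1.0]) + eps

X = sm.add_constant(x)                    # intercept b0 plus k = 2 regressors
model = sm.GLSAR(y, X, rho=1)             # rho=1 requests an AR(1) error structure
result = model.iterative_fit(maxiter=10)  # alternate between estimating rho and the b's
print(result.params)                      # GLS estimates of b0, b1, b2
print(model.rho)                          # estimated AR(1) coefficient of the errors
```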

Occasionally I see in the literature that, for the same kind of data, some people model the AR(1) serial correlation with a different model:

(2) $y_t = \phi y_{t-1} + b_1 x_{1t} + b_2 x_{2t} + \cdots + b_k x_{kt} + \epsilon_t,\quad t=1,2,\ldots,n$

with the assumption of white noise for the residuals: $(\epsilon_1, \epsilon_2, \ldots, \epsilon_n) \sim N(0,\ \sigma^2 I)$.
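Continuing the simulated data from the sketch above, model (2) would then be fit by plain OLS with a lagged $y$ among the regressors (a minimal sketch; note that model (2), as written, has no intercept $b_0$, so none is included):

```python
# Model (2): regress y_t on y_{t-1} and the x's by OLS, assuming white-noise
# errors. The first observation is dropped to form the lag.
y_lag = y[:-1]                        # y_{t-1}
X2 = np.column_stack([y_lag, x[1:]])  # [y_{t-1}, x_{1t}, x_{2t}]; no intercept
result2 = sm.OLS(y[1:], X2).fit()
print(result2.params)                 # first entry is the estimate of phi
```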

I suppose that the underlying assumptions would be different for model (2). Here are my questions:

1) What exactly is the difference in terms of assumptions between the two models?

2) How can one justify adopting one model over the other?

3) What is the impact of different choices on statistical inferences about those explanatory variables $x_i$?

4) Any literature that discusses the choice between the two models?

bluepole
  • I think all your questions are answered in these lecture notes: https://www.reed.edu/economics/parker/312/tschapters/S13_Ch_2.pdf. Read it, and come back with the questions that are still not answered or not clear to you. – Aksakal Oct 07 '19 at 19:20
  • If $\epsilon$ is AR(1), then I think the variance matrix would not be tridiagonal: the autocorrelation of AR(1) does not die off after lag 1 (though the partial autocorrelation does). – Richard Hardy Oct 07 '19 at 19:30
  • @RichardHardy Thanks! Corrected. – bluepole Oct 07 '19 at 19:34
  • @Aksakal: after quickly reading the lecture notes you provided, I am not sure they talk about model (2) at all. – Alexandre C-L Oct 08 '19 at 15:07
  • @bluepole Maybe you could provide examples of different studies, some that apply model (1) for a given variable and some others that apply model (2) for the same variable? – Alexandre C-L Oct 08 '19 at 15:13
  • @bluepole, your model (2) is nothing more than an application of regression to time series; see section 2.2.2 in the notes, particularly assumption TS-5. – Aksakal Oct 08 '19 at 15:18
  • @Aksakal Thanks for linking the notes! The material has helped me understand the underlying assumptions for model (1). I suspect that model (2) might have been adopted as an alternative or approximation to model (1), but I agree with Alexandre Cazenave-Lacroutz that my questions in the original post regarding the differences between the two models remain. In other words, is model (2) legitimate or valid as an alternative or approximation to model (1)? – bluepole Oct 08 '19 at 16:07

1 Answer


Both models can be used for processes with serially correlated errors, and they will produce the same coefficients under the conditions that you outlined. However, in the presence of autocorrelation in the errors, model (2) will not have correct variances for the coefficients: you may end up overestimating the statistical significance of the coefficients. What do you do about this?

First of all, this is not always an issue. In forecasting applications it may not matter at all: who cares what the variance of my coefficient is if the model produces good forecasts?

When this does matter, you may take many different paths. For instance, in the lecture notes take a look at the Newey-West HAC estimator: it corrects for autocorrelation in the residuals and estimates a better covariance matrix for the coefficients. GLS is another approach, as are ARIMA models and exponential smoothing, and the list goes on.
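For concreteness, here is a minimal, self-contained sketch of the Newey-West route: fit plain OLS on simulated data with AR(1) errors, then compare the classical and HAC standard errors (the simulated data and the `maxlags` choice are illustrative assumptions, not prescriptions):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200

# Simulated regressors and AR(1) errors with phi = 0.7.
x = rng.normal(size=(n, 2))
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.7 * eps[t - 1] + rng.normal()
y = 1.0 + x @ np.array([2.0, -1.0]) + eps

X = sm.add_constant(x)
naive = sm.OLS(y, X).fit()                                       # classical OLS covariance
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})  # Newey-West correction
print(naive.bse)  # typically too small under positive autocorrelation
print(hac.bse)    # autocorrelation-robust standard errors
```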

There's no universal algorithm to choose which approach is best for your problem.

Aksakal
  • I really appreciate your answer! You're addressing my original question 3). For the remaining questions: 1) does model (2) assume endogeneity and white noise (instead of exogeneity and AR noise)? 2) Would model (1) be more appropriate than (2) under the assumption of exogeneity and AR noise? 4) Any literature showing the under- or over-estimation of uncertainty in model (2), as alluded to in your answer? – bluepole Oct 08 '19 at 17:50
  • The way you formulated (2), it's not just exogenous errors, it's strictly exogenous errors, and unconditionally too. In time series the exogeneity is usually not stated so strongly, as the lecture notes show; see the weakly exogenous discussion. I think that you need to spend more time reading the notes, because you keep bringing up questions that are answered there. The literature is given in the notes too, e.g. Greene's text. – Aksakal Oct 08 '19 at 18:29
  • The only place that might be related to model (2) seems to be the "quasi-differencing filter" discussed in section 2.3.3. Unless I have missed something, that filter is applied to both the time series $(y_1, y_2, \ldots, y_n)$ and the $x$ regressors, which is still different from model (2) here. – bluepole Oct 08 '19 at 19:14
  • Also, if I understand it correctly, the Breusch-Godfrey Lagrange multiplier test and Box-Ljung “portmanteau” (or Q) test in section 2.3.2 (where Greene's text is cited) are presented in the notes as approaches to testing the autocorrelation of the residuals for model (1), not directly for model (2). – bluepole Oct 08 '19 at 19:20
  • @bluepole Model (2) is just OLS; it's not even adapted to time series. Section 2.3.3 shows you how to adapt the OLS assumptions to fit time series. The matter is that the usual cross-sectional assumptions of OLS (aka Gauss-Markov) are simply unrealistic in a time-series setup. Section 2.3.3 shows that even when you relax these assumptions to make them more realistic (sensible), OLS is still a useful tool for time series. In other words, it shows that in many cases you can still use OLS in time series; some people needed permission to do it :) – Aksakal Oct 08 '19 at 19:39