Suppose $\{X_{t}\}$ is a non-stationary process. The goal is to estimate the following AR(1) model:

$$X_{t}=\alpha +\beta X_{t-1}+\epsilon_t.$$

From classical time series analysis, we know that estimating a model with non-stationary time series can yield spurious results (unless the dependent and independent variables are cointegrated). On the other hand, we also know that the $X_{t}$ series is stationary if $|\beta|<1$, and non-stationary otherwise.

Question: Given the above, is it legitimate to still estimate an AR(1) model for $\{X_{t}\}$ and hope that the estimated $\beta$ will reach or exceed 1 in absolute value, and therefore indicate that the series is not stationary? Or is it not legitimate to estimate the model, so that first differencing is still needed?

EDIT: The motivation for this question is the following. I have thousands of univariate time series, and the data generating process (DGP) for each is unknown. For each of them I want to fit an AR(1) model to estimate the magnitude of $\beta$. I also want to determine whether the series is stationary or not (or at least get some hint about it), as it is not realistic to manually conduct a formal ADF test for each series (since specifying the test equation requires some manual work; for example, AIC is not always the best way to select the lag order of the equation).

Sane
  • 261
  • Why fit an improper model in the first place? Why not fit a different model which might actually fit the data's covariance structure? – Spätzle Mar 12 '24 at 13:52
  • One reason for this is to capture whether the model is improper or not through the $\beta$ coefficient. – Sane Mar 12 '24 at 14:06
  • You start by stating that the process isn't stationary. If this is the case, AR(1) is improper even if you get $|\beta|<1$. It's like driving through a puddle with your car and then declaring it to be an amphibious vehicle. – Spätzle Mar 12 '24 at 14:32
  • @Spätzle, the model seems perfectly fine regardless of the value of $\beta$; it can represent a valid data generating process. The problem is with estimation of the model's parameters. Sane, when you say nonstationary, what do you have in mind? That $X_t$ contains a unit root or something else? What is the data generating process for $X_t$? – Richard Hardy Mar 12 '24 at 14:55
  • @RichardHardy I added an edit to address your question and give the motivation for the question. – Sane Mar 12 '24 at 15:14
  • As the post was originally written, I understood it to mean that the data is certainly not stationary, hence my responses. If you don't know whether or not the data is stationary, then yes, trying to fit an AR(1) and conducting a hypothesis test (as suggested in the answer) is a good idea and I take my comments back. – Spätzle Mar 12 '24 at 17:20
  • Sounds good, thank you! – Sane Mar 13 '24 at 14:24

1 Answer

That will depend a lot on what you want to do with the exercise.

If you want to estimate $\beta$, that will work, and in fact quite well ("superconsistency", cf. Estimation of unit-root AR(1) model with OLS). In particular, the link shows that when the true $\beta=1$ the OLS estimator satisfies $$ \hat\beta-1=O_p(T^{-1}), $$ so the estimation error shrinks at rate $T^{-1}$ rather than at the $O_p(T^{-1/2})$ rate of stationary environments.
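The rate statement can be checked with a small simulation. The following is only a minimal sketch, assuming a pure Gaussian random walk as the DGP (so $\alpha=0$, $\beta=1$); the function names `simulate_random_walk`, `ols_ar1` and `mean_abs_error` are made up for illustration. If the error is $O_p(T^{-1})$, the product $T\cdot|\hat\beta-1|$ should stay roughly stable as $T$ grows.

```python
import random
import statistics


def simulate_random_walk(T, rng):
    """Return X_0, ..., X_T for a random walk (assumed DGP: alpha = 0, beta = 1)."""
    x = [0.0]
    for _ in range(T):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
    return x


def ols_ar1(x):
    """OLS of X_t on a constant and X_{t-1}; returns (alpha_hat, beta_hat)."""
    y, lag = x[1:], x[:-1]
    my, ml = statistics.fmean(y), statistics.fmean(lag)
    sxx = sum((l - ml) ** 2 for l in lag)
    sxy = sum((l - ml) * (v - my) for l, v in zip(lag, y))
    beta_hat = sxy / sxx
    return my - beta_hat * ml, beta_hat


def mean_abs_error(T, n_rep=500, seed=0):
    """Monte Carlo average of |beta_hat - 1| over n_rep simulated random walks."""
    rng = random.Random(seed)
    errors = [abs(ols_ar1(simulate_random_walk(T, rng))[1] - 1.0)
              for _ in range(n_rep)]
    return statistics.fmean(errors)


if __name__ == "__main__":
    for T in (50, 100, 200, 400):
        mae = mean_abs_error(T)
        # Under superconsistency the last column should not grow with T.
        print(f"T={T:4d}  mean|beta_hat-1|={mae:.4f}  T*error={T * mae:.2f}")
```

Doubling $T$ should roughly halve the average error here, whereas under the usual $T^{-1/2}$ rate it would only fall by a factor of about $1.4$.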

If you want to test hypotheses about $\beta$, you get nonstandard ("Dickey-Fuller") distributions, again see the link for an example. This would then be a more principled approach to infer if $\beta=1$ than to hope that the estimated $\beta$ exceeds one.

In fact, it is known that the estimator $\hat\beta$ is biased downwards, implying that even estimated values less than one are compatible with a true $\beta=1$. See, e.g., How is the augmented Dickey–Fuller test (ADF) table of critical values calculated? for a plot of the t-ratio Dickey-Fuller distribution.
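To illustrate the last two points, here is a rough simulation sketch, again assuming a Gaussian random walk as the true process and using the plain (non-augmented) Dickey-Fuller regression with a constant. The helper names are hypothetical, and $-2.86$ is the usual asymptotic 5% critical value for the constant-only case. It records how often $\hat\beta$ falls below one even though the true $\beta=1$, and how often the t-ratio rejects the unit root.

```python
import math
import random


def df_t_ratio(x):
    """Return (beta_hat, t-ratio for H0: beta = 1) from OLS of X_t on 1 and X_{t-1}."""
    y, lag = x[1:], x[:-1]
    n = len(y)
    my, ml = sum(y) / n, sum(lag) / n
    sxx = sum((l - ml) ** 2 for l in lag)
    beta_hat = sum((l - ml) * (v - my) for l, v in zip(lag, y)) / sxx
    alpha_hat = my - beta_hat * ml
    rss = sum((v - alpha_hat - beta_hat * l) ** 2 for l, v in zip(lag, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)  # usual OLS standard error
    return beta_hat, (beta_hat - 1.0) / se_beta


def random_walk(T, rng):
    """Simulate X_0, ..., X_T with beta = 1 and standard normal shocks (assumed DGP)."""
    x = [0.0]
    for _ in range(T):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
    return x


def summarize(T=100, n_rep=2000, seed=1, crit=-2.86):
    """Share of beta_hat < 1, mean beta_hat, and rejection rate at `crit`."""
    rng = random.Random(seed)
    results = [df_t_ratio(random_walk(T, rng)) for _ in range(n_rep)]
    share_below_one = sum(b < 1.0 for b, _ in results) / n_rep
    mean_beta_hat = sum(b for b, _ in results) / n_rep
    reject_rate = sum(t < crit for _, t in results) / n_rep
    return share_below_one, mean_beta_hat, reject_rate


if __name__ == "__main__":
    share, mean_beta_hat, reject = summarize()
    print(f"share of beta_hat < 1: {share:.3f}")
    print(f"mean beta_hat:         {mean_beta_hat:.3f}")
    print(f"rejection rate:        {reject:.3f}")
```

In such a run most estimates land below one although every simulated series has a unit root, which is why reading stationarity off the sign of $\hat\beta-1$ is unreliable, while the test based on the Dickey-Fuller critical value keeps the rejection rate close to the nominal level.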

Richard Hardy
  • 67,272
  • Thanks a lot for the answer. I provided motivation of my question above. – Sane Mar 12 '24 at 15:15
  • OK, thanks for the additional info. If you put your analysis in a loop and record the result of something like urca::ur.df (or tseries::adf.test) in R, doing a unit root test for each series is not really additional work relative to estimating each $\beta$ (and while AIC may not be "optimal", it surely seems at least as principled as looking at the estimated $\hat\beta$s). – Christoph Hanck Mar 12 '24 at 16:23
  • Many thanks! I will combine the estimated values of $\beta$ with the results of the ADF test (as you suggested, this gives more information and is a legitimate approach). By the way, as you mentioned, the estimated value of $\beta$ is biased downwards. Is there any threshold above which we can assume the series is not stationary? For instance, 0.7 most likely implies stationarity, but do 0.9 and 0.95 imply it as well? In other words, is the bias measurable? – Sane Mar 13 '24 at 14:23
  • Such rules of thumb are, in my opinion, difficult to justify as the (approximate) bias is a function of both sample size and the true parameter, see e.g. https://www.tandfonline.com/doi/abs/10.1080/03610910802645354. In particular, $E(\hat\beta)\approx \beta-(1+3\beta)/(n-1)$ – Christoph Hanck Mar 13 '24 at 15:33
  • Thank you! That paper considers, in particular, Kendall's bias correction, which is given by $\hat{\beta}^{K}=\frac{n-1}{n-4}\hat{\beta}+\frac{1}{n-4}$. This does not depend on the true value of $\beta$, and therefore can be easily computed. Do you think it is common in practice to obtain $\hat{\beta}$ and then correct it for bias by using $\hat{\beta}^{K}$? – Sane Mar 14 '24 at 05:56
  • I do not recall that correction having been used in applied work - I could not quite say why as it is, as you say, easy to compute. – Christoph Hanck Mar 14 '24 at 09:45
  • I see, thanks! Is any bias correction common in the literature? If so, which one? – Sane Mar 14 '24 at 09:47
  • None that I am aware of. Afaics, the literature often seems happy to rely on asymptotic approximations. Here, the estimator is, while biased, consistent, so "no need" to correct anything according to that asymptotic result. – Christoph Hanck Mar 14 '24 at 09:55
  • I see, thanks a lot! – Sane Mar 14 '24 at 10:22
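
The approximate bias formula and Kendall's correction quoted in the comments above are easy to put into code. This is just a sketch of those two expressions as written in the thread; the function names `approx_bias` and `kendall_corrected` are invented here, and the numbers in the usage part are illustrative, not taken from any data. Solving $E(\hat\beta)\approx\beta-(1+3\beta)/(n-1)$ for $\beta$ gives exactly the correction $\frac{n-1}{n-4}\hat\beta+\frac{1}{n-4}$, which the last lines verify numerically.

```python
def approx_bias(beta, n):
    """Approximate bias E(beta_hat) - beta from the thread: -(1 + 3*beta) / (n - 1)."""
    return -(1.0 + 3.0 * beta) / (n - 1)


def kendall_corrected(beta_hat, n):
    """Kendall's bias-corrected estimate: ((n - 1) * beta_hat + 1) / (n - 4)."""
    if n <= 4:
        raise ValueError("the correction needs n > 4")
    return ((n - 1) * beta_hat + 1.0) / (n - 4)


if __name__ == "__main__":
    # Illustrative values only: an estimate of 0.95 from n = 100 observations.
    print(round(kendall_corrected(0.95, 100), 4))  # 0.9901

    # Consistency check: correcting the approximate expected value of the
    # estimator recovers the true beta.
    for beta in (0.5, 0.9, 1.0):
        expected_beta_hat = beta + approx_bias(beta, 50)
        print(beta, round(kendall_corrected(expected_beta_hat, 50), 10))
```

Because the size of the adjustment depends on $n$, the same $\hat\beta=0.95$ maps to different corrected values in short and long samples, which is in line with the remark that fixed thresholds such as 0.9 or 0.95 are hard to justify.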