Why, exactly, is a unit root a problem?

Question

Suppose that we have observations $x_1,\dots,x_n$ for some process.

We want to fit an AR(k) model to these observations.

I do not understand why the naïve OLS approach to estimate our AR(k) coefficients would be inappropriate when the process has a unit root. From some simulations it seems to recover the coefficients no matter the location of the roots of the lag polynomial.
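A minimal sketch of the kind of simulation described above, not the code actually used for the question. The AR(2) coefficients, sample size, seeds, standard normal noise and zero starting values are all assumptions chosen for illustration, and `simulate_ar` and `ols_ar2` are hypothetical helper names.

```python
import random


def simulate_ar(coeffs, n, seed=0):
    """Simulate x_t = coeffs[0]*x_{t-1} + coeffs[1]*x_{t-2} + ... + e_t.

    Assumptions of this sketch: e_t is standard normal and the pre-sample
    values are zero.
    """
    rng = random.Random(seed)
    k = len(coeffs)
    x = [0.0] * k  # pre-sample values
    for _ in range(n):
        lagged = sum(c * x[-1 - j] for j, c in enumerate(coeffs))
        x.append(lagged + rng.gauss(0.0, 1.0))
    return x[k:]


def ols_ar2(x):
    """OLS of x_t on (x_{t-1}, x_{t-2}), no intercept, via the normal equations."""
    y, a, b = x[2:], x[1:-1], x[:-2]
    saa = sum(u * u for u in a)
    sbb = sum(v * v for v in b)
    sab = sum(u * v for u, v in zip(a, b))
    say = sum(u * w for u, w in zip(a, y))
    sby = sum(v * w for v, w in zip(b, y))
    det = saa * sbb - sab * sab
    return (say * sbb - sab * sby) / det, (saa * sby - sab * say) / det


# Stationary case: both roots of 1 - 0.5z - 0.3z^2 lie outside the unit circle.
stationary = ols_ar2(simulate_ar([0.5, 0.3], 5000, seed=1))
# Unit-root case: 1 - 1.5z + 0.5z^2 = 0.5(z - 1)(z - 2), so z = 1 is a root.
unit_root = ols_ar2(simulate_ar([1.5, -0.5], 5000, seed=2))
print("stationary, true (0.5, 0.3): ", stationary)
print("unit root,  true (1.5, -0.5):", unit_root)
```

In both cases the point estimates should land close to the true coefficients, which is the observation that motivates the question.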

If estimating the AR coefficients is not the problem, then what is? A unit root means that uncertainty around your forecast grows with time, but that's not a problem from a mathematical point of view; it is just something to keep in mind when interpreting the forecasted values.

Basically I'm asking which part of the mathematical analysis breaks down in the presence of a unit root?

Because a unit root will mess your inferences (tests, interval estimates) up badly. — Alexis, Mar 02 '23 at 16:01
Thanks. So the problem is not with fitting the model, but again with interpreting correctly the conclusion from various tests you do based on the model? — user2520938, Mar 02 '23 at 16:03
The problem is also with fitting the model: how does a single variance estimate reliably or validly estimate a function? How does a point estimate in a sample reliably and validly estimate quantities which are undefined (e.g., when one's sample from a process with unit root has a history which preceded one's first measurement: that leaves the population mean of the process undefined, even though you can generate a cromulent sample mean)? — Alexis, Mar 02 '23 at 16:07
Assuming the underlying process is an AR(k) process, you can fit the coefficients and, to the extent possible, give meaningful answers to these questions though. For example, fitting the model one sees a unit root, concludes no process mean exists, but can still say how the process mean evolves over time, in a mathematically precise and rigorous way. — user2520938, Mar 02 '23 at 16:11
For example, suppose I observe the AR(1) process $X_t = 2X_{t-1}+noise$. I can easily deduce the '2' from observations by OLS. I can then say that with each observation, the expected value doubles. — user2520938, Mar 02 '23 at 16:13
I suppose I may not be understanding your question. (I am pre-caffeinated, so unsurprising. :D ). Can you say a little more about what you are after in an answer? Aside: unit root does not simply mean "uncertainty grows as time passes", it means any perturbation at time $t$ has an equal sized effect on the time series at all future points in time… memory is infinite with unit root. A root > 1 (as in the example you provide) indicates that memory is not merely infinite, but that a perturbation at time $t$ has a much larger effect on the time series the farther into the future you go. — Alexis, Mar 02 '23 at 16:18
My understanding is: unit root is not a problem. If it presents, it just requires statistical models/tools that are essentially different from traditional ARMA models that are designed for stationary time series. — Zhanxiong, Mar 02 '23 at 16:19
@Zhanxiong I buy that! I also really like user2520938's "…can still say how the process mean evolves over time…" comment. — Alexis, Mar 02 '23 at 16:23
@Alexis I'm trying to understand why one would assume an AR(k) model for some k, and then do a unit root test? If we assume an AR(k) process, why don't we just fit the model, and if there happens to be a unit root we must just keep that in mind and be careful about the inference we do based on the model (for reasons as in your first comment). But the AR(k) is perfectly usable and well-defined also if there is a unit root. — user2520938, Mar 02 '23 at 16:23
@user2520938 To make an analogy, you can always fit an ordinary regression model using least squares to a binary dataset instead of using the more sensible logistic regression model; there is no mathematical issue with that -- just like if you try to fit an AR($k$) model to a non-stationary time series (in essence, "unit root" non-stationarity is a type of non-stationarity) without any discretion. However, this is obviously not a useful model for either explanation or forecasting. "Usable and well-defined" does not automatically mean the model is "good" or "appropriate". — Zhanxiong, Mar 02 '23 at 16:32
@Zhanxiong When you observe a random walk and then try to fit an AR(1) model to it, you'll find a model with a unit root. That seems totally 'good' and 'appropriate' to me, as long as one interprets the outcome of the model correctly? I still do not understand why you say 'If it presents, it just requires statistical models/tools that are essentially different from traditional ARMA models that are designed for stationary time series'. ARMA models with unit roots seem perfectly fine to me. — user2520938, Mar 02 '23 at 16:36
@user2520938 If you observe a random walk, then just model it with random walk -- there is no point to fit an AR(1) for it. Yes, if viewing ARMA models from a pure mathematical perspective, it can encompass non-stationary cases, however, those cases are not useful from statistical inference perspective. A much better strategy is to treat unit-root non-stationary more explicitly. You keep saying "as long as one interprets the outcome of model correctly?", but how? and why bother interpreting a model that is not useful? — Zhanxiong, Mar 02 '23 at 16:44
@Zhanxiong I just do not understand why 'however, those cases are not useful from statistical inference perspective'? Can you give an example to make clear why these models are not useful? — user2520938, Mar 02 '23 at 16:48
To complement my comment you quoted, a quote from A Primer on Unit Root Testing by Phillips and Xiao may be useful: ... Any attempt to explain or forecast series of this type requires that a mechanism be introduced to capture the nonstationary elements in the series, or that the series be transformed in some way to achieve stationarity. — Zhanxiong, Mar 02 '23 at 16:48
@Zhanxiong Again, I still do not understand why that quote is true. Everybody just repeats the unit root models aren't useful/good/nice/whatever, but no one ever explains why. — user2520938, Mar 02 '23 at 16:50
I am pretty sure this question is a duplicate of one or more earlier threads. It may be useful to locate them and familiarize yourself with the answers there. — Richard Hardy, Mar 02 '23 at 17:07
@RichardHardy I've looked at them but found no convincing answers. — user2520938, Mar 02 '23 at 17:08
Here is one specific one - the usual t-test does not follow a normal/t-distribution even asymptotically if the process has a unit root: https://stats.stackexchange.com/questions/326741/why-is-the-dickey-fuller-test-different-from-a-simple-t-test Whether you call that a "problem" surely is a matter of taste. — Christoph Hanck, Mar 03 '23 at 07:35
Here is another one (more of a "problem", I'd argue): inclusion of other regressors (deterministics) changes the asymptotic null distribution, and it is often not clear in practice what to include: https://stats.stackexchange.com/questions/213551/how-is-the-augmented-dickey-fuller-test-adf-table-of-critical-values-calculate/213589#213589 — Christoph Hanck, Mar 03 '23 at 07:37
Here is a somewhat less relevant one, I would say - unit roots in MA components leading to ill-defined long-run variances, which are often needed for standard errors: https://stats.stackexchange.com/questions/65716/what-is-the-implication-of-unit-root-of-ma/167427#167427 — Christoph Hanck, Mar 03 '23 at 07:37
Arguably the most serious issue (among the ones I list, at least) is that when you investigate relationships between multiple unit root variables, you run into potential spurious regression, i.e. plims of regression coefficients that do not tend to zero and diverging t-ratios even if there is no relationship. https://stats.stackexchange.com/questions/188218/cointegration-and-correlation/188255#188255 Cointegration then becomes a relevant concept to make such studies meaningful. — Christoph Hanck, Mar 03 '23 at 07:45
Btw, as far as point estimation rather than testing is concerned, it is indeed correct that unit roots do not cause problems - to the contrary, we even have superconsistency: https://stats.stackexchange.com/questions/145864/estimation-of-unit-root-ar1-model-with-ols/145877#145877 — Christoph Hanck, Mar 03 '23 at 12:14
@ChristophHanck So is it fair to say that the situation with regards to estimating the parameter is that: we do get a superconsistent estimator using the usual OLS formula, but assuming the null hypothesis, the standard test statistic used with OLS does not follow the expected t distribution, so one has to be careful when assessing significance of the estimate? — user2520938, Mar 03 '23 at 13:14
Yes, as far as autoregressions are concerned that is the case. Things look different for OLS when regressing unrelated unit root variables onto each other, see my 2nd to last comment — Christoph Hanck, Mar 03 '23 at 13:26
@ChristophHanck Thanks a lot! Especially also for the detailed formulas in your other answers. I did some simulations and indeed see that the distribution of the test statistic is very different for a unit root AR(1) process. This makes everything much clearer. — user2520938, Mar 03 '23 at 13:54
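The point settled in the comments above, that the usual t-statistic stops behaving like a normal/t variable under a unit root, can be checked with a short simulation. This is only a sketch under stated assumptions: an AR(1) without intercept, standard normal noise, $y_0 = 0$, and arbitrarily chosen sample size and replication count; `ar1_t_stat` and `summarize` are hypothetical helper names. For a standard normal reference, the share of negative values would be 0.5 and the 5% quantile about -1.645. Under a unit root the limiting share of negative values is $P(\chi^2_1 < 1) \approx 0.68$, because the sign of $\hat\rho - 1$ is the sign of $w(1)^2 - 1$.

```python
import math
import random


def ar1_t_stat(rho, n, rng):
    """t-statistic for H0: coefficient == rho, from OLS of y_t on y_{t-1}
    (no intercept), with the data generated under H0 itself."""
    y = [0.0]
    for _ in range(n):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    lag, cur = y[:-1], y[1:]
    sxx = sum(v * v for v in lag)
    rho_hat = sum(u * v for u, v in zip(lag, cur)) / sxx
    rss = sum((c - rho_hat * l) ** 2 for l, c in zip(lag, cur))
    se = math.sqrt(rss / (n - 1) / sxx)
    return (rho_hat - rho) / se


def summarize(rho, n=200, reps=2000, seed=0):
    """Share of negative t-statistics and the empirical 5% quantile."""
    rng = random.Random(seed)
    t = sorted(ar1_t_stat(rho, n, rng) for _ in range(reps))
    share_negative = sum(v < 0 for v in t) / reps
    return share_negative, t[int(0.05 * reps)]


stationary = summarize(0.5)
unit_root = summarize(1.0)
print("rho = 0.5: share negative, 5% quantile:", stationary)
print("rho = 1.0: share negative, 5% quantile:", unit_root)
```

The stationary case should look roughly symmetric around zero, while the unit-root case should be visibly shifted to the left, which is why Dickey-Fuller critical values differ from the normal ones.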

Durden · Answer 1 · 2023-03-04T18:12:23.113

Adapting this from Bauwens & Lubrano (1999), the part of the statistical procedure that "breaks down" in the presence of unit roots is asymptotic normality of the (OLS) estimator. For a model as simple as $$ y_{t} = \rho y_{t-1} + \epsilon_t $$ the asymptotic distribution of $\hat{\rho}_{OLS}$ is $\sqrt{T}(\hat{\rho}_{OLS}-\rho) \to N(0,1-\rho^2)$ if $|\rho| <1$, but $$ T(\hat{\rho}_{OLS}-\rho) \to \frac{1}{2} \frac{w(1)^{2} - 1}{\int_{0}^{1} w(r)^{2} \mathrm{d}r} \quad \quad \text{if } \rho =1,$$ where $w(\cdot)$ is a Wiener process and both limits are in distribution. So in the presence of a unit root, the OLS estimator converges faster, at rate $T$ instead of $\sqrt{T}$ (i.e., it is superconsistent), but the limit of the scaled estimation error is a non-normal functional of a Wiener process rather than a normal random variable. As a practical matter, any hypothesis test involving $\rho$ will require special tables.
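The two convergence rates can be seen numerically by comparing the typical estimation error at two sample sizes. The sketch below is an illustration under assumptions not in the answer: standard normal noise, $y_0 = 0$, sample sizes 100 and 400, and the median absolute error as the measure of spread; `ols_ar1`, `median_abs_error` and `error_ratio` are hypothetical helper names. Quadrupling $T$ should roughly halve the error under the $\sqrt{T}$ rate and roughly quarter it under the $T$ rate.

```python
import random
import statistics


def ols_ar1(rho, n, rng):
    """OLS slope from regressing y_t on y_{t-1} (no intercept), y_0 = 0."""
    prev, sxy, sxx = 0.0, 0.0, 0.0
    for _ in range(n):
        cur = rho * prev + rng.gauss(0.0, 1.0)
        sxy += prev * cur
        sxx += prev * prev
        prev = cur
    return sxy / sxx


def median_abs_error(rho, n, reps=2000, seed=0):
    """Median of |rho_hat - rho| over independent simulated samples."""
    rng = random.Random(seed)
    return statistics.median(abs(ols_ar1(rho, n, rng) - rho) for _ in range(reps))


def error_ratio(rho):
    """Median |error| at T = 400 divided by median |error| at T = 100."""
    return median_abs_error(rho, 400, seed=1) / median_abs_error(rho, 100, seed=2)


ratio_stationary = error_ratio(0.5)  # sqrt(T) rate suggests roughly 1/2
ratio_unit_root = error_ratio(1.0)   # T rate suggests roughly 1/4
print("error ratio, rho = 0.5:", ratio_stationary)
print("error ratio, rho = 1.0:", ratio_unit_root)
```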

thanks also for the reference to that book, looks interesting — user2520938, Mar 04 '23 at 08:17

score -2 · Answer 2 · answered Mar 02 '23 at 16:26

The process is very different in the presence of a unit root. Many processes increase or decrease exponentially. But if there is a unit root, the process is a random walk, which is different.

chrishmorris

But I'm looking for a concrete example of why it matters? A specific example of something that breaks/does not work, from a mathematical point of view, in the presence of a unit root. – user2520938 Mar 02 '23 at 16:30
E.g. regression questions give misleading answers, but that's not a problem from a mathematical pov, just from an interpretation pov. Another way of phrasing my question would be: why do time series textbooks usually assume stationarity throughout? It seems that a lot of the analysis of AR models in most books works in the presence of unit roots without problem. – user2520938 Mar 02 '23 at 16:31
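One concrete instance of regressions giving misleading answers is the spurious regression mentioned in the comments on the question: regressing one random walk on another, independent one. The following is a rough sketch rather than anything taken from the thread. It assumes standard normal shocks, 100 observations, 1000 replications and a nominal 5% two-sided test with critical value 1.96; `slope_t_stat`, `series` and `rejection_rate` are hypothetical helper names.

```python
import math
import random


def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on x with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = sxy / sxx
    rss = sum((v - my - slope * (u - mx)) ** 2 for u, v in zip(x, y))
    return slope / math.sqrt(rss / (n - 2) / sxx)


def series(n, rng, integrate):
    """White noise, or its cumulative sum (a random walk) if integrate is True."""
    out, level = [], 0.0
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)
        level = level + shock if integrate else shock
        out.append(level)
    return out


def rejection_rate(integrate, n=100, reps=1000, seed=0):
    """Share of |t| > 1.96 when x and y are generated independently."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = series(n, rng, integrate)
        y = series(n, rng, integrate)
        hits += abs(slope_t_stat(x, y)) > 1.96
    return hits / reps


rate_noise = rejection_rate(integrate=False)  # independent white noise
rate_walks = rejection_rate(integrate=True)   # independent random walks
print("rejection rate, white noise: ", rate_noise)
print("rejection rate, random walks:", rate_walks)
```

With independent white noise the test should reject near its nominal level, whereas with independent random walks it should reject in a large majority of samples even though the series are unrelated.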
