An underlying idea in statistical learning is that you can learn by repeating an experiment. For example, we can keep flipping a thumbtack to learn the probability that a thumbtack lands on its head.
In the time-series context, we observe a single run of a stochastic process rather than repeated runs of the stochastic process. We observe one long experiment rather than many independent experiments.
We need stationarity and ergodicity so that observing a long run of a stochastic process is similar to observing many independent runs of a stochastic process.
Some (imprecise) definitions
Let $\Omega$ be a sample space. A stochastic process $\{Y_t\}$ is a function of both time $t \in \{1, 2, 3, \ldots\}$ and outcome $\omega \in \Omega$.
- For any time $t$, $Y_t$ is a random variable (i.e. a function from $\Omega$ to some space such as the space of real numbers).
- For any outcome $\omega$, the series $Y(\omega)$ is a time series of real numbers: $\{Y_1(\omega), Y_2(\omega), Y_3(\omega), \ldots\}$
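One way to make the two-argument view concrete is a small simulation. The sketch below (plain Python; the random walk is a hypothetical process chosen only for illustration, and the seed plays the role of the outcome $\omega$) shows that fixing $\omega$ gives one whole path, while fixing $t$ and varying $\omega$ gives the random variable $Y_t$:

```python
import random

# A stochastic process as a function of (outcome, time): drawing an outcome
# omega fixes an entire path {Y_1(omega), Y_2(omega), ...}, while fixing a
# time t and varying omega gives the random variable Y_t.
def path(omega_seed, T):
    rng = random.Random(omega_seed)  # the seed stands in for omega
    y, out = 0.0, []
    for _ in range(T):
        y += rng.gauss(0.0, 1.0)  # hypothetical random-walk dynamics
        out.append(y)
    return out

paths = {omega: path(omega, 5) for omega in range(3)}  # three outcomes
# Row omega is the time series Y(omega); column t is the random variable Y_t.
```

In this picture, classical statistics samples down a column (many $\omega$, one $t$), while time-series analysis observes a single row (one $\omega$, many $t$).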
A fundamental issue in time series
In Statistics 101, we're taught about a sequence of independent and identically distributed random variables $X_1, X_2, X_3, \ldots$. We observe multiple, identical experiments $i = 1, \ldots, n$, where an $\omega_i \in \Omega$ is randomly chosen for each, and this allows us to learn about the random variable $X$. By the Law of Large Numbers, $\frac{1}{n} \sum_{i=1}^n X_i$ converges almost surely to $\operatorname{E}[X]$.
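A quick sketch of the classical setting (plain Python; the fair coin and sample size are illustrative choices): averaging many independent draws recovers $\operatorname{E}[X]$.

```python
import random

random.seed(0)

# n independent, identically distributed coin flips: X_i = 1 with
# probability 1/2. The Law of Large Numbers says the sample mean of
# independent draws approaches E[X] = 0.5.
n = 100_000
flips = [1 if random.random() < 0.5 else 0 for _ in range(n)]
sample_mean = sum(flips) / n
print(sample_mean)  # close to 0.5
```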
A fundamental difference in the time-series setting is that we're observing multiple observations over time $t$ rather than multiple draws from $\Omega$.
In the general case, the sample mean of a stochastic process $\frac{1}{T} \sum_{t=1}^T Y_t$ may not converge to anything at all!
For multiple observations over time to accomplish a similar task as multiple draws from the sample space, we need stationarity and ergodicity.
If an unconditional mean $\operatorname{E}[Y]$ exists and the conditions for the ergodic theorem are satisfied, the time-series sample mean $\frac{1}{T}\sum_{t=1}^T Y_t$ will converge to the unconditional mean $\operatorname{E}[Y]$.
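To see the ergodic theorem at work, here is a sketch using a stationary, ergodic AR(1) process (an illustrative choice; the coefficient $\phi = 0.5$ and the Gaussian shocks are assumptions, not anything from the text above): a single long run's time average approaches the unconditional mean.

```python
import random

random.seed(1)

# One long run of a stationary, ergodic AR(1) process:
#   Y_t = phi * Y_{t-1} + e_t,  e_t ~ N(0, 1),  |phi| < 1
# Its unconditional mean E[Y] is 0, so the time-series sample mean
# of this single run should settle near 0 as T grows.
phi, T = 0.5, 100_000
y, total = 0.0, 0.0
for _ in range(T):
    y = phi * y + random.gauss(0.0, 1.0)
    total += y
time_avg = total / T
print(time_avg)  # close to E[Y] = 0
```

Contrast this with the two failure examples that follow, where the time average either diverges or converges to the wrong thing.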
Example 1: failure of stationarity
Let $\{Y_t\}$ be the degenerate process $Y_t = t$. We can see that $\{Y_t\}$ is not a stationary (the joint distribution is not time-invariant).
Let $S_t = \frac{1}{t} \sum_{i=1}^t Y_i$ be the time-series sample mean. Then $S_1 = 1$, $S_2 = \frac{3}{2}$, $S_3 = 2$, and in general $S_t = \frac{t+1}{2}$, which is unbounded as $t \rightarrow \infty$. A time-invariant mean of $Y_t$ doesn't exist, and $S_t$ doesn't converge to anything.
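The divergence is easy to check numerically; this short sketch computes $S_t$ for the degenerate process directly from its definition:

```python
# Degenerate process Y_t = t: the time-series sample mean
# S_t = (1/t) * sum_{i=1}^t i = (t+1)/2 grows without bound.
def S(t):
    return sum(range(1, t + 1)) / t

print([S(t) for t in (1, 2, 3, 10, 100)])  # [1.0, 1.5, 2.0, 5.5, 50.5]
```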
Example 2: failure of ergodicity
Let $X \in \{0, 1\}$ be the result of a single fair coin flip. Let $Y_t = X$ for all $t$; that is, either $\{Y_t\} = (0, 0, 0, 0, 0, 0, 0, \ldots)$ or $\{Y_t\} = (1, 1, 1, 1, 1, 1, 1, \ldots)$.
Even though $\operatorname{E}[Y_t] = \frac{1}{2}$, the time-series sample mean $S_t = \frac{1}{t} \sum_{i = 1}^t Y_i$ equals $X$ for every $t$, so within any single run it converges to $0$ or $1$, never to $\operatorname{E}[Y_t] = \frac{1}{2}$.
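This failure also shows up in simulation. The sketch below draws several independent runs of the process; within each run the time average is stuck at that run's value of $X$, no matter how long we observe:

```python
import random

random.seed(2)

# A single coin flip X fixes the entire path Y_t = X. Within one run,
# the time-series sample mean equals X for every t, so it can never
# approach the ensemble mean E[Y_t] = 1/2.
def run_sample_mean(T):
    x = 1 if random.random() < 0.5 else 0  # one flip per run
    return sum(x for _ in range(T)) / T    # = x, regardless of T

means = [run_sample_mean(10_000) for _ in range(8)]
print(means)  # each entry is exactly 0.0 or 1.0, never 0.5
```

Averaging *across* runs would recover $\frac{1}{2}$; averaging *within* a run never does. That is precisely the gap ergodicity is meant to close.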