Given that you have an estimated AR coefficient of 0.99, I doubt that your series is really stationary. Be that as it may, you can simplify your problem considerably by differencing first and then estimating the differenced model. auto.arima in R's forecast package may even do this for you!
library(forecast)

# Simulate an ARMA(1,4) process with the AR coefficient = 0.99
tmp <- arima.sim(n=50000, list(ar=0.99, ma=c(-0.3, 0.2, 0.1, -0.1)))
auto.arima(tmp, stepwise=FALSE, max.d=1, seasonal=FALSE)
Series: tmp
ARIMA(0,1,4)
Coefficients:
          ma1     ma2     ma3      ma4
      -0.3059  0.1896  0.1025  -0.1027
s.e.   0.0045  0.0046  0.0047   0.0045
sigma^2 = 1.018: log likelihood = -71378.48
Observe how, even with 50,000 observations, auto.arima selects a first difference rather than the true model, and how the AR term simply disappears. (Of course, this won't always happen, but whatever model is selected should be quite close to this one when converted to an MA-only model; the sketch after the equations below checks this via the implied MA weights.) Also note that the MA coefficients are fairly well estimated: no postprocessing is needed. You can do the same by hand: difference each series first, then estimate the model on the differenced data, concatenating the separate series and padding the joins with NA values. The following three equations show why this works, with $\phi(B)$ denoting the MA lag polynomial:
$$\begin{eqnarray}
(1-B)y_t &=& \phi(B)e_t \\
\mathrm{def:}\;\tilde{y}_t &=& (1-B)y_t \\
\tilde{y}_t &=& \phi(B)e_t
\end{eqnarray}$$
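As a concrete sketch of the by-hand route (reusing the simulated tmp from above; arima and ARMAtoMA are base R's stats functions, and ARMAtoMA just iterates the $\psi$-weight recursion, so the unit AR root below is fine):

# difference once, then fit a pure MA(4) to the differenced series
fit_diff <- arima(diff(tmp), order=c(0, 0, 4), include.mean=FALSE)
coef(fit_diff)  # should land close to the auto.arima estimates above
# implied MA(infinity) weights of the fitted model written in levels
# (ar=1 encodes the first difference) vs. those of the true ARMA(1,4)
round(ARMAtoMA(ar=1, ma=coef(fit_diff), lag.max=8), 3)
round(ARMAtoMA(ar=0.99, ma=c(-0.3, 0.2, 0.1, -0.1), lag.max=8), 3)

The two weight sequences should nearly coincide at short lags and drift apart only slowly, because 0.99 is so close to 1.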
In your case, you already have a preliminary estimate of the model, so you have a good idea of how much padding is required: for a pure MA(q) model, observations separated by more than q lags are independent, so q or more NAs between segments is enough (the example below uses 8 to be safe).
Here's an example:
# Simulate 1000 series of varying lengths, difference each one separately,
# and concatenate them with 8 NAs between segments so the MA terms
# cannot reach across series boundaries
series_length <- sample(10:500, size=1000, replace=TRUE)
x <- diff(arima.sim(n=series_length[1], list(ar=0.99, ma=c(-0.3, 0.2, 0.1, -0.1))))
for (i in 2:1000) {
  x <- c(x, rep(NA, 8), diff(arima.sim(n=series_length[i], list(ar=0.99, ma=c(-0.3, 0.2, 0.1, -0.1)))))
}
length(x)
[1] 263407
auto.arima(x, stepwise=FALSE, d=0, max.p=0, seasonal=FALSE)
Series: x
ARIMA(0,0,5) with non-zero mean
Coefficients:
          ma1     ma2     ma3      ma4      ma5    mean
      -0.3023  0.1896  0.0935  -0.1055  -0.0072  0.0010
s.e.   0.0020  0.0021  0.0021   0.0021   0.0020  0.0017
sigma^2 = 1.004: log likelihood = -362990.7
We didn't get the model structure quite right, but the parameter estimates are pretty close and the extra MA term is very small.
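In fact, the structure is less wrong than it looks. Differencing each series turns the true process into an ARMA(1,5), $(1-0.99B)\tilde{y}_t = (1-B)\phi(B)e_t$, whose implied MA weights beyond lag 5 are tiny, so an MA(5) is an excellent approximation. A quick check (the ma vector below is the hand-expanded product $(1-B)(1-0.3B+0.2B^2+0.1B^3-0.1B^4)$):

# MA polynomial of the differenced true process, (1-B)*phi(B), expanded by hand
ma_diff <- c(-1.3, 0.5, -0.1, -0.2, 0.1)
# implied MA(infinity) weights of the ARMA(1,5) that differencing produces
round(ARMAtoMA(ar=0.99, ma=ma_diff, lag.max=5), 3)
# approximately -0.310  0.193  0.091 -0.110 -0.009
# vs. the fitted ma1..ma5: -0.3023 0.1896 0.0935 -0.1055 -0.0072

These line up with the estimates above almost term for term.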
Even if the model structure appears quite different from the MA(4) you'd like to see, it's not likely to matter much for the actual forecasts. I cannot do better than to quote Richard Hardy's answer to "Lag selection and model instability for ARIMA-GARCH in rolling windows for forecasting":
> The fact that the selected models change frequently from one window to the next may be due not only to frequent structural changes (which is probably unlikely) but to the fact that there are several models that approximate the patterns in the data about equally well, so their AICs are very close. Then changing two data points out of 1000 (dropping the oldest point and adding one new point) can make auto.arima switch between these competing models. I would not worry too much about that, as each of these models likely implies very similar time series patterns. They are probably almost equivalent representations of the same thing.
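If you want to see the forecast point on the simulated data, compare the two specifications on the tmp series from the first example; a minimal sketch (Arima and forecast are from the forecast package):

fit_d <- Arima(tmp, order=c(0,1,4))  # the differenced, MA-only specification
fit_a <- Arima(tmp, order=c(1,0,4))  # the true ARMA(1,4) structure
# point forecasts from the two specifications, side by side
cbind(forecast(fit_d, h=10)$mean, forecast(fit_a, h=10)$mean)

At short horizons the two sets of point forecasts should be nearly identical; they separate only slowly at longer horizons, again because 0.99 is so close to 1.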