3

I'm new to ARIMA modelling and am trying to use the XLSTAT app to model alcohol sales with an ARIMA(0,1,0) based on yearly data. These are the data points I'm using:

[image]

The Dickey-Fuller test shows that the data is stationary, and looking at the ACF and PACF I decided to use an ARIMA(0,1,0) to model the data.

[image]

When I modelled the data, however, the ARIMA seemed to predict values that were just the observed values from a year earlier. For example, the predicted value for 2018 is exactly the value observed in 2017, and the other predictions all seem to be based on the previous year's points.

[image]

I have tried this with a bigger dataset and it did the same thing again. Why is this happening, and how can I fix it so that the ARIMA is actually predicting a value rather than 'copying' the past data points? Thank you very much for your help. I'm doing this for my extended essay right now, so I would really appreciate any fast replies.

Richard Hardy
  • 67,272
Bill
  • 31

3 Answers

8

As frank writes, predicting last year's data point is exactly what an I(1) model is all about, so there is no reason for concern here.

ARIMA modeling is quite non-trivial, and the old Box-Jenkins method of reading entrails (oops, ACF/PACF plots) to decide on ARIMA orders has really been superseded by the more modern method of using information criteria. I would very strongly recommend using an established and trustworthy software package to model and forecast your series, like the fable package for R, or the somewhat older forecast package (which I personally prefer). You may want to take a look at the free online forecasting textbook Forecasting: Principles and Practice by Athanasopoulos & Hyndman, 2nd ed. using forecast or 3rd ed. using fable.

Here is a model using forecast:

alcohol <- ts(c(7.347967697,7.458414004,7.448101626,7.390174832,7.497957882),start=2013)
library(forecast)

model <- auto.arima(alcohol)
summary(model)
plot(forecast(model,h=3))

[Figure: forecast plot from the fitted ARIMA model]

As you see, forecast::auto.arima() fits an ARIMA(0,0,0) model with a nonzero mean, as Richard Hardy suggests: a flat line at the overall historical average. This is very often better than a more complex ARIMA model, and especially so for such a short time series as you have here.

More generally, if there are no detectable dynamics in your time series, a flat line forecast, whether the overall mean from an ARIMA(0,0,0)+c or the last observation from an I(1), is quite probably the best possible forecast. We have a number of previous threads asking about flat forecasts in ARIMA models.
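The two flat forecasts mentioned here can be compared directly. This is a minimal sketch, assuming the five values used earlier in this answer and base R's `arima()` and `predict()` from the stats package rather than the forecast package. The object names `fit_mean`, `fit_rw`, `fc_mean` and `fc_rw` are only illustrative.

```r
# Five yearly observations, as used earlier in this answer
alcohol <- ts(c(7.347967697, 7.458414004, 7.448101626, 7.390174832, 7.497957882),
              start = 2013)

# ARIMA(0,0,0) with a mean: every forecast is the estimated overall mean
fit_mean <- arima(alcohol, order = c(0, 0, 0), include.mean = TRUE)
fc_mean  <- predict(fit_mean, n.ahead = 3)$pred

# ARIMA(0,1,0), i.e. a random walk: every forecast repeats the last observation
fit_rw <- arima(alcohol, order = c(0, 1, 0))
fc_rw  <- predict(fit_rw, n.ahead = 3)$pred

print(fc_mean)  # flat line close to mean(alcohol)
print(fc_rw)    # flat line at the last observed value
```

Both forecasts are flat lines. They differ only in the level they settle on: the historical average or the most recent value.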

Incidentally, the ADF test in the tseries R package does not even want to return a result, probably simply because the series is so short:

> library(tseries)
> adf.test(alcohol)
    Augmented Dickey-Fuller Test

data:  alcohol
Dickey-Fuller = NaN, Lag order = 1, p-value = NA
alternative hypothesis: stationary

I would trust this implementation of the test more than many others.

Stephan Kolassa
  • 123,354
5

The ARIMA(0, 1, 0) model is just the I(1) model, which is: $$ y_{t+1} = y_t + \epsilon_{t+1}, $$ also called the random walk. As you can see, the expectation of $y_{t+1}$ is then just the previous value $y_t$, so the program is doing exactly what you have told it to do.
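To see why the forecast stays flat beyond one step as well, the recursion can be iterated. This is a short derivation, assuming the errors $\epsilon_t$ are independent with mean zero and common variance $\sigma^2$ (an assumption made here for illustration, not stated in the model above):

$$ y_{t+h} = y_t + \sum_{i=1}^{h} \epsilon_{t+i}, \qquad E[y_{t+h} \mid y_t, y_{t-1}, \dots] = y_t, \qquad \operatorname{Var}[y_{t+h} \mid y_t, y_{t-1}, \dots] = h\sigma^2. $$

So the point forecast at every horizon $h$ is the last observation, and only the uncertainty around it grows with $h$.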

But just continuing with the latest value is probably the best you can do here. Look at your data. You have only a few data points, so, not much to learn from, and those points don't give you much of a pattern.

frank
  • 10,797
4

"The Dickey-Fuller test shows that the data is stationary and looking at the ACF and PACF I decided to use an ARIMA(0,1,0) to model the data."

This is a strange decision, as your ADF test result implies absence of a unit root while your ARIMA(0,1,0) model implies the opposite. The ACF and PACF actually imply ARMA(0,0,0), i.e. $x_t=c+\varepsilon_t$ where $c$ is the mean of the time series $\{x_t\}$ and $\{\varepsilon_t\}$ is an uncorrelated zero-mean sequence of errors.
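One rough way to check for this kind of absence of autocorrelation is to compare the sample autocorrelations with the usual approximate 95% bounds of $\pm 1.96/\sqrt{n}$. This is a minimal sketch in base R. It assumes the five yearly values given in another answer on this page stand in for the asker's series, since the original data were posted only as an image. The names `r` and `bound` are only illustrative.

```r
# Five yearly observations, taken from another answer on this page (assumed data)
alcohol <- ts(c(7.347967697, 7.458414004, 7.448101626, 7.390174832, 7.497957882),
              start = 2013)
n <- length(alcohol)

# Sample autocorrelations at lags 1, 2, ...; the first element (lag 0) is dropped
r <- acf(alcohol, plot = FALSE)$acf[-1]

# Approximate 95% bounds under the hypothesis of uncorrelated noise
bound <- qnorm(0.975) / sqrt(n)

print(round(r, 3))
print(abs(r) < bound)  # TRUE where a lag shows no detectable autocorrelation
```

For these five values, every sample autocorrelation at lags 1 to 4 falls inside the bounds. With so few observations the bounds are very wide, though, so this is weak evidence at best.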

Richard Hardy
  • 67,272
  • Note that at least the R version of the ADF test actually tests for a unit root in the presence of a trend, which is of course not the same as an I(1), but it's not all that far away from it, either, so these statements do not seem all that contradictory to me. +1, in any case. – Stephan Kolassa Jul 23 '22 at 06:50
  • 1
    @StephanKolassa, thank you. Note also that there are tons of ADF tests implemented in different packages for R, so "the R version" is an unfortunate expression. – Richard Hardy Jul 23 '22 at 07:42
  • Yes, you are right, good point. I was specifically referring to the one in the tseries package and will take care to mind my words in the future. – Stephan Kolassa Jul 23 '22 at 07:43