I am trying to fit and forecast water production in a well for my end-of-training thesis, but I got poor predictions from ARIMA and SARIMA models. I tried auto ARIMA, but the results were no better. I am now adjusting the SARIMA parameters by hand to obtain better results, but it's very tedious.
-
Hi Grace. Welcome to Cross Validated. Could you please provide your data so that the community might get an insight on that? – User1865345 Sep 24 '22 at 06:19
-
The predicted mean looks pretty close to a flat line, but the data doesn't have an obvious trend either. Have you tried comparing its loss to a naive forecaster? – Galen Sep 24 '22 at 06:52
-
Hello, thanks. I am having difficulty adding the data. – Grace Sep 25 '22 at 00:41
-
@Galen no, I will try that – Grace Sep 25 '22 at 01:33
-
Please help me add the data. – Grace Sep 25 '22 at 01:53
1 Answer
Your autocorrelation has a sinusoidal shape with peaks and troughs at lags 7 and 14. It looks like you have daily data with weekly seasonality, which looks strange for water production (Mother Nature does not really work at weekly granularities), but would make perfect sense for water demand (different patterns of demand during weekdays vs. weekends). If the latter, it may make sense to look at seasonal ARIMA.
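To check for weekly seasonality numerically rather than by eye, one can compute the sample autocorrelation and look at the lags that are multiples of 7. The following is a minimal sketch on synthetic data; the series `y`, its weekly sine component, and the helper `sample_acf` are illustrative assumptions, not the asker's data or code.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

# Synthetic stand-in for a daily series with a weekly cycle (assumption).
rng = np.random.default_rng(0)
n = 116
t = np.arange(n)
y = 100 + 5 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 2, n)

acf = sample_acf(y, max_lag=21)
for lag in (7, 14, 21):
    print(f"lag {lag}: acf = {acf[lag]:.2f}")
```

Clearly positive values at lags 7, 14 and 21 with dips in between would point to a weekly pattern. Values near zero would suggest the pattern is too weak to be worth modelling.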
We don't know whether you specified the frequency in your data and `auto_arima` decided not to use a seasonal model (which can be a perfectly valid decision if the seasonal pattern is too weak to detect in your series), or whether you didn't, in which case `auto_arima` can't on its own decide which seasonal frequency to use. See here. You can force `auto_arima` to use a seasonal model, but this is not guaranteed to improve your forecasts.

As Galen mentions, your data does not exhibit any obvious patterns. In such a situation, a flat forecast may actually be the best forecast, possibly even an overall historical mean forecast.
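A quick way to judge whether any ARIMA variant is worth the effort is to score simple benchmarks on the same holdout. The sketch below compares a historical mean forecast, a naive (last value) forecast, and a seasonal naive forecast with period 7. The data are synthetic, and the function names and the 95/21 split are assumptions chosen for illustration.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

def baseline_forecasts(train, horizon, season=7):
    """Three simple benchmark forecasts for the next `horizon` steps."""
    train = np.asarray(train, dtype=float)
    reps = int(np.ceil(horizon / season))
    return {
        "mean": np.full(horizon, train.mean()),       # overall historical mean
        "naive": np.full(horizon, train[-1]),         # repeat the last value
        "seasonal_naive": np.tile(train[-season:], reps)[:horizon],  # repeat last week
    }

# Synthetic placeholder series with no trend or seasonality (assumption).
rng = np.random.default_rng(1)
y = 50 + rng.normal(0, 3, 116)
train, test = y[:95], y[95:]

forecasts = baseline_forecasts(train, horizon=len(test))
scores = {name: mae(test, fc) for name, fc in forecasts.items()}
for name, score in scores.items():
    print(f"{name}: MAE = {score:.2f}")
```

If the ARIMA model's holdout error is not clearly below the best of these, tuning its orders further is unlikely to help.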
The first thing that jumps out at you in your time series is the one large positive peak and the three large negative ones. If you want to forecast, you should try to understand what happened here and use appropriate predictors, running a regression with (potentially) ARIMA errors. Understanding key drivers is always more important to forecasting than fiddling around with ARIMA orders. Related: How to know that your machine learning problem is hopeless?
-
I actually used a period of 7 in auto_arima, and it gave me (2,0,1)x(0,0,0)[7] as the best model. As for your second suggestion, I edited my post to include the trends from the seasonal_decompose plots. – Grace Sep 25 '22 at 01:47
-
It's true I only have 116 observations. What can I do with such a small data set? – Grace Sep 25 '22 at 03:39
-
When I tried to reduce the test data, it became less linear. But I can't get more data; I can only manage with what I have. – Grace Sep 25 '22 at 03:42
-
116 observations are usually fine for an ARIMA model. An ARIMA(2,0,1)(0,0,0)[7] indicates that `auto_arima()` did not choose a seasonal model. You can force it to use a seasonal one, but per above, this is not guaranteed to improve forecasts. – Stephan Kolassa Sep 25 '22 at 06:25
-
auto_arima selected (2,0,1) as the optimal order for the ARIMA model, but it gave a flat-line forecast, and when I tried to reduce the dataset, it became less linear, so I thought the issue with my model was the dataset's size. – Grace Sep 25 '22 at 06:50
-
Your forecast is not flat, look closely. It's just that it converges quite quickly to the mean. And this may well be the best one can do with your data. Look through the links in my answer. More data will likely not give you a very different forecast. – Stephan Kolassa Sep 25 '22 at 07:56
-
Because from the link "How to know that your machine learning problem is hopeless?", I understood that outliers can really hinder forecasting. – Grace Sep 25 '22 at 09:12
-
The model finally fitted: I removed the peaks (outliers) and it fitted so well. Thanks for your help. – Grace Sep 25 '22 at 22:29



