
I am trying to fit and forecast water production in a well for my end-of-training thesis, but I got poor predictions from ARIMA and SARIMA models. I tried auto ARIMA, but it did not do any better. I am now trying to modify the SARIMA parameters by hand to obtain better results, but it is very tedious.


[Figures: curve of water production; PACF and ACF; prediction from the auto ARIMA model; forecasts after manually modifying the SARIMA parameters; the trends]

Grace
  • 13

1 Answer

  1. Your autocorrelation has a sinusoidal shape with peaks and troughs at lags 7 and 14. It looks like you have daily data with weekly seasonality, which looks strange for water production (Mother Nature does not really work at weekly granularities), but would make perfect sense for water demand (different patterns of demand during weekdays vs. weekends). If the latter, it may make sense to look at seasonal ARIMA.

    We don't know whether you specified the frequency in your data and auto_arima decided not to use a seasonal model (which can be a perfectly valid decision if the seasonal pattern is too weak to detect in your series), or whether you didn't, in which case auto_arima can't on its own decide which seasonal frequency to use. See here. You can force auto_arima to use a seasonal model, but this is not guaranteed to improve your forecasts.

  2. As Galen mentions, your data does not exhibit any obvious patterns. In such a situation, a flat forecast may actually be the best forecast, possibly even an overall historical mean forecast.

  3. The first thing that jumps out at you in your time series is the one large positive peak and the three large negative ones. If you want to forecast, you should try to understand what happened here and use appropriate predictors, running a regression with (potentially) ARIMA errors. Understanding key drivers is always more important to forecasting than fiddling around with ARIMA orders. Related: How to know that your machine learning problem is hopeless?
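The suggestion in point 2 can be checked with a simple holdout comparison. Below is a minimal sketch, not the asker's actual workflow: the series is synthetic noise around a constant level (an assumption standing in for the real data), the 14-point holdout is an arbitrary choice, and the helper names `mean_forecast`, `last_value_forecast` and `mae` are hypothetical. Any ARIMA or SARIMA forecast can be scored the same way, and it is only worth keeping if it beats these baselines on the holdout.

```python
import random
import statistics


def mean_forecast(train, horizon):
    """Forecast every future step with the historical mean of the training data."""
    level = statistics.fmean(train)
    return [level] * horizon


def last_value_forecast(train, horizon):
    """Forecast every future step with the last observed value (naive forecast)."""
    return [train[-1]] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return statistics.fmean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic stand-in for the real series (assumption): noise around a constant level.
rng = random.Random(0)
series = [100 + rng.gauss(0, 5) for _ in range(116)]

# Hold out the last 14 points as a test set (arbitrary choice for illustration).
train, test = series[:-14], series[-14:]

for name, forecaster in [("historical mean", mean_forecast),
                         ("last value", last_value_forecast)]:
    error = mae(test, forecaster(train, len(test)))
    print(f"{name}: MAE = {error:.2f}")
```

To use it on real data, replace `series` with the observed values and add the candidate model's holdout forecast to the comparison.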

Stephan Kolassa
  • 123,354
  • I actually used a period of 7 in auto arima, and it gave (2,0,1)x(0,0,0)[7] as the best model. As for your second suggestion, I edited my post to include the trends from the seasonal_decompose plots. – Grace Sep 25 '22 at 01:47
  • It's true that I only have 116 observations. What can I do with such a small data set? – Grace Sep 25 '22 at 03:39
  • When I tried to reduce the test data, the forecast became less linear. But I can't get more data, I can only work with what I have. – Grace Sep 25 '22 at 03:42
  • 116 observations are usually fine for an ARIMA model. An ARIMA(2,0,1)(0,0,0)[7] indicates that auto_arima() did not choose a seasonal model. You can force it to use a seasonal one, but per above, this is not guaranteed to improve forecasts. – Stephan Kolassa Sep 25 '22 at 06:25
  • auto_arima picked (2,0,1) as the optimal order for the ARIMA model, but it gave a flat-line forecast. When I tried to reduce the dataset, the forecast became less linear, so I thought the issue with my model was the dataset's size. – Grace Sep 25 '22 at 06:50
  • Your forecast is not flat, look closely. It's just that it converges quite quickly to the mean. And this may well be the best one can do with your data. Look through the links in my answer. More data will likely not give you a very different forecast. – Stephan Kolassa Sep 25 '22 at 07:56
  • From the link "How to know that your machine learning problem is hopeless?", I understood that outliers can really hinder forecasting. – Grace Sep 25 '22 at 09:12
  • The model finally fitted. I removed the peaks (outliers) and it fitted very well. Thanks for your help. – Grace Sep 25 '22 at 22:29
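
The remark in the comments that the forecast "converges quite quickly to the mean" can be illustrated with a simplified case. This is only a sketch: it uses a stationary AR(1) process rather than the ARIMA(2,0,1) model discussed above, and the last value, mean and coefficient are hypothetical numbers. For an AR(1) with coefficient phi, the h-step-ahead point forecast is mean + phi^h * (last value - mean), so the distance to the mean shrinks geometrically with the horizon.

```python
def ar1_forecasts(last_value, mean, phi, horizon):
    """h-step-ahead point forecasts of a stationary AR(1) process:
    mean + phi**h * (last_value - mean), for h = 1..horizon."""
    return [mean + phi ** h * (last_value - mean) for h in range(1, horizon + 1)]


# Hypothetical numbers: last observation 20 units above a mean of 100, phi = 0.5.
path = ar1_forecasts(last_value=120.0, mean=100.0, phi=0.5, horizon=6)
print(path)
```

With phi = 0.5 the gap to the mean halves at every step, which is why such a forecast looks almost flat after a few periods even though it is not exactly constant.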