I trained an ARMA model in Python using the ARIMA class from statsmodels (from statsmodels.tsa.arima.model import ARIMA).
I split the data into training and test sets and fitted the model with (p, d, q) = (1, 0, 1).
Below is the result of plotting my rolling predictions (red) against the test data (orange). At first, I thought the model was not making good predictions.
However, when I plotted the predictions on their own, I got something that looked pretty good (below).
This is not the best, but I think it is at least catching some trends. So I tried rescaling the predictions with a constant multiplier and a bias, both chosen by eye. Then this happened (last picture).
What is going on? I expected something like the last picture for the forecasted predictions, but somehow all the predicted values are significantly attenuated.
Why would this happen? Why would I get a seemingly good prediction when I scale up by an arbitrary factor and add a bias? Am I just getting lucky? How can I do this more systematically?
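One way to replace the eyeballed multiplier and bias with something reproducible is an ordinary least-squares fit of the actual values on the predictions. The sketch below is only an illustration under assumptions: the data are synthetic, the helper name fit_scale_and_bias is made up, and the attenuated predictions are simulated rather than taken from a model. The coefficients are estimated on one half of the data and evaluated on the other half, because fitting them on the same points used for evaluation would make the result look better than it is.

```python
import numpy as np

def fit_scale_and_bias(preds, actual):
    """Return (a, b) minimizing sum((a * preds + b - actual) ** 2)."""
    a, b = np.polyfit(preds, actual, deg=1)
    return a, b

# Hypothetical data: predictions that track the series but are attenuated.
rng = np.random.default_rng(1)
actual = rng.normal(size=200)
preds = 0.4 * actual + 0.1 + rng.normal(scale=0.05, size=200)

# Estimate scale and bias on the first half only.
a, b = fit_scale_and_bias(preds[:100], actual[:100])

# Evaluate on the second half, which was not used for the fit.
mse_raw = np.mean((preds[100:] - actual[100:]) ** 2)
mse_scaled = np.mean((a * preds[100:] + b - actual[100:]) ** 2)

print("scale:", a, "bias:", b)
print("held-out MSE before:", mse_raw, "after:", mse_scaled)
```

Whether such a rescaling is justified for a real forecast is a separate question; this only makes the eyeballing step explicit and checkable.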


pmdarima do so for you? – Stephan Kolassa Nov 01 '23 at 06:23