I have log-return data for a commodity and am unable to identify an appropriate ARMA model. The auto.arima() function selects an ARIMA(4,0,4) model with zero mean. However, when I fit this model, I get a warning about a convergence problem (optim code = 1).

I then wrote an iterative search in R that picks the model with the lowest AIC, trying AR and MA orders up to a maximum of 12 each. The search skips any candidate that has convergence issues or NaNs in the standard errors. It selected (12,0,7) both with and without a mean; the version without a mean has the lower AIC.

For the models my algorithm selected, the residuals still show serial correlation, which I attribute to outliers in the data. I would like help with the following:
- Should I winsorize the log-return data to reduce the impact of the outliers, or
- should I use tsclean() to replace the outliers?
My objective is to obtain a model whose residuals show no autocorrelation.
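
To make the comparison concrete, here is a minimal sketch of how the two options could be checked against each other. It is not my actual code or data: the series is simulated, `winsorize()` and `ljung_box_p()` are hypothetical helpers written for this illustration, the 1%/99% cut-offs are arbitrary, and the AR(1) order is chosen only because the simulated series is AR(1). The tsclean() step runs only if the forecast package is installed.

```r
# Simulated stand-in for the log-return series (assumption: the real data is not shown).
set.seed(42)
n <- 500
returns <- as.numeric(arima.sim(model = list(ar = 0.3), n = n, sd = 0.01))

# Inject a few artificial outliers so the treatments have something to act on.
outlier_idx <- c(50, 200, 350)
returns[outlier_idx] <- returns[outlier_idx] + c(0.15, -0.20, 0.18)

# Hypothetical helper: clamp values that fall outside the given quantiles.
winsorize <- function(x, probs = c(0.01, 0.99)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

# Hypothetical helper: fit one fixed ARMA order and return the Ljung-Box
# p-value for the residuals (small p-values suggest remaining autocorrelation).
ljung_box_p <- function(x, order = c(1, 0, 0), lag = 20) {
  fit <- arima(x, order = order, include.mean = FALSE)
  # fitdf subtracts the number of estimated AR and MA coefficients
  Box.test(residuals(fit), lag = lag, type = "Ljung-Box",
           fitdf = order[1] + order[3])$p.value
}

# Candidate versions of the series: untreated, winsorized, and tsclean()-ed.
candidates <- list(raw = returns, winsorized = winsorize(returns))
if (requireNamespace("forecast", quietly = TRUE)) {
  candidates$tsclean <- as.numeric(forecast::tsclean(ts(returns)))
}

# Same model order for every candidate, so only the outlier treatment differs.
p_values <- sapply(candidates, ljung_box_p)
print(round(p_values, 3))
```

Holding the model order fixed across the candidates isolates the effect of the outlier treatment. Whether either treatment actually removes the residual autocorrelation in the real series is an empirical question that this sketch does not answer.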