Let's suppose we have a time series with monthly data (frequency=12):

my_tS <- ts(my_monthly_data, start=c(2002,1), frequency=12)

When I plot it, I see a seasonal pattern that repeats every 6 months. To remove it, I take a lag-6 difference:

my_tS_stationary <- diff(my_tS, 6)

I checked the result with a KPSS test and it is stationary. Now I want to model the series with ARIMA(p,d,q)(P,D,Q)[12]. Which value should I use for D? A value of 1 corresponds to 12-month seasonality (the frequency of my time series), but mine repeats every 6 months...
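For reference, a minimal sketch of the two differencing lags involved here. It uses the built-in AirPassengers series as a stand-in (an assumption; the original my_monthly_data is not shown), and the KPSS step assumes the tseries package is installed, so it is skipped otherwise. With frequency=12, setting D=1 in a seasonal ARIMA differences at lag 12, which is not the same operation as diff(x, 6).

```r
# Stand-in monthly series (assumption: replaces my_monthly_data, which is not shown)
x <- AirPassengers

# Lag-6 difference: x[t] - x[t-6]
d6 <- diff(x, lag = 6)

# Lag-12 difference: x[t] - x[t-12]; this is what D = 1 means when frequency = 12
d12 <- diff(x, lag = 12)

# Each differencing step drops as many observations as the lag
lost6 <- length(x) - length(d6)
lost12 <- length(x) - length(d12)

# Optional KPSS check (assumes the tseries package is available)
if (requireNamespace("tseries", quietly = TRUE)) {
  print(tseries::kpss.test(d6, null = "Level"))
}
```

Note that diff() keeps the frequency attribute of the series, so d6 is still a frequency=12 object; only the differencing lag changed.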

Edit: As requested, I have added the graphs I made to check seasonality.

Edit 2: Added a season plot. The data are flights over the year. With the season plot, I now think the series has a 12-month seasonality with two peaks per year. Thanks, Stephan and Richard, for guiding me to this!

Kaikus
  • This does look a lot like 6-month seasonality. Interesting. Can you add seasonplots for both 6-month and 12-month seasonality? And out of interest, what does your time series describe? – Stephan Kolassa Mar 01 '23 at 16:07
  • It looks like 12-month seasonality to me. Both ACF and seasonal decomposition show that. However, it is quite possible that approximating it by 6-month seasonality would not introduce a large approximation error. It would be interesting to run an automated frequency detection algorithm on this series, especially if there exists a version that optimizes an information criterion. (I am quite sure there was a thread or two about determining the length of the seasonal period or the seasonal frequency, but I cannot find them anymore... Would appreciate a reference if anyone finds it/them. Thanks!) – Richard Hardy Mar 01 '23 at 16:50
  • @StephanKolassa, thanks for your guidance. The seasonplot helped me find the right seasonality! And now I know how to proceed if I ever have a real 6-month seasonality ^_^ – Kaikus Mar 02 '23 at 09:18
  • @RichardHardy, as you suspected, the seasonal plot indicates a 12-month seasonality with two peaks through the year. Thanks to both you and Stephan ^_^ – Kaikus Mar 02 '23 at 09:22
  • @StephanKolassa, do you perhaps remember if there indeed was a thread or two about determining the length of the seasonal period or the seasonal frequency? I cannot find them anymore. – Richard Hardy Mar 02 '23 at 09:27
  • @RichardHardy: this thread and the paper I note in my answer may be useful. – Stephan Kolassa Mar 02 '23 at 11:12
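To illustrate the point made in the comments, here is a small sketch with a purely hypothetical, noise-free series (my own construction, not the flight data): a 12-month pattern with two peaks of different height. On such a series the sample autocorrelations at both lag 6 and lag 12 come out large and positive, so a spike at lag 6 alone does not establish a period of 6. Averaging by position in the cycle, which is roughly what a season plot shows, separates the two readings.

```r
# Hypothetical 12-month pattern with two unequal peaks (months 3 and 9)
pattern <- c(0, 1, 3, 1, 0, -1, 0, 1, 5, 1, 0, -1)
x12 <- ts(rep(pattern, times = 10), start = c(2002, 1), frequency = 12)

# Sample autocorrelations; element 1 is lag 0, so lag k sits at index k + 1
r <- acf(x12, lag.max = 12, plot = FALSE)$acf[, 1, 1]
r6 <- r[7]
r12 <- r[13]

# Mean by position in the cycle when the period is taken to be 12
means12 <- tapply(x12, cycle(x12), mean)

# Same data, but treated as if the period were 6
x6 <- ts(as.numeric(x12), frequency = 6)
means6 <- tapply(x6, cycle(x6), mean)

print(c(lag6 = r6, lag12 = r12))
print(means12)
print(means6)
```

Under period 12 the two peaks stay distinct, while under period 6 they are averaged into a single peak, which is the approximation error mentioned above.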

1 Answer

The D parameter governs the number of seasonal differences, where "seasonal" is understood to be as per the underlying time series frequency. What you are implicitly doing is considering your time series as having a frequency not of 12 (monthly data with yearly seasonality), but of 6 (monthly data with half-yearly seasonality).

So the best solution would be to recode your time series with a new frequency=6 attribute, then feed it into auto.arima() (with or without a hard setting for the D parameter):

> library(forecast)
> auto.arima(ts(AirPassengers,frequency=6))
Series: ts(AirPassengers, frequency = 6) 
ARIMA(4,1,2) with drift

Coefficients:
         ar1     ar2      ar3      ar4      ma1      ma2   drift
      0.2243  0.3689  -0.2567  -0.2391  -0.0971  -0.8519  2.6809
s.e.  0.1047  0.1147   0.0985   0.0919   0.0866   0.0877  0.1711

sigma^2 = 706.3:  log likelihood = -670.07
AIC=1356.15   AICc=1357.22   BIC=1379.85
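If you want to fix the seasonal difference yourself rather than let auto.arima() choose it, the following is a minimal sketch using base R's stats::arima() instead of the forecast package (my substitution, to keep the example dependency-free). The orders are picked only for illustration, not as a recommended model; auto.arima() also accepts a D argument for the same purpose.

```r
# Recode the monthly series so that one "season" is 6 observations long
y6 <- ts(as.numeric(AirPassengers), frequency = 6)

# Illustrative orders only: d = 1, q = 1, and one seasonal difference (D = 1)
# The seasonal period is stated explicitly; it would default to frequency(y6)
fit <- arima(y6,
             order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 0), period = 6))

# fit$arma holds: p, q, P, Q, period, d, D
print(fit$arma)
```

With the series recoded this way, D = 1 differences at lag 6, which matches diff(x, 6) on the original monthly series.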

(Removed erroneous images)

Kaikus
Stephan Kolassa
  • Thanks! I supposed that the frequency in a ts was dictated by the number of elements in a unit, i.e. 12 for months, 4 for quarters, 52 for weeks, 365 for days..., and was not related to the seasonality detected in the data. I see I was wrong :) – Kaikus Mar 01 '23 at 12:24
  • Yes, that is exactly what it is, but you need to specify it through the frequency parameter when creating a ts object - these objects have no notion of "months" or "quarters" and only pretty-print based on the frequency attribute. A frequency of 4 could be quarterly seasonality in a year - or the seasonality involved in having four shifts of 6 hours each in a day. – Stephan Kolassa Mar 01 '23 at 12:29
  • @Kaikus, another question is whether you have identified the "actual" seasonal frequency correctly. How did you do that, exactly? – Richard Hardy Mar 01 '23 at 12:33
  • @RichardHardy, with a simple graphic and an ACF/PACF – Kaikus Mar 01 '23 at 14:40
  • @StephanKolassa, so, how do I plot them correctly with 'autoplot'? I can see that, with 'frequency' set to 6, autoplot thinks the data are 'every two months', and the year numbers on the x axis are not correct – Kaikus Mar 01 '23 at 14:44
  • @Kaikus, what decision rule did you use based on the graph? The fact that the 6th lag sticks out is not necessarily a sign that the seasonal period is 6; it could just as well be 12. – Richard Hardy Mar 01 '23 at 14:46
  • Unfortunately, I can't help you with autoplot, I'm not familiar with that function. What I usually do in such cases is to use plot.default and turn off the horizontal axis with the parameter xaxt="n", then in a second step call axis(side=1) with appropriate at and labels parameters. – Stephan Kolassa Mar 01 '23 at 14:47
  • @RichardHardy, I have added the graphics I used to decide there was a seasonality – Kaikus Mar 01 '23 at 15:44
  • @StephanKolassa, thanks!!! I'll try :) – Kaikus Mar 01 '23 at 15:47
  • @StephanKolassa, sorry, by mistake I added my charts to your answer. I have removed them (^_^)' – Kaikus Mar 01 '23 at 15:59
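The axis workaround suggested in the comments could look roughly like the sketch below. It again uses AirPassengers as a stand-in (an assumption), recodes it with frequency=6, and then maps the resulting time index back to calendar years by hand, since the ts object itself only knows about cycles of 6 observations. The variable names are illustrative.

```r
x <- AirPassengers                          # monthly, starts January 1949
y6 <- ts(as.numeric(x), frequency = 6)      # time index now starts at 1 and counts 6-obs cycles

# Two cycles of 6 observations make one calendar year
cal_time <- 1949 + (as.numeric(time(y6)) - 1) / 2

# Plot without the default horizontal axis, then draw calendar-year ticks
plot(cal_time, as.numeric(y6), type = "l", xaxt = "n",
     xlab = "Year", ylab = "Passengers")
yrs <- seq(1949, 1961, by = 2)
axis(side = 1, at = yrs, labels = yrs)
```

The mapping from cycle index to calendar year depends on the start date and on how many cycles fit in a year, so both constants (1949 and the divisor 2) would need adjusting for other data.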