
What are some algorithms for determining the seasonal frequency (or equivalently, the length of the seasonal period) from the values of the time series alone? That is, the values of the time series are given, but the time stamps are missing. However, we know the data has been sampled at equal time intervals.

(Further nuance could be added to the question, e.g. how this should be approached if our goal is to detect the "true" seasonal frequency vs. if our goal is to forecast the time series optimally under a given evaluation loss function. A motivating example for the latter case can be found here. A reference to an R implementation would be a bonus.)

Richard Hardy
  • 67,272
  • I reopened this thread and am kind of wondering whether this one is not simply a duplicate of that one. I would VTC, but then my gold-plated dupe hammer would be single-vote-enough. What do you think? – Stephan Kolassa Mar 02 '23 at 12:12
  • @StephanKolassa, I read the other thread before posting my question, so I do not think it is a duplicate; otherwise I would not have posted it. Johanna seems to be interested in testing for presence of seasonality, while I am interested in determining the seasonal period. These are two distinct problems. The paper you reference in your answer is relevant, but it does not go into any depth of the matter; I hope I can solicit better references. – Richard Hardy Mar 02 '23 at 14:39
  • I see your point, and I agree. I will post that answer here, because that paper does address the question, and I will also hope for a better and more comprehensive answer. – Stephan Kolassa Mar 02 '23 at 15:06
  • "but the time stamps are missing": in what way is this relevant? Are we supposed to recover the time steps by figuring out daily, weekly, and yearly patterns, which have ratios 1:7:365? Or is it relevant in the sense that we are not simply using the time stamps to figure out the patterns, like using default daily/weekly/yearly patterns without observing the values and without figuring out whether that makes sense? – Sextus Empiricus Mar 16 '23 at 09:01
  • @SextusEmpiricus, a time series is a sequence of pairs (time, value). Let us say we do not see the time but we see the value. Additionally, we know the subsequent times have equal time intervals between them. I did not mean anything more complicated than that. – Richard Hardy Mar 16 '23 at 09:04
  • In this question the computation of a power spectrum worked well to extract components that indicate a seasonal period of 24 hrs. But other situations might be less clear so it may not always work. – Sextus Empiricus Mar 16 '23 at 09:25
  • @SextusEmpiricus, that is exactly the thread that motivated my question. We can eyeball things, and we may get good at it with experience. However, do we have any sensible algorithms to yield answers in a principled way? (I bet we do, perhaps in the signal processing literature.) – Richard Hardy Mar 16 '23 at 09:31

1 Answer


(cross-posted from here)

Section 3.2 in the following paper offers a possibility for determining the length of the seasonal cycle:

 Wang, X, Smith, KA, Hyndman, RJ (2006) "Characteristic-based
 clustering for time series data", _Data Mining and Knowledge
 Discovery_, *13*(3), 335-364.

However, this is only one aspect of a paper that is more comprehensive in its aims, so the specific issue of determining a seasonal length is not treated in great depth.

Also note that this was never included in the forecast::auto.arima() function (whose author is Hyndman), although this function does use other methods from that paper (for instance, auto.arima() decides whether to apply seasonal differencing for a known seasonal cycle length based on an estimate of seasonal strength, as also given in Wang et al.).

I do not know why this was never included. It may have been because it was unstable, varying and hard to automate. After all, you need to identify peaks and troughs in the ACF, and what constitutes a "peak" or a "trough" in a noisy ACF series would need to be operationalized.

Alternatively, perhaps there simply never was any demand for it, since users presumably know their seasonal cycle length.

So if you want to use the cycle length determination per Wang et al., you would need to code it yourself.
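To give an idea of what such code could look like, here is a minimal sketch of the ACF peak-and-trough idea in plain Python. It is not a reproduction of the procedure in Wang et al.: the function names, the default maximum lag of a third of the series length, and the `min_drop` threshold of 0.1 are all my own assumptions, and the sketch does no detrending, which real data would likely need first.

```python
import math


def acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; the ACF is undefined")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


def guess_period(x, max_lag=None, min_drop=0.1):
    """Return the lag of the first 'convincing' ACF peak, or None.

    A peak counts (illustrative rule, an assumption of this sketch) if
      * a local trough occurred at a smaller lag,
      * the ACF at the peak is positive, and
      * the peak exceeds the most recent trough by at least min_drop.
    """
    if max_lag is None:
        max_lag = len(x) // 3  # assumed default, not from the paper
    r = acf(x, max_lag)
    last_trough = None
    for k in range(1, max_lag):
        if r[k] < r[k - 1] and r[k] <= r[k + 1]:
            # local minimum of the ACF
            last_trough = r[k]
        elif r[k] > r[k - 1] and r[k] >= r[k + 1]:
            # local maximum of the ACF
            if (last_trough is not None and r[k] > 0
                    and r[k] - last_trough >= min_drop):
                return k
    return None


# Usage: a noise-free sine wave with period 12, 120 observations.
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]
print(guess_period(series))
```

On the clean sine wave this returns 12. On noisy data, the result will depend heavily on `min_drop` and on how local peaks are defined, which is exactly the operationalization problem mentioned above.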

Stephan Kolassa
  • 123,354