
Assume you have a list column, so that your time series is nested; see "Convert pandas df with data in a “list column” into a time series in long format. Use three columns: [list of data] + [timestamp] + [duration]" for details. This question is not about how to unnest a list column, though. Assume you already have a long-format structure in which all list elements are unnested into a normal column, here for example from 2 lists with 4 elements each, giving 8 rows in total.

Part1: Assume that every list of the original list column has the same number of list elements (here, the first list with 4 items around 08:53, and the second with 4 items around 08:55).

                     value
datetimeindex
2016-05-04 08:53:20  1
2016-05-04 08:53:21  2
2016-05-04 08:53:22  1
2016-05-04 08:53:23  9
2016-05-04 08:55:00  2
2016-05-04 08:55:01  2
2016-05-04 08:55:02  3
2016-05-04 08:55:03  0

Now to the actual question. From the documentation of statsmodels.tsa.seasonal.seasonal_decompose we read:

Definition of period

"period, int, optional"

Period of the series. Must be used if x is not a pandas object or if the index of x does not have a frequency. Overrides default periodicity of x if x is a pandas object with a timeseries index.

What is meant here with "Period of the series"? Is it:

  1. the number of lists, here 2.
  2. the standard size of a list. This would be 4 in the example.
  3. something else than 1./2.

Please also explain what, if anything, would be different in a "Part2 setting":

Part2: Assume that every list of the original list column has a varying number of list elements (here, the first list with 4 items around 08:53, and the second with just 3 items around 08:55).

                     value
datetimeindex
2016-05-04 08:53:20  1
2016-05-04 08:53:21  2
2016-05-04 08:53:22  1
2016-05-04 08:53:23  9
2016-05-04 08:55:00  2
2016-05-04 08:55:01  2
2016-05-04 08:55:02  3

The examples should make the question clear; no programming (especially not with the examples) is needed for an accepted answer.

Context:

This question arose from decompose() for time series: ValueError: You must specify a period or x must be a pandas object with a DatetimeIndex with a freq not set to None.

questionto42
  • Bounties become more noticeable towards the end of the period, as they move towards the top of the list. That said, it is notable that this Q has only 21 views, a number of which are me checking on it. You may want to ask a question on [meta.stats.SE] to see if there's a way to raise the profile of this Q &/or make it more answerable. (Again, this doesn't seem to be a programming Q, & there are so many Qs on [SO] that they usually get less attention than here.) – gung - Reinstate Monica Sep 09 '20 at 12:49
  • (1) The time-series data's having previously been stored in a nested list isn't obviously relevant. Some explanation is required of what that has to do with the choice of period for decomposition. (2) Period is the no. observations in a seasonal cycle - e.g. if you've daily observations & weekly seasonality, the period is 7. But your observations are at irregular intervals, & how to perform seasonal decomposition on irregular time series is perhaps the crux of your question (see e.g. https://stats.stackexchange.com/q/244042/17230 & https://stackoverflow.com/q/12623027/1864816). – Scortchi - Reinstate Monica Sep 16 '20 at 08:40
  • @Scortchi would you mind making an answer from your comment? It says that the period parameter does not play a role if the data does not have any cycles. Since there is no answer up to now, you are likely right with all. And it should not be just a comment then. Not sure how to reach you since this Q was migrated. – questionto42 Jan 02 '23 at 00:12
  • @Scortchi-ReinstateMonica Please answer. – questionto42 Feb 28 '23 at 16:30

1 Answer

Without cycles

The "period" parameter has no meaningful value if the data does not have any cycles. The short example in the question cannot contain seasonality. Thus, the answer to the question is that there is no answer.

With cycles, and with observations at regular intervals

The "period" parameter is the number of observations in a seasonal cycle. For example, if you have daily observations and weekly seasonality, the period is 7.

With cycles, and with observations at irregular intervals

This is not what the example in the question asks, but if the data had seasonality and, at the same time, irregular intervals, these questions deal with that case:

  • Trend in irregular time series data
  • How to analyse irregular time-series in R

Thanks to user Scortchi for the remarks.

questionto42