Suppose we have $m$ time series in total, each with $n$ observations. We also have another time series with $n-k$ observations ($k>0$). Given this shorter series, I want to find which of the $m$ series have the closest data generating process (DGP). The motivation is to "recover" history for the shorter series from the available series with longer histories. Off the top of my head, I can think of looking at correlation coefficients between the series, or estimating some model (e.g., AR/ARMA) and comparing the coefficients. Is there a documented approach for this kind of exercise?
maaaaybe https://stats.stackexchange.com/questions/172439/comparing-clustering-time-series-with-unequal-lengths? – Alberto Mar 28 '24 at 14:20
I think this is closer, but it still does not fully answer my question -- https://stats.stackexchange.com/questions/19103/how-to-statistically-compare-two-time-series . – Sane Mar 28 '24 at 14:27
1 Answer
You will have to define a class of data generating processes; otherwise the problem is not well defined. If you are willing to restrict attention to invertible ARIMA processes, for example, then the Piccolo distance might do what you want.
See https://www.jstatsoft.org/article/view/v062i01 for a general discussion of distances between time series.
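As a rough illustration of the idea (a sketch under stated assumptions, not the implementation from the article linked above): the Piccolo distance is the Euclidean distance between the AR(∞) coefficients of the fitted ARMA models. The code below approximates this by fitting a truncated AR(p) to each series with Yule-Walker and comparing the coefficient vectors. The helper names, the order `p = 5`, and the simulated AR(1) data are all assumptions made for the example.

```python
import math
import random


def autocovariances(x, max_lag):
    """Biased sample autocovariances gamma(0), ..., gamma(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


def fit_ar(x, order):
    """Yule-Walker AR(order) coefficients via the Levinson-Durbin recursion."""
    gamma = autocovariances(x, order)
    phi = []          # coefficients of the AR(k-1) fit at the start of step k
    err = gamma[0]    # innovation variance of the current fit
    for k in range(1, order + 1):
        acc = gamma[k] - sum(phi[j] * gamma[k - 1 - j] for j in range(k - 1))
        reflection = acc / err
        updated = [phi[j] - reflection * phi[k - 2 - j] for j in range(k - 1)]
        phi = updated + [reflection]
        err *= 1.0 - reflection ** 2
    return phi


def piccolo_distance(x, y, order=5):
    """Euclidean distance between truncated AR coefficient vectors.

    Assumption: a fixed-order AR(p) stands in for the AR(infinity)
    representation of a properly selected ARMA model.
    """
    a = fit_ar(x, order)
    b = fit_ar(y, order)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def simulate_ar1(phi, n, rng, scale=1.0):
    """Simulate a (hypothetical) Gaussian AR(1) series, optionally rescaled."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(scale * x)
    return out


rng = random.Random(0)

# Short target series (n - k observations), on a different scale.
target = simulate_ar1(0.7, 1500, rng, scale=10.0)

# Longer candidate series (n observations each), purely simulated.
candidates = {
    "same_dgp": simulate_ar1(0.7, 2000, rng),
    "weak_ar": simulate_ar1(0.2, 2000, rng),
    "negative_ar": simulate_ar1(-0.3, 2000, rng),
}

# Rank candidates from closest to farthest DGP.
ranking = sorted(
    ((name, piccolo_distance(target, series)) for name, series in candidates.items()),
    key=lambda item: item[1],
)
for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

Because the distance is computed on model coefficients rather than on the observations, the series may have different lengths, and rescaling a series leaves its AR coefficients unchanged. In practice the order would be chosen per series (for example by an information criterion) rather than fixed.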
Rob Hyndman
Thank you. I am quite new to time series clustering. One thing that I cannot comprehend -- why is the so-called "closest" DGP mostly understood in the sense of distance? Why should the "closest" DGP imply some physical proximity (e.g., in DTW the closest series are those with similar dynamics, as far as I understand)? Series $X$ and $Y$ could have the same DGP and be described, e.g., by an AR(1) model with the same parameters, while $X$ is a scaled version of $Y$. I believe time series clustering methods won't capture this. Isn't the correct approach to find the DGP by estimating models and comparing them? – Sane Mar 29 '24 at 09:44
I think you misunderstand how distance is being used here. "Closest" implies some kind of distance. I think you want distance between DGPs rather than distance between time series realisations. That's what Piccolo's distance is designed to do. – Rob Hyndman Mar 30 '24 at 00:47