I want to build forecasts for a large number of time series. Because there are too many to model individually, I am thinking of reducing my data by clustering the series into similar groups. However, I am using SPSS Modeler, which cannot cluster time series (only static data).

Do you think it makes sense to apply clustering to static data and fit the forecasting model to the cluster centroids?

Which software do you suggest for clustering the time series directly?

Thanks!

Maria

1 Answer

"Dynamic factor analysis" might be your answer and is certainly worth reading up on. It aims to identify a small number of underlying (unobserved) factors behind the co-movement of a large number of observed time series. To implement it as part of a broader modelling exercise you will probably be looking at R, Matlab, Stata or a specialist time series package like RATS.
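To make the idea of reducing many series to a few factors concrete, here is a rough Python sketch. It is not a full dynamic factor model: it swaps in principal components (via an SVD of the standardized panel) as a simple stand-in for the factor estimates, then fits a separate AR(1) to each factor and maps the factor forecasts back to the original series. The simulated panel, the function names `extract_factors` and `forecast_ar1`, and the choice of one factor and AR(1) dynamics are all assumptions made for illustration.

```python
import numpy as np

def extract_factors(series, n_factors):
    """Principal-component stand-in for factor extraction.

    series: array of shape (T, N), one column per time series.
    Returns factors (T, k), loadings (N, k), column means, column
    standard deviations, and the share of variance explained.
    """
    mean = series.mean(axis=0)
    std = series.std(axis=0)
    z = (series - mean) / std                      # standardize each series
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    factors = u[:, :n_factors] * s[:n_factors]     # factor scores over time
    loadings = vt[:n_factors].T                    # weight of each series on each factor
    explained = (s[:n_factors] ** 2).sum() / (s ** 2).sum()
    return factors, loadings, mean, std, explained

def forecast_ar1(factors, steps):
    """Fit an AR(1) with intercept to each factor by least squares
    and iterate it forward `steps` periods."""
    k = factors.shape[1]
    out = np.empty((steps, k))
    for j in range(k):
        y = factors[1:, j]
        X = np.column_stack([np.ones(len(y)), factors[:-1, j]])
        (c, phi), *_ = np.linalg.lstsq(X, y, rcond=None)
        last = factors[-1, j]
        for h in range(steps):
            last = c + phi * last
            out[h, j] = last
    return out

# Simulated panel (an assumption for the demo): 50 series driven by one
# persistent common component plus idiosyncratic noise.
rng = np.random.default_rng(0)
T, N = 200, 50
common = np.zeros(T)
for t in range(1, T):
    common[t] = 0.8 * common[t - 1] + rng.normal()
weights = rng.uniform(0.5, 1.5, size=N)
panel = np.outer(common, weights) + 0.3 * rng.normal(size=(T, N))

factors, loadings, mean, std, explained = extract_factors(panel, n_factors=1)
factor_fc = forecast_ar1(factors, steps=5)

# Map factor forecasts back to the original scale of every series.
series_fc = (factor_fc @ loadings.T) * std + mean  # shape (5, N)
```

With real data you would choose the number of factors from the explained variance (or an information criterion) rather than fixing it at one, and a proper dynamic factor model estimates the factors and their dynamics jointly instead of in two steps.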

There are lots of potential pitfalls, however, including the possibility that in reducing the explanatory variables in your data to the underlying common "structure", you may be throwing away precisely the part of it that is related to your response variable of interest. Still, the technique seems widely used, particularly in areas like econometrics that can try to combine it with a theoretical sense of what the structure underneath all those time series may be.

Peter Ellis
  • Thank you, Peter! But is this dynamic factor analysis an alternative to the clusters? And can I then use time series for the forecast? – Maria Dec 03 '12 at 12:41
  • Yes, it would be an alternative to the clusters (although, in fact, I'm not quite sure how you'd use clusters). You would use the factor analysis to reduce your large number of time series down to a few constructed variables, each of which synthesises a number of the original variables into a single dimension. Then you can use multivariate time series methods (themselves an area of considerable complication) on the result, using the new "factors" as your explanatory variables. – Peter Ellis Dec 03 '12 at 18:54
  • Oh, I'm not sure my answer is the best and most definitive either - there are people on this site with much more experience with complex time series than me. If you still want others to provide input you have an option of un-accepting my answer so the question stays on the "unanswered" list, but just clicking on the up arrow of my answer so that it is marked as "useful" - meaning "useful" but not necessarily the final word on your problem. You could at least wait a few days and see if there are other approaches. – Peter Ellis Dec 03 '12 at 19:29
  • Oh, I'm trying to do what you've told me but I guess I don't have enough "reputation" to mark your answer as useful. I will try it again later! Thank you very much anyway! – Maria Dec 04 '12 at 14:18
  • @Maria, reputation limitations apply only to leaving comments (50) and voting up (15) or down (125), not to accepting an answer. – chl Jan 02 '13 at 10:00