Clustering of time series

Question

I have a set of almost 1600 time series spanning 2 years, which I want to group into clusters. Do you think this is possible using k-means? Which method do you advise me to use? Is this possible at all using SPSS?

Related (but no SPSS solution): Is it possible to do time-series clustering based on curve shape? — chl, Oct 10 '12 at 21:16
See this article for situations when k-means is suitable and also other similar questions. — sitems, Oct 10 '12 at 21:27
IMHO I wouldn't use SPSS for that task. I have used both SPSS and Matlab for clustering (I'm a novice in clustering, I have to say), and the difference in speed and flexibility between the two packages is very noticeable. Matlab is the better option, even though it is also a high-level programming language. From what I have read, R should be much faster, but I haven't tried it yet. Just my 2 cents. — Diego, Oct 10 '12 at 23:48

score 2 · Answer 1 · answered Oct 10 '12 at 23:37

k-means cannot use arbitrary distance functions. It is designed for Euclidean distance.

Euclidean distance, however, does not work well for high-dimensional data such as your time series (unless you have a really low sampling rate, say one value per month, i.e. 24 values per series).

For time series, you will probably want to use a time series distance. There are quite a lot of them, designed specifically for different kinds of time series. You really should look at these.
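To illustrate why a dedicated time series distance can matter, here is a minimal sketch comparing plain Euclidean distance with dynamic time warping (DTW), one commonly used time series distance. The function names and the toy series are made up for this example, and the DTW shown is the textbook dynamic program with an absolute-difference local cost and no window constraint; it is not tuned for real data.

```python
import math


def euclidean(a, b):
    """Point-by-point Euclidean distance; assumes equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    m = len(b)
    # prev holds the previous row of the cumulative-cost table.
    prev = [0.0] + [inf] * m
    for x in a:
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(x - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


# Hypothetical toy data: the same peak at two different times, and a flat line.
peak_early = [0, 0, 1, 3, 1, 0, 0, 0]
peak_late = [0, 0, 0, 0, 1, 3, 1, 0]
flat = [0, 0, 0, 0, 0, 0, 0, 0]

print("Euclidean early vs late:", euclidean(peak_early, peak_late))
print("Euclidean early vs flat:", euclidean(peak_early, flat))
print("DTW early vs late:", dtw(peak_early, peak_late))
print("DTW early vs flat:", dtw(peak_early, flat))
```

In this toy case Euclidean distance rates the early peak as closer to the flat line than to the shifted copy of itself, while DTW treats the two peaks as identical in shape. Whether that invariance to time shifts is desirable depends on what "similar" should mean for your series.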

They won't work with k-means, but there are various distance-based and density-based clustering algorithms (where density is usually defined by distance!) that you should try. However, I have no idea what SPSS supports. I don't know if it has any time series distances, either.
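As one example of a distance-based algorithm that accepts an arbitrary distance function, here is a rough sketch of k-medoids (chosen here for illustration; it is not named in the answer) using a simple alternating assign/update scheme. Everything in it is an assumption made for this example: the `k_medoids` helper, the farthest-first initialisation, the compact `dtw` function, and the toy data. Pure Python like this would be far too slow for 1600 real series; it only shows the idea.

```python
def dtw(a, b):
    """Textbook dynamic time warping distance with |x - y| local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            curr[j] = abs(x - b[j - 1]) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def k_medoids(series, k, dist, max_iter=100):
    """Cluster series around k medoids using any pairwise distance function."""
    n = len(series)
    # Precompute the symmetric pairwise distance matrix once.
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(series[i], series[j])

    def assign(medoids):
        # Each series gets the index of its nearest medoid.
        return [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]

    # Assumed initialisation: start at series 0, then repeatedly add the
    # series farthest from all medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(d[i][m] for m in medoids)))

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(min(members, key=lambda i: sum(d[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(medoids), medoids


# Hypothetical toy data: three shifted peaks and three shifted ramps.
series = [
    [0, 0, 1, 3, 1, 0, 0, 0],
    [0, 0, 0, 1, 3, 1, 0, 0],
    [0, 0, 0, 0, 1, 3, 1, 0],
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 0, 1, 2, 3, 4, 5, 6],
    [0, 0, 0, 1, 2, 3, 4, 5],
]
labels, medoids = k_medoids(series, 2, dtw)
print(labels, medoids)
```

Because the medoids are always actual members of the data set, no averaging of series is needed, which is what lets the distance function be swapped freely.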

Thank you for your help @Anony-Mouse. I actually have 130 weeks of samples. I'm a bit scared by the size of my data, but let's see if that works. I'll follow your advice. Thanks! — Maria, Oct 11 '12 at 10:58
Well, 1 measurement per week, or 1 measurement per second, that is what I'm trying to point out... — Has QUIT--Anony-Mousse, Oct 11 '12 at 11:54

score 1 · Answer 2 · answered Jun 01 '17 at 09:46

First of all, yes, you can use k-means to cluster those time series. The default implementation of k-means relies on Euclidean distance, but it can be modified to feed the algorithm a time series distance such as DTW (dynamic time warping).

Check here for more information: On Clustering Multimedia Time Series Data Using K-Means and Dynamic Time Warping.

Second, I don't think you can use SPSS for this purpose, but I do know that you can use Matlab; there are plenty of implementations of k-means and DTW available.
