When doing sequence analysis using a package such as TraMineR, one can calculate a clustering based on Optimal Matching (OM) distances, and then plot it as a tree. I use agnes to do it, roughly like this:
library(TraMineR)  # seqdef(), seqsubm(), seqdist()
library(cluster)   # agnes()
sequences.sts <- seqdef(sequences.sts)
ccost <- seqsubm(sequences.sts, method = "CONSTANT", cval = 2, with.missing = TRUE)
sequences.OM <- seqdist(sequences.sts, method = "OM", sm = ccost, with.missing = TRUE)
clusterward <- agnes(sequences.OM, diss = TRUE, method = "ward")
plot(clusterward, which.plots = 2)  # which.plots = 2 draws the dendrogram
This gives me a plot of the cluster diagram, and it also gives me an agglomerative coefficient. However, ?agnes.object notes that the agglomerative coefficient (ac) grows as the dataset grows, and therefore it is unsuitable as a way of comparing datasets of different size.
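To illustrate the size dependence, here is a minimal sketch that reads the coefficient from the `ac` component of an agnes object for unstructured random data of several sizes. The helper `ac_for_size`, the sample sizes, and the use of uniform random points are all assumptions made for illustration; they are not part of my actual analysis.

```r
library(cluster)  # provides agnes()

set.seed(1)

# Hypothetical helper: agglomerative coefficient for n random 2-D points
# that have no real cluster structure.
ac_for_size <- function(n) {
  x <- matrix(runif(n * 2), ncol = 2)
  agnes(dist(x), diss = TRUE, method = "ward")$ac
}

sizes <- c(50, 200, 800)
ac_by_size <- sapply(sizes, ac_for_size)
names(ac_by_size) <- sizes
print(ac_by_size)
```

If the coefficient differs noticeably across sizes even though the data are equally unstructured, that would be the effect ?agnes.object warns about.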
Is there any other way of comparing the overall "degree of clustering", or overall "degree of alignment" in a sequence dataset that allows us to reliably compare datasets of different sizes?
dissassoc in TraMineR. The test is described here: Studer, M., Ritschard, G., Gabadinho, A. & Müller, N.S. (2011), "Discrepancy Analysis of State Sequences", Sociological Methods and Research. Vol. 40(3), pp. 471-510. – Matthias Studer Dec 07 '13 at 10:38
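A minimal sketch of how `dissassoc` might be called, using the `mvad` example data shipped with TraMineR rather than the data from the question. The choice of `gcse5eq` as the grouping variable, the column range 17:86 (the range used in TraMineR's own examples), and the small number of permutations `R = 100` are assumptions made only to keep the example short; this is not necessarily how the comment's author would set it up.

```r
library(TraMineR)

data(mvad)  # example dataset shipped with TraMineR

# Columns 17:86 hold monthly states (range taken from TraMineR's examples).
mvad.seq <- seqdef(mvad, 17:86)
ccost <- seqsubm(mvad.seq, method = "CONSTANT", cval = 2)
mvad.om <- seqdist(mvad.seq, method = "OM", sm = ccost)

# Discrepancy analysis: how much of the discrepancy between sequences is
# associated with the grouping variable, assessed by a permutation test.
assoc <- dissassoc(mvad.om, group = mvad$gcse5eq, R = 100)
print(assoc$stat)
```

To compare two datasets, one possibility (again an assumption, not something stated in the comment) would be to pool their sequences, compute one distance matrix, and pass dataset membership as `group`.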