Each data point in my data set consists of multiple compositions (I'm not sure if this is the correct term for this kind of data). For example:
Data point $X_i = \{a_1, a_2, a_3, b_1, b_2, b_3, c_1, c_2, c_3\}$
$a_1 + a_2 + a_3 = 1,\ b_1 + b_2 + b_3 = 1,\ c_1 + c_2 + c_3 = 1$
$a_j, b_j, c_j > 0\ \forall j=1..3$
I want to do hierarchical clustering on this data, but I don't know how to define the distance matrix for it. The literature on compositional data seems to deal with single-composition data only (e.g., the compositions package in R).
Could anyone suggest a solution, please?
EDIT: Some additional details on my case:
I'm analyzing the behavior of optimization algorithms on a set of instances. For every pair (algorithm A, instance I), a composition of three components (as ratios) is generated. This composition represents the behavior of algorithm A on instance I. Now I would like to cluster the algorithms based on their behavior across the instance set, using hierarchical clustering with the average linkage method.
May I just calculate the distance matrix for each composition based on Aitchison's distance, and sum them up for every algorithm pair?
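To make the idea concrete, here is a minimal sketch of what I have in mind, not a validated method. It assumes the data are stored as an array of shape (algorithms, instances, parts) with strictly positive parts, computes the Aitchison distance (Euclidean distance between centered log-ratio vectors) for each instance, and sums over instances. The function names `clr` and `summed_aitchison` and the random Dirichlet data are placeholders of my own, only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform


def clr(x):
    """Centered log-ratio transform along the last axis (parts must be > 0)."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)


def summed_aitchison(data):
    """Pairwise distance between algorithms.

    data has shape (n_algorithms, n_instances, n_parts); each
    data[a, i, :] is one composition. For every pair of algorithms,
    the Aitchison distance is computed per instance and the
    per-instance distances are summed.
    """
    z = clr(data)
    # diff[a, b, i, :] = clr(data[a, i]) - clr(data[b, i])
    diff = z[:, None, :, :] - z[None, :, :, :]
    per_instance = np.sqrt((diff ** 2).sum(axis=-1))  # (n_alg, n_alg, n_inst)
    return per_instance.sum(axis=-1)                   # (n_alg, n_alg)


# Illustrative random data: 5 algorithms, 3 instances, 3 parts each.
rng = np.random.default_rng(0)
data = rng.dirichlet(alpha=[2.0, 2.0, 2.0], size=(5, 3))

D = summed_aitchison(data)
# linkage expects a condensed distance vector, hence squareform.
Z = linkage(squareform(D), method="average")
```

Since a sum of metrics is itself a metric, `D` is a valid distance matrix. An alternative would be to take the square root of the summed squared per-instance distances, which is the same as the Euclidean distance on the concatenated clr coordinates; I don't know which of the two is more appropriate here.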
$a_1 + a_2 + a_3 = 1$: one of the 3 is redundant. You are left with a vector of length 6 (say, $a_1, a_2, b_1, b_2, c_1, c_2$) for each case. Now compute, between cases, any distance you see as reasonable. Euclidean seems OK to me. Or the dot product, if you want a similarity. If you don't like the suggestion, please say why. – ttnphns Nov 10 '15 at 08:16
… scale in compositions package) each composition before calculating the distances. – marc1s Nov 10 '15 at 12:24