Are there algorithms to cluster Graphs, not just cluster nodes in a graph?

Question

I am wondering whether there are algorithms to cluster graphs. What I mean is clustering many graphs, not clustering the nodes within a single graph.

For example, we have N graphs G1, G2, G3, ..., GN. We would then like to identify that [G1, G2, G3] have high similarity in topology, that [G4, G5] form another group, and so on.

Is there something inadequate about extracting features from the graphs and clustering on those numbers? — Dave, Dec 31 '22 at 14:48
If you opt for clustering with a pre-calculated distance matrix (see here, ht @Sextus Empiricus) you may use various measures of graph distances. Many of them rely on having label-matched graphs or at least the same number of nodes. If your graphs are unlabeled and of different sizes, you can use the NetSimile method (Berlingerio et al. 2012) that models graphs as distributions of local characteristics with which you can calculate distances. I've done a similar thing in a paper. — psyguy, Jan 06 '23 at 21:16

Sextus Empiricus · Accepted Answer · 2022-12-31T14:40:00.957

12

The main problem here, as I see it, is defining and computing the (dis)similarity between different graphs.

The 'graph edit distance' (the distance defined as the number of operations necessary to convert one graph into the other) is a common choice that has been described here on Stack Overflow.
In section 4.2 of "Exact and inexact graph matching: methodology and applications" you find alternative methods for inexact graph matching.

These are: artificial neural networks, relaxation labeling, spectral methods and kernel methods.

Riesen, Kaspar, Xiaoyi Jiang, and Horst Bunke. "Exact and inexact graph matching: Methodology and applications." Managing and Mining Graph Data (2010): 217-247.

Then, after solving that problem, to perform clustering you can use any clustering method that uses a distance matrix.

See this question: Clustering with a distance matrix
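To make the two steps concrete, here is a minimal sketch using only the Python standard library. It is not the method of the cited paper. It assumes label-matched graphs, so that the edit distance restricted to edge insertions and deletions reduces to the size of the symmetric difference of the edge sets, and it uses a naive single-linkage clustering with a cutoff. All function names and the toy graphs are made up for illustration.

```python
from itertools import combinations

def edge_set(edges):
    """Normalize an undirected edge list into a set of frozensets."""
    return {frozenset(e) for e in edges}

def edge_edit_distance(g1, g2):
    """Edge insertions/deletions needed to turn g1 into g2.

    Assumption: both graphs use the same node labels, so no node
    matching is required (this is NOT general graph edit distance).
    """
    return len(g1 ^ g2)

def distance_matrix(graphs):
    """Symmetric N x N matrix of pairwise distances."""
    n = len(graphs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = edge_edit_distance(graphs[i], graphs[j])
    return d

def single_linkage_clusters(d, threshold):
    """Merge the two closest clusters until their distance exceeds threshold."""
    clusters = [{i} for i in range(len(d))]
    while len(clusters) > 1:
        dist, x, y = min(
            (min(d[a][b] for a in ca for b in cb), x, y)
            for (x, ca), (y, cb) in combinations(enumerate(clusters), 2)
        )
        if dist > threshold:
            break
        clusters[x] |= clusters[y]
        del clusters[y]  # y > x, so index x stays valid
    return sorted(sorted(c) for c in clusters)

# Toy data: three path-like graphs on nodes 0..3, two star-like graphs around node 4.
graphs = [
    edge_set([(0, 1), (1, 2), (2, 3)]),
    edge_set([(0, 1), (1, 2), (2, 3), (0, 3)]),
    edge_set([(0, 1), (1, 2), (2, 3), (0, 2)]),
    edge_set([(4, 0), (4, 1), (4, 2), (4, 3)]),
    edge_set([(4, 0), (4, 1), (4, 2)]),
]
clusters = single_linkage_clusters(distance_matrix(graphs), threshold=2)
print(clusters)
```

For unlabeled graphs the distance function would have to be replaced by a proper (and much more expensive) graph edit distance or one of the inexact matching methods mentioned above; the clustering step only needs the resulting matrix.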

edited Dec 31 '22 at 14:40

answered Dec 31 '22 at 13:08

Sextus Empiricus


Thanks for your answer. Are you suggesting that, if we have N graphs, we can compute the graph edit distance for every pair of graphs to form an NxN distance matrix, and then apply one of the clustering methods you listed to that distance matrix? – TripleH Jan 01 '23 at 14:42
@TripleH yes, we make a matrix with all pairwise distances, and based on that matrix we perform the clustering. Now that you state it like that, it sounds to me that this can require a lot of computation. Possibly the NxN matrix can be made sparse as an alternative. – Sextus Empiricus Jan 01 '23 at 15:11
Thanks again. I found an interesting suggestion in earlier discussions (How to cluster graphs with same topology, but different weights on the vertices? and Algorithms for Graphs Clustering): we can find an embedding vector for each graph and then cluster the graphs using the Euclidean distance between those vectors. Have you tried this idea? – TripleH Jan 02 '23 at 08:17
@TripleH I have never tried this idea with clustering graphs but clustering based on a distance matrix is very common and I used hierarchical clustering along with the correlation matrix that goes along with PCA. – Sextus Empiricus Jan 02 '23 at 10:29
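As a rough illustration of the embedding idea raised in the comments, the sketch below maps each graph to a fixed-length vector and computes Euclidean distances between the vectors. The hand-picked degree statistics are only a crude stand-in for a real graph embedding (they are not NetSimile and not a learned embedding), and the function names and toy graphs are hypothetical. The appeal is that graphs of different sizes and without matching labels become comparable.

```python
import math
from collections import Counter
from statistics import pstdev

def degree_features(edges):
    """Map an undirected edge list to a small vector of degree statistics.

    Assumptions: no self-loops, no isolated nodes, at least one edge.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    degrees = list(deg.values())
    n, m = len(degrees), len(edges)
    if n < 2:
        raise ValueError("need at least one edge between two distinct nodes")
    density = 2 * m / (n * (n - 1))
    return [density, pstdev(degrees), max(degrees) / (n - 1)]

def euclidean_distance_matrix(vectors):
    """All pairwise Euclidean distances between the feature vectors."""
    return [[math.dist(a, b) for b in vectors] for a in vectors]

def path(n):
    """Path graph on nodes 0..n-1."""
    return [(i, i + 1) for i in range(n - 1)]

def star(n):
    """Star graph with center 0 and n-1 leaves."""
    return [(0, i) for i in range(1, n)]

# Two paths and two stars of different sizes.
graphs = [path(5), path(6), star(5), star(6)]
D = euclidean_distance_matrix([degree_features(g) for g in graphs])

# Paths end up closer to each other than to stars, and vice versa.
print(D[0][1] < D[0][2], D[2][3] < D[2][0])
```

The matrix `D` can then be passed to any clustering method that accepts a distance matrix, or the vectors themselves can be fed to a method such as k-means.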
