10

I am wondering if there are algorithms to cluster graphs; what I meant is to cluster many graphs, not cluster the nodes in a graph.

For example, we have N graphs, G1, G2, G3, .....GN. Then we can identify [G1, G2, G3] have high similarity in topology, [G4, G5] another one, and so on.

TripleH
  • 377
  • 2
    Is there something inadequate about extracting features from the graphs and clustering on those numbers? – Dave Dec 31 '22 at 14:48
  • @Dave we will use features like node labels. – TripleH Jan 05 '23 at 03:06
  • If you opt for clustering with a pre-calculated distance matrix (see here, ht @Sextus Empiricus) you may use various measures of graph distances. Many of them rely on having label-matched graphs or at least the same number of nodes. If your graphs are unlabeled and of different sizes, you can use the NetSimile method (Berlingerio et al. 2012) that models graphs as distributions of local characteristics with which you can calculate distances. I've done a similar thing in a paper. – psyguy Jan 06 '23 at 21:16

1 Answers1

12

The main problem here seems to me to be about defining and finding the (dis)similarity between different graphs.

  • The 'graph edit distance' (defining distance in terms of number of operations neccesary to convert one graph into the other) is a common way that has been described here on stack overflow.

  • In section 4.2 of "Exact and inexact graph matching: methodology and applications" you find alternative methods for inexact graph matching.

    These are: artificial neural networks, relaxation labeling, spectral methods and kernel methods.

    Riesen, Kaspar, Xiaoyi Jiang, and Horst Bunke. "Exact and inexact graph matching: Methodology and applications." Managing and Mining Graph Data (2010): 217-247.

Then, after solving that problem, to perform clustering you can use any clustering method that uses a distance matrix.