
I have two columns in a dataframe that I need to convert to a bipartite graph: the entries of the columns should be the vertices, with one edge per row joining the two vertices in that row. I am having trouble doing that.

My data looks like this:

Col1        Col2
9d7051e2    da48d749
611cebdb    93ef5eb4
758f1c7b    6acae826
d09360ac    a33fe922

I tried two methods to convert this dataframe to a bipartite graph:

Method #1

import networkx as nx
from networkx.algorithms import bipartite

G = nx.from_pandas_edgelist(df_train, source='Donor ID', target='Project ID',
                            edge_attr=True, create_using=None)

Method #2

G = nx.Graph()
# Tag each side with the conventional 'bipartite' node attribute.
G.add_nodes_from(df_train['Donor ID'], bipartite=0)
G.add_nodes_from(df_train['Project ID'], bipartite=1)
# One edge per row, joining that row's donor and project.
G.add_edges_from(zip(df_train['Donor ID'], df_train['Project ID']))

In either case, bipartite.is_bipartite(G) returns False.

I am not sure what I am missing here. Any help would be appreciated.

A few things I tried in order to resolve the issue:

  1. Found duplicate records (same Col1 + Col2 pair) and removed them.
  2. Made sure there are no missing values in either column, so that every record yields an edge.
  3. When I tried a subset of the 1 million records, it worked. So, as I understand it, the cause is some faulty data, but finding and fixing all the faulty records is the challenge.
  4. The data type of both columns is object. I tried to convert them to string but was not able to.

None of the above helped.

Please tell me what I am missing here, or how I can debug the issue further.
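One check that may help with debugging: a graph built with one edge per row is guaranteed to be bipartite only if no ID occurs in both columns. The sketch below looks for such shared IDs and for rows whose two entries are identical. The toy dataframe is hypothetical; only the column names 'Donor ID' and 'Project ID' come from the code above.

```python
import pandas as pd

# Hypothetical toy data; the real df_train is assumed to have the same two columns.
df_train = pd.DataFrame({
    "Donor ID":   ["a", "b", "c", "x"],
    "Project ID": ["b", "c", "a", "y"],
})

donors = set(df_train["Donor ID"])
projects = set(df_train["Project ID"])

# IDs that occur in both columns: each one becomes a single node on "both sides".
shared_ids = donors & projects

# Rows touching at least one shared ID are the ones worth inspecting.
suspect_rows = df_train[
    df_train["Donor ID"].isin(shared_ids) | df_train["Project ID"].isin(shared_ids)
]

# Rows whose two entries are identical would create self-loops.
self_loops = df_train[df_train["Donor ID"] == df_train["Project ID"]]

print(sorted(shared_ids))
print(suspect_rows)
print(len(self_loops))
```

An empty shared_ids set means the two columns are disjoint, and the cause would have to lie elsewhere. A non-empty set does not by itself prove the graph is non-bipartite, but it is the only way an odd cycle can arise here.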

Ben Reiniger
Sanguine
  • The only thing I can think of that could be going wrong is that some vertices appear in both columns, and manage to form an odd cycle. What exactly do you mean by 1) remove duplicate records? – Ben Reiniger Feb 16 '19 at 03:36
  • @BenReiniger Sorry, I forgot to update the post. I found the issue: you are correct, there were many records where a ProjectId also appeared in DonorId, which caused the loop and made isBipartite false. – Sanguine Feb 17 '19 at 04:11
  • @sean owen : is this question unclear?! – Kasra Manshaei Feb 19 '19 at 10:47
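The cause identified in the comments can be reproduced on a small example, together with one possible workaround. This is a sketch, not the asker's actual fix: it assumes that a donor ID and a project ID with the same value refer to different entities, and tags each ID with its column so that the two node sets cannot overlap. The toy data and the 'donor' / 'project' tags are hypothetical.

```python
import networkx as nx
import pandas as pd
from networkx.algorithms import bipartite

# Hypothetical toy data in which every ID occurs in both columns.
df_train = pd.DataFrame({
    "Donor ID":   ["a", "b", "c"],
    "Project ID": ["b", "c", "a"],
})

# Built directly, the shared IDs form the triangle a-b-c, which is an odd cycle.
G_raw = nx.from_pandas_edgelist(df_train, source="Donor ID", target="Project ID")

# Tag each ID with its column so donor "a" and project "a" are distinct nodes.
donor_nodes = [("donor", d) for d in df_train["Donor ID"]]
project_nodes = [("project", p) for p in df_train["Project ID"]]

G = nx.Graph()
G.add_nodes_from(donor_nodes, bipartite=0)
G.add_nodes_from(project_nodes, bipartite=1)
G.add_edges_from(zip(donor_nodes, project_nodes))  # one edge per row

print(bipartite.is_bipartite(G_raw))  # False
print(bipartite.is_bipartite(G))      # True
```

If a shared ID really does denote the same entity in both columns, the data is not bipartite to begin with, and the offending rows would need to be corrected or dropped instead.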

0 Answers