I am trying to make a co-hashtag network using networkx in Python.
The nodes and edges extend beyond the frame/axes of the plot, and rescaling the node coordinates does not seem to help. Any help is appreciated!
A short description of the data:
high_weight:
0 1 count count_divided
434934 fakenews fakenews 58153 290.765
431143 fakenews enemyofthepeople 28553 142.765
462561 fakenews maga 24406 122.030
503028 fakenews trump2020 18401 92.005
435421 fakenews fakenewsmedia 15930 79.650
... ... ... ... ...
1331708 wwg1wga trump2020 1809 9.045
434967 fakenews fakenewsalert 1783 8.915
435987 fakenews fakepolls 1753 8.765
518845 fakenewscnn fakenews 1747 8.735
482247 fakenews qarmy 1730 8.650
...
text.hashtags:
1 [fakenews]
2 [fakenews]
4 [fakenews]
5 [fakenews, qanon, wwg1wga, greatawakening, psb...
6 [press, fakenews, decency, leadership, potus46...
...
16373658 [liberal, fakenews, liberal, fakenewsmedia]
16373664 [fakenews, hotwheels, diecast, diecastorpaper]
16373667 [fakenews]
16373674 [fakenews]
16373676 [fakenews]
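To make the inputs easier to reproduce, here is a minimal standard-library sketch of how the node sizes are derived from these two tables. The toy `edges` and `hashtags` values are hypothetical stand-ins loosely based on the excerpts above, not the real data. It also shows something I am unsure about: restricting sizes to tags in column 0 leaves tags that only appear in column 1 without a size.

```python
import collections

# Hypothetical toy inputs shaped like the excerpts above (not the real data)
edges = [
    ("fakenews", "fakenews"),
    ("fakenews", "enemyofthepeople"),
    ("fakenews", "maga"),
    ("wwg1wga", "trump2020"),
]
hashtags = [
    ["fakenews"],
    ["fakenews", "qanon", "wwg1wga"],
    ["liberal", "fakenews", "liberal", "fakenewsmedia"],
]

# Count how often each hashtag occurs across all tweets
count = collections.Counter(tag for tags in hashtags for tag in tags)

# Tags in the source column only, versus every tag that ends up as a node
sources = {source for source, _ in edges}
all_nodes = sources | {target for _, target in edges}

# Sizes restricted to the source column: target-only tags are missing
sizes_sources_only = {tag: count[tag] for tag in count.keys() & sources}

# Sizes for every node; a Counter returns 0 for tags it has never seen
sizes_all_nodes = {tag: count[tag] for tag in all_nodes}

print(sizes_sources_only)
print(sizes_all_nodes)
```

With the real data, `edges` would correspond to the first two columns of `high_weight` and `hashtags` to `text.hashtags`.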
Below is the code I'm using to create and plot the network:
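import collections
import os

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np

# Create network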
g_high_weight = nx.from_pandas_edgelist(high_weight, source=0, target=1, edge_attr='count_divided')
pos_new = nx.circular_layout(g_high_weight)
# Count number of tweets
t=list(text.hashtags)
flat_list = [item for sublist in t for item in sublist]
count = collections.Counter(flat_list)
# Taking only high weight nodes
nodelist_w_size = {key: count[key] for key in count.keys() & set(high_weight[0])}
# Adding attributes
nx.set_node_attributes(g_high_weight, values = nodelist_w_size, name='size')
# Edge weights
widths = nx.get_edge_attributes(g_high_weight, 'count_divided')
# Node size
node_size = nx.get_node_attributes(g_high_weight, 'size')
# Node list
nodelist = g_high_weight.nodes()
# Dictionary changes
# The network needs a list with nodes / edges and an array with values
# Dividing values by 50 to reduce node size
size_node_dict = {k:(float(v)/50) for k, v in nodelist_w_size.items()}
key_list_nodes = list(size_node_dict.keys())
size_node_array = np.array(list(size_node_dict.values()))
widths_dict = {k:(float(v)) for k, v in widths.items()}
widths_list = list(widths_dict.keys())
widths_node_array = np.array(list(widths_dict.values()))
plt.figure(figsize=(40,30))
# plt.axis("scale")
# Nodes and edges coordinates - this is what the network does not react to
pos_nodes_edges = {}
for node, coords in pos_new.items():
    pos_nodes_edges[node] = np.array([coords[0]/2, coords[1]/2])
print(pos_nodes_edges)
nx.draw_networkx_nodes(g_high_weight,
                       pos = pos_nodes_edges,
                       nodelist = key_list_nodes,
                       node_size = size_node_array,
                       node_color ='grey',
                       alpha = 0.8)
# Edges
nx.draw_networkx_edges(g_high_weight,
                       pos = pos_nodes_edges,
                       edgelist = widths_list,
                       width = widths_node_array,
                       edge_color='lightblue',
                       alpha=0.6)
# Labels
# Make offset of labels so that they are not in the middle of the node
pos_attrs = {}
for node, coords in pos_new.items():
    pos_attrs[node] = (coords[0]/2, coords[1]/2 + 0.02)
# Add labels to network
nx.draw_networkx_labels(g_high_weight,
                        pos_attrs,
                        labels=dict(zip(nodelist, nodelist)),
                        font_color='black',
                        font_size = 18
                        )
plt.tight_layout()
plt.savefig(os.path.join("figs",f"{name}_network.png"),dpi=300, bbox_inches="tight")
print(f'[INFO] network visualization saved in figs as {name}_network.png')
plt.box(False)
plt.show()
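A possible explanation I have not ruled out, sketched under the assumption that Matplotlib's default autoscaling is in effect: the axis limits are fitted to the node positions, so halving every coordinate also halves the limits and the picture looks the same. Node sizes are given in points squared and, as far as I can tell, do not enter the limits in recent Matplotlib versions, so large nodes can spill over the frame. The toy graph, the node size of 3000, and the margin of 0.25 are arbitrary illustration values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical toy graph standing in for g_high_weight
g = nx.Graph()
g.add_edge("fakenews", "maga")
g.add_edge("fakenews", "trump2020")
g.add_edge("maga", "trump2020")

# Original layout and the same layout with every coordinate halved
pos = nx.circular_layout(g)
pos_half = {node: xy / 2 for node, xy in pos.items()}

fig, (ax_full, ax_half) = plt.subplots(1, 2, figsize=(8, 4))
nx.draw_networkx_nodes(g, pos, node_size=3000, ax=ax_full)
nx.draw_networkx_nodes(g, pos_half, node_size=3000, ax=ax_half)

# The autoscaled x-range shrinks together with the coordinates,
# so the nodes keep the same position relative to the frame
full_span = ax_full.get_xlim()[1] - ax_full.get_xlim()[0]
half_span = ax_half.get_xlim()[1] - ax_half.get_xlim()[0]

# Widening the margins adds padding around the outermost node centres
ax_half.margins(0.25)
padded_span = ax_half.get_xlim()[1] - ax_half.get_xlim()[0]

print(full_span, half_span, padded_span)
plt.close(fig)
```

If this is indeed the cause, widening the margins (or setting the limits explicitly with `set_xlim` and `set_ylim`) rather than rescaling `pos` might be the thing to try.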
And the resulting plot can be found here: