How to improve network graph visualization?

sha*_*105 8 python social-networking networkx graph-visualization

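I am trying to use networkx in Python to turn an adjacency matrix into a graph. My "weighted" graph has about 8000 nodes and 14000 edges. Is there a good layout, or another package or tool, that would make the graph look better? I would like nodes joined by higher-weight edges to be placed closer together, so that I can analyze clusters of nodes.

I have already tried all the layouts provided in the networkx documentation. I also tried Gephi, but I am still not satisfied with the result. This is how it looks in networkx. It shows all the clusters, but it looks a bit terrifying to anyone with trypophobia:

[image]

Here is the code I used to draw the graph: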

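import matplotlib.pyplot as plt
import networkx as nx

# mat is the adjacency (weight) matrix and a holds the node names;
# both are defined elsewhere.
G = nx.Graph()
for i in range(10000):
    for j in range(10000):
        # Keep only edges whose weight exceeds the threshold
        if mat[i][j] > 10:
            G.add_edge(a[i], a[j], weight=mat[i][j])
pos = nx.nx_pydot.graphviz_layout(G)
plt.figure(figsize=(8, 8))
plt.axis('off')
nx.draw_networkx(G, pos=pos, with_labels=False, node_size=25, edgecolors='black', edge_color='b')
plt.show()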

vur*_*mux 10

Let's dive a bit into large graph visualization:

In the context of your question you have three ways to visualize a graph:

  1. Draw the whole graph on one screen
  2. Draw the graph on a surface that is larger than one screen
  3. Partially draw the graph or reduce the number of drawn elements

We will review these ways one by one:

1. Compress the incompressible

You have 8000 nodes and 14000 edges in your graph. Let's imagine that you only need to draw the nodes, without edges. On an average Full HD display you will have:

1920 * 1080 / 8000 = 259 pixels per node. That is:

sqrt(259) ≈ 16

A 16×16 px node if you tile the whole display with nodes. If you want to draw node labels, you will have:

16 - 1 * 2 (outer node borders) - 1 * 2 (inner node borders) = 12×12 px square. The smallest possible font (I will not talk about esoteric fonts) has 3×3(+1) size, so each character takes a 4×4 px cell and you can fit no more than 3 × 3 = 9 characters per node.

It looks like this:

[image]

And we still haven't drawn the graph edges! If we want to draw them too, we will have:

1920 * 1080 / (8000 + 14000) = 94 pixels per graph element, and:

sqrt(94) ≈ 9.7, which means roughly 9×9 px per element, so it is nearly impossible to fit even simple labels on the nodes.
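
The pixel budget above is easy to recompute for other screen or graph sizes. A minimal sketch (the helper name `pixel_budget` is made up for this illustration, and edges are counted as elements of the same size as nodes, as in the estimate above):

```python
import math


def pixel_budget(width, height, n_elements):
    """Return (area, side): pixels available per element and the side
    length of the square cell that this area corresponds to."""
    area = width * height / n_elements
    return area, math.sqrt(area)


# Full HD screen, nodes only
nodes_area, nodes_side = pixel_budget(1920, 1080, 8000)
# Full HD screen, nodes and edges counted as equal-sized elements
all_area, all_side = pixel_budget(1920, 1080, 8000 + 14000)

print(f"nodes only:    {nodes_area:.0f} px per node, cell side {nodes_side:.1f} px")
print(f"nodes + edges: {all_area:.0f} px per element, cell side {all_side:.1f} px")
```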

I think it is obvious now that the whole graph on a single screen will always look like a horrifying mess. So drawing your graph this way is not an option.

2. Expand the nonexpandable

If we can't draw the graph on one screen, we can rack our brains and invent a way out: draw it on several screens! We can do it in two ways:

  • Using matplotlib
  • Using graphviz

In the matplotlib case we create a very large figure (with the figsize argument) and then export the graph to a picture (with savefig):

import networkx as nx
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(40, 40))  # very large canvas, size in inches
G = nx.fast_gnp_random_graph(300, 0.02, seed=1337)
nx.draw(G, node_size=30)
plt.axis('equal')
fig.savefig('waka.svg')  # save before show(), which may close the figure
plt.show()

This gives us a big vector picture (here is a small part of it):

[image]

In this case we can use any graph layout available in networkx.
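
Since the question asks for heavier edges to pull nodes closer together, one layout worth trying on the large canvas is `nx.spring_layout`, which treats the edge `weight` attribute as attraction strength (larger means a stronger pull). A rough sketch, assuming random toy weights in place of the real matrix; the output file name is arbitrary:

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import networkx as nx

# Toy stand-in for the real data: a random graph with random edge weights
G = nx.fast_gnp_random_graph(300, 0.02, seed=1337)
rng = random.Random(1337)
for u, v in G.edges():
    G[u][v]["weight"] = rng.randint(11, 50)

# Force-directed layout; the "weight" attribute scales the attraction
pos = nx.spring_layout(G, weight="weight", seed=1337)

fig = plt.figure(figsize=(40, 40))
nx.draw_networkx_nodes(G, pos, node_size=30)
nx.draw_networkx_edges(G, pos, alpha=0.3)
plt.axis("off")
fig.savefig("weighted_layout.svg")
plt.close(fig)
```

Whether this separates the clusters well enough on the real 8000-node graph is something to check by eye; the `k` and `iterations` parameters of `spring_layout` are the usual knobs to experiment with.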

Another way is to let the external Graphviz library draw the graph. Honestly, in this case networkx just converts the graph to a .dot file and hands it to Graphviz. The main disadvantage is that you can't control from networkx how Graphviz will draw your graph. But it has pretty good layouts, so the result will be acceptable.

Use this Python code to generate a .dot file:

import networkx as nx

G = nx.fast_gnp_random_graph(300, 0.02, seed=1337)
# nx_agraph needs the pygraphviz package
nx.nx_agraph.write_dot(G, './waka.dot')

and then use Graphviz:

dot -Tsvg waka.dot >waka.svg

or for undirected graphs:

neato -Tsvg waka.dot -Goverlap=false >waka.svg

The resulting picture will look like this (again, this is only a small part of the whole picture):

[image]

Graphviz has a huge number of options, so you can make your graph look nearly any way you wish.
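
One option relevant to the question is the `len` edge attribute, which `neato` and `fdp` read as the preferred edge length (the `dot` engine ignores it). The sketch below does not use any networkx API: `edges_to_dot` is a hypothetical helper that writes DOT text by hand and maps heavier edges to shorter preferred lengths. The linear mapping, the 0.3 to 3.0 range, and the file name are arbitrary choices for illustration.

```python
def edges_to_dot(edges, min_len=0.3, max_len=3.0):
    """Return DOT source for an undirected graph in which heavier
    edges get a shorter preferred length (the `len` attribute)."""
    weights = [w for _, _, w in edges]
    w_min, w_max = min(weights), max(weights)
    span = (w_max - w_min) or 1  # avoid dividing by zero if all weights match
    lines = ["graph G {", "  overlap=false;", "  node [shape=point];"]
    for u, v, w in edges:
        # Linear map: lightest edge -> max_len, heaviest edge -> min_len
        length = max_len - (w - w_min) / span * (max_len - min_len)
        lines.append(f'  "{u}" -- "{v}" [len={length:.2f}];')
    lines.append("}")
    return "\n".join(lines)


# Toy edge list of (node, node, weight) tuples
edges = [("a", "b", 11), ("b", "c", 40), ("a", "c", 25)]
dot_source = edges_to_dot(edges)

with open("weighted.dot", "w") as f:
    f.write(dot_source)
```

The file can then be rendered with the same command as before, for example `neato -Tsvg weighted.dot >weighted.svg`.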

3. Cut the uncuttable

In your graph you have something like a giant connected component (GCC) and many small subgraphs. The whole picture is about 580×580 in size and the GCC is about 290×290, so the GCC uses:

(290 × 290) ÷ (580 × 580) × 100 = 25% of your canvas. If you keep only the GCC, you will have 4 times more space to draw it! The other connected components can then be drawn in a separate image.

You can keep only the GCC with code like this:

import networkx as nx

G = nx.fast_gnp_random_graph(300, 0.01, seed=1337)
# Node set of the largest connected component
gcc = max(nx.connected_components(G), key=len)
H = G.subgraph(gcc)
nx.draw(G, node_size=30, node_color='red')

As written, this draws the original graph:

[image]

And with this last line instead:

nx.draw(H, node_size=30, node_color='red')

it draws only the GCC:

[image]
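
To put the remaining components on their own image, as suggested above, the same split can be extended. A small sketch on the same toy graph; the output file names are arbitrary:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import networkx as nx

G = nx.fast_gnp_random_graph(300, 0.01, seed=1337)

gcc_nodes = max(nx.connected_components(G), key=len)
H = G.subgraph(gcc_nodes)            # giant connected component
R = G.subgraph(set(G) - gcc_nodes)   # everything outside the GCC

# One image per part, so each gets the full canvas
for name, graph in (("gcc", H), ("rest", R)):
    fig = plt.figure(figsize=(10, 10))
    nx.draw(graph, node_size=30, node_color="red")
    fig.savefig(f"{name}.svg")
    plt.close(fig)

print(f"GCC: {H.number_of_nodes()} nodes, rest: {R.number_of_nodes()} nodes")
```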


P.S.

What else you can try with your graph:

  • Cluster nodes
  • Play with node size (PageRank etc.)
  • Play with edge color
  • Don't draw edges but keep the layout
  • WEEP (in case my answer didn't help you even a bit)
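
A few of these ideas can be combined in one drawing. The sketch below is only one possible take: it colors nodes by community using `greedy_modularity_communities` (one of several community detection functions in networkx), sizes nodes by degree as a simple stand-in for PageRank-style scores, and leaves the edges undrawn while still using them for the layout. The scaling constants and file name are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for interactive use
import matplotlib.pyplot as plt
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.fast_gnp_random_graph(300, 0.02, seed=1337)

# Cluster nodes: one integer label per detected community
communities = greedy_modularity_communities(G)
label = {node: i for i, nodes in enumerate(communities) for node in nodes}

# Node size grows with degree (stand-in for a PageRank-style score)
sizes = [20 + 15 * G.degree(n) for n in G.nodes()]
colors = [label[n] for n in G.nodes()]

# The layout still uses the edges even though they are not drawn
pos = nx.spring_layout(G, seed=1337)

fig = plt.figure(figsize=(12, 12))
nx.draw_networkx_nodes(G, pos, node_size=sizes, node_color=colors, cmap=plt.cm.tab20)
plt.axis("off")
fig.savefig("clusters.svg")
plt.close(fig)
```

`nx.pagerank(G)` returns a dict of per-node scores that could replace the degree when computing `sizes`.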