Finding connected components with NetworkX

5 python dictionary graph nodes networkx

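I need to find the connected nodes in an undirected, weighted graph. I did look for suggestions here, but none of the existing answers address my problem. The node pairs also happen to be connected to their neighbors, and each pair spends some time (in seconds) connected.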

For example:

Node  Node time
A      B    34
A      B    56
A      C    09
A      D    5464
A      C    456
C      B    36
C      A    345
B      C    346

So overall, A, B, and C are connected twice:

Nodes   connected  time
[A B C]    1       34+09+36 = 79
[A B C]    1       56+345+346 = 747

The expected output is:

Nodes  connected  time 
[A B C]    2       826

And

Node  connected  time
[A B]   2         90
[B C]   2         382
[A C]   2         354

Code:

import networkx as nx
import numpy as np
from collections import defaultdict

count = defaultdict(int)
time = defaultdict(float)

data = np.loadtxt('USC_Test.txt')

for line in data:
    edge_list = [(line[0], line[1])]
    G= nx.Graph()
    G.add_edges_from(edge_list)
    components = nx.connected_components(G)
    count['components'] += 1
    time['components'] += float(line[2])

print components, count['components'], time['components']

Input:

5454 5070 2755.0
5070 4391 2935.0
1158 305  1.0
5045 3140 48767.0
4921 3140 58405.0
5372 2684 460.0
1885 1158 351.0
1349 1174 6375.0
1980 1174 650.0
1980 1349 650.0
4821 2684 469.0
4821 937  459.0
2684 937  318.0
1980 606  390.0
1349 606  750.0
1174 606  750.0
5045 3545 8133.0
4921 3545 8133.0
3545 3140 8133.0
5045 4243 14863.0
4921 4243 14863.0
4243 3545 8013.0
4243 3140 14863.0
4821 4376 5471.0
4376 937  136.0
2613 968  435.0
5372 937  83.0

Wrong output

The output I get is wrong:

Last_node_pair  total_count_of_line  total_time  of Entire input data

What I should be getting instead:

[5045 3140 4921]  [number_of_times_same_components_connected]   [total_time_components_connected]

Zar*_*dus 4

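There are a few problems here:

  1. You recreate the graph on every iteration, so the graph only ever contains a single edge.
  2. You use the literal string 'components' rather than the components variable as the key, so you only ever store and display that single value in the result dictionaries.
  3. Finally, you print the results only once, at the end. At that point the components variable holds whatever was last assigned to it inside the loop, and the count and time you print are totals over all lines of the input because of problem #2.

Here's something that should work. Out of laziness, I make two passes over the data.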
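
Problem #2 can be seen in isolation with a minimal sketch. The rows below are made-up sample data (not the asker's file), and the dictionary names are only illustrative. With a literal string key, every row falls into one bucket; with a key derived from the data, each distinct pair gets its own bucket.

```python
from collections import defaultdict

# Hypothetical sample rows: (node, node, seconds).
rows = [("A", "B", 34.0), ("A", "B", 56.0), ("C", "D", 9.0)]

# Literal key: every row lands in the same bucket.
literal_count = defaultdict(int)
literal_time = defaultdict(float)
for a, b, seconds in rows:
    literal_count["components"] += 1
    literal_time["components"] += seconds

# Key built from the data: one bucket per distinct pair.
pair_count = defaultdict(int)
pair_time = defaultdict(float)
for a, b, seconds in rows:
    key = (a, b)
    pair_count[key] += 1
    pair_time[key] += seconds

print(dict(literal_count))  # a single 'components' entry covering all rows
print(dict(pair_count))     # one entry per pair
print(dict(pair_time))
```

The same idea applies to components: the key has to identify the component (for example, a tuple of its nodes), not be a fixed string.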

import networkx as nx
import numpy as np
from collections import defaultdict

data = np.loadtxt('USC_Test.txt')

# First pass: build a single graph containing every edge.
G = nx.Graph()
for a, b, t in data:
    G.add_edge(int(a), int(b))

# Map each node to the tuple of nodes in its connected component.
component_map = {}
for nodes in nx.connected_components(G):
    key = tuple(sorted(nodes))
    for node in nodes:
        component_map[node] = key

# Each entry is [count, total_time].
component_stats = defaultdict(lambda: [0, 0.0])
edge_stats = defaultdict(lambda: [0, 0.0])

# Second pass: accumulate counts and times per component and per edge.
for a, b, t in data:
    a, b = int(a), int(b)
    component_stats[component_map[a]][0] += 1
    component_stats[component_map[a]][1] += float(t)

    # Per-edge stats are kept only for components with more than two nodes.
    if len(component_map[a]) > 2:
        edge_stats[(a, b)][0] += 1
        edge_stats[(a, b)][1] += float(t)

for nodes, (count, total) in component_stats.items():
    print(sorted(nodes), count, total)

print()

for nodes, (count, total) in edge_stats.items():
    print(sorted(nodes), count, total)

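On your input, this produces the following output (the order of the lines depends on dictionary ordering and may differ):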

[606, 1174, 1349, 1980] 6 9565.0
[305, 1158, 1885] 2 352.0
[968, 2613] 1 435.0
[937, 2684, 4376, 4821, 5372] 7 7396.0
[4391, 5070, 5454] 2 5690.0
[3140, 3545, 4243, 4921, 5045] 9 184173.0

[1349, 1980] 1 650.0
[937, 4376] 1 136.0
[606, 1980] 1 390.0
[3140, 4921] 1 58405.0
[937, 5372] 1 83.0
[606, 1349] 1 750.0
[4391, 5070] 1 2935.0
[3545, 4921] 1 8133.0
[1158, 1885] 1 351.0
[3140, 3545] 1 8133.0
[2684, 4821] 1 469.0
[2684, 5372] 1 460.0
[937, 2684] 1 318.0
[1174, 1980] 1 650.0
[3140, 5045] 1 48767.0
[5070, 5454] 1 2755.0
[4376, 4821] 1 5471.0
[606, 1174] 1 750.0
[3545, 5045] 1 8133.0
[4243, 4921] 1 14863.0
[3140, 4243] 1 14863.0
[4243, 5045] 1 14863.0
[937, 4821] 1 459.0
[3545, 4243] 1 8013.0
[1174, 1349] 1 6375.0
[305, 1158] 1 1.0
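
One thing to watch for with per-edge statistics in an undirected graph: keying on `(a, b)` treats `C A` and `A C` as different edges. The test file above does not contain reversed duplicates, but the small A/B/C example in the question does. Below is a minimal sketch of an order-independent key, using the rows from that example; `pair_key` is a hypothetical helper name, not part of NetworkX.

```python
from collections import defaultdict

# Rows from the question's small example: (node, node, seconds).
rows = [
    ("A", "B", 34), ("A", "B", 56), ("A", "C", 9), ("A", "D", 5464),
    ("A", "C", 456), ("C", "B", 36), ("C", "A", 345), ("B", "C", 346),
]

def pair_key(a, b):
    """Return an order-independent key for an undirected edge (hypothetical helper)."""
    return tuple(sorted((a, b)))

# key -> [count, total_seconds]
pair_stats = defaultdict(lambda: [0, 0])
for a, b, seconds in rows:
    stats = pair_stats[pair_key(a, b)]
    stats[0] += 1
    stats[1] += seconds

for key in sorted(pair_stats):
    count, total = pair_stats[key]
    print(list(key), count, total)
```

This reproduces the question's totals for A-B (2, 90) and B-C (2, 382). It counts every row, so A-C comes out as 3 rows rather than the 2 listed in the question; the question does not state which rows should be excluded, so that filtering is left out here.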

Hope that helps!

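  • I see. So you want to track the components *and* the individual edges? I've updated the answer to do that. Let me know if this is what you're looking for! (2 upvotes)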