Split a tuple of tuples (or list of lists) of paired values into independent complete sets?

use*_*030 5 python algorithm bioinformatics graph-algorithm

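I have paired values in a csv file. Neither value in a pair is necessarily unique. I would like to split this large list into independent complete sets for further analysis.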

To illustrate, my "megalist" looks like:

megalist = [['a', 'b'], ['a', 'd'], ['b', 'd'],['b', 'f'], ['r', 's'], ['t', 'r']...]

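Most importantly, the output should preserve the list of paired values (i.e., not merge the values). Ideally, the output would end up in separate csv files for individual analysis later. For example, for this megalist the output would be: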

completeset1 = [['a', 'b'], ['a', 'd'], ['b', 'd'], ['b', 'f']]
completeset2 = [['r', 's'], ['t', 'r']]
...

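In graph theory terms, I am trying to take a giant graph made up of mutually exclusive subgraphs (where each paired value is a pair of connected vertices) and split it into independent graphs that are more manageable. Thanks for any input!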

Edit 1: This got me to a place where I can move forward. Thanks again!

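import csv
import networkx as nx

# Each row of the tab-delimited file is one pair, i.e. one edge.
with open('megalistfile.csv', newline='') as input_file:
    megalist = list(csv.reader(input_file, delimiter='\t'))

G = nx.Graph()
G.add_edges_from(megalist)

# One group of nodes per connected component.
subgraphs = nx.connected_components(G)

with open('subgraphs.txt', 'w') as output_file:
    for subgraph in subgraphs:
        # Write the edges of each component on its own line.
        output_line = str(list(G.edges(subgraph))) + '\n'
        output_file.write(output_line)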
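The code in Edit 1 writes every component to a single text file, while the question asks for separate csv files. A minimal sketch of that last step, assuming the pairs have already been grouped into one list per component; `write_sets`, the `completeset` prefix, and the output directory are hypothetical names chosen for illustration:

```python
import csv
from pathlib import Path


def write_sets(groups, out_dir, prefix='completeset'):
    """Write each group of pairs to its own tab-delimited file.

    `groups` is assumed to be a list of lists of pairs, one inner
    list per connected component. File names are hypothetical.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, pairs in enumerate(groups, start=1):
        path = out_dir / f'{prefix}{number}.csv'
        with open(path, 'w', newline='') as handle:
            # One pair per row, same delimiter as the input file.
            csv.writer(handle, delimiter='\t').writerows(pairs)
        paths.append(path)
    return paths
```

With the graph `G` from Edit 1, the groups could be built as `[list(G.edges(c)) for c in nx.connected_components(G)]` and passed to `write_sets` together with a directory of your choice.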

jte*_*ace 6

You can use networkx. Build the graph:

>>> import networkx as nx
>>> megalist = [['a', 'b'], ['a', 'd'], ['b', 'd'],['b', 'f'], ['r', 's'], ['t', 'r']]
>>> G = nx.Graph()
>>> G.add_edges_from(megalist)

Then get the list of subgraphs:

>>> subgraphs = nx.connected_components(G)
>>> subgraphs
[['a', 'b', 'd', 'f'], ['s', 'r', 't']]
>>> [G.edges(subgraph) for subgraph in subgraphs]
[[('a', 'b'), ('a', 'd'), ('b', 'd'), ('b', 'f')], [('s', 'r'), ('r', 't')]]
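If you would rather not depend on networkx, the same grouping can be sketched with a small union-find over the pair values. This is not the approach from the answer above, just an alternative under the assumption that the pairs fit in memory as a list; `split_pairs` is a hypothetical helper name. It keeps every original pair intact and in its original orientation:

```python
from collections import defaultdict


def split_pairs(pairs):
    """Group pairs into connected components, keeping each pair intact."""
    parent = {}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Union the two endpoints of every pair.
    for a, b in pairs:
        parent[find(a)] = find(b)

    # Bucket the original pairs by the root of their first endpoint.
    groups = defaultdict(list)
    for pair in pairs:
        groups[find(pair[0])].append(pair)
    return list(groups.values())


megalist = [['a', 'b'], ['a', 'd'], ['b', 'd'], ['b', 'f'], ['r', 's'], ['t', 'r']]
complete_sets = split_pairs(megalist)
```

Note that `pairs` is iterated twice, so a `csv.reader` would need to be wrapped in `list(...)` first. Also, the session above shows `connected_components` returning a list of lists; in newer networkx releases it returns a generator of node sets, so wrap it in `list(...)` if you need to use the result more than once.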