Tarjan's strongly connected components algorithm in Python not working

jmo*_*ora 13 python algorithm directed-graph tarjans-algorithm

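I implemented Tarjan's strongly connected components algorithm in Python, following Wikipedia, but it isn't working. The algorithm is quite short and I can't find any difference from the pseudocode, so I can't tell why it fails. I tried to check the original paper, but could not find it.

Here is the code.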

def strongConnect(v):
  global E, idx, CCs, c, S
  idx[v] = (c, c) #idx[v][0] for v.index # idx[v][1] for v.lowlink
  c += 1
  S.append(v)  
  for w in [u for (v2, u) in E if v == v2]:
    if idx[w][0] < 0:
      strongConnect(w)
      # idx[w] = (idx[w][0], min(idx[v][1], idx[w][1])) #fixed, thx
      idx[v] = (idx[v][0], min(idx[v][1], idx[w][1]))
    elif w in S:
      idx[v] = (idx[v][0], min(idx[v][1], idx[w][0]))
  if (idx[v][0] == idx[v][1]):
    i = S.index(v)
    CCs.append(S[i:])
    S = S[:i]

E = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'), ('D', 'F'), ('F', 'B'), ('E', 'F')]
idx = {}
CCs = []
c = 0
S = []
for (u, v) in E:
  idx[u] = (-1, -1)
  idx[v] = (-1, -1)
for v in idx.keys():
  if idx[v][0] < 0:
    strongConnect(v)

print(CCs)

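You can check the graph visually if you wish. As you can see, this is a very straightforward translation of the pseudocode on Wikipedia. However, this is the output: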

[['D', 'E', 'F'], ['B', 'C'], ['A']]

There should be only one strongly connected component, not three. I hope the question is in order in every respect; if not, I'm sorry. In any case, thank you very much.

sen*_*rle 14

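OK, I had some more time to think about this. I'm no longer certain that filtering the edges was the problem, as I previously stated. In fact, I think there is an ambiguity in the pseudocode: does `for each (v, w) in E` mean every edge (as the literal meaning of `for each` suggests), or only every edge beginning with `v` (as you reasonably assumed)? And after the for loop, is the `v` in question the final `v` from the loop, as it would be in Python, or does it go back to being the original `v`? Pseudocode doesn't have clearly defined scoping behavior in this case! (It would be really weird if the `v` at the end were the last, arbitrary value of `v` from the loop. That suggests that filtering is correct, because in that case `v` means the same thing all the way through.)

However, under any reading, the clear error in your code is here: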
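To make the two readings concrete, here is a small sketch. The helper names `outgoing` and `loop_rebinds` and the three-edge graph are made up for illustration and are not part of the question's code. The first function is the "filtered" reading; the second shows what Python does if `v` is reused as the loop variable over every edge.

```python
def outgoing(edges, v):
    """'Filtered' reading: only the heads w of edges (v, w) that start at v."""
    return [w for (u, w) in edges if u == v]

def loop_rebinds(edges, v):
    """Literal reading with v reused as the loop variable.

    In Python the loop rebinds the local name v, so after the loop
    v is the tail of the LAST edge, not the vertex passed in.
    """
    for (v, w) in edges:
        pass
    return v

edges = [('A', 'B'), ('B', 'C'), ('C', 'A')]
print(outgoing(edges, 'A'))      # ['B']
print(loop_rebinds(edges, 'A'))  # C
```

With the filtered reading the name `v` never changes inside the function, which is why the final `index == lowlink` test still refers to the vertex being processed.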

  idx[w] = (idx[w][0], min(idx[v][1], idx[w][1]))

According to the pseudocode, that should definitely be

  idx[v] = (idx[v][0], min(idx[v][1], idx[w][1]))
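The effect of that one-line difference can be reproduced with a self-contained sketch. This is not the code from the question: `tarjan_scc` and its `buggy` flag are hypothetical names, and vertices are visited in order of first appearance in the edge list (an assumption made only so the run is deterministic).

```python
def tarjan_scc(edges, buggy=False):
    """Tarjan's SCC over (tail, head) edges.

    buggy=True writes the post-recursion minimum into lowlink[w]
    instead of lowlink[v], mimicking the mistake discussed above.
    """
    index, lowlink = {}, {}
    stack, components = [], []
    counter = 0

    def strong_connect(v):
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        for w in [head for (tail, head) in edges if tail == v]:
            if w not in index:
                strong_connect(w)
                target = w if buggy else v  # correct: propagate to v
                lowlink[target] = min(lowlink[v], lowlink[w])
            elif w in stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # v is the root of a component: cut it off the stack.
            i = stack.index(v)
            components.append(stack[i:])
            del stack[i:]

    # dict.fromkeys keeps first-appearance order and drops duplicates.
    for v in dict.fromkeys(x for edge in edges for x in edge):
        if v not in index:
            strong_connect(v)
    return components

E = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), ('E', 'A'), ('A', 'E'),
     ('C', 'A'), ('C', 'E'), ('D', 'F'), ('F', 'B'), ('E', 'F')]
print(tarjan_scc(E, buggy=True))  # [['D', 'E', 'F'], ['B', 'C'], ['A']]
print(tarjan_scc(E))              # [['A', 'B', 'C', 'D', 'E', 'F']]
```

In the buggy run, a vertex only ever lowers its own lowlink through a direct back edge, so `D` and `B` never learn that their descendants reach `A` and are wrongly treated as component roots.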

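Once you make that change, you get the expected result. Frankly, it doesn't surprise me that you made that mistake, because you're using a really weird and counterintuitive data structure. Here is what I think is an improvement: it adds only a few more lines, and I find it much more readable.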

import itertools

def strong_connect(vertex):
    global edges, indices, lowlinks, connected_components, index, stack
    indices[vertex] = index
    lowlinks[vertex] = index
    index += 1
    stack.append(vertex)

    for v, w in (e for e in edges if e[0] == vertex):
        if indices[w] < 0:
            strong_connect(w)
            lowlinks[v] = min(lowlinks[v], lowlinks[w])
        elif w in stack:
            lowlinks[v] = min(lowlinks[v], indices[w])

    if indices[vertex] == lowlinks[vertex]:
        connected_components.append([])
        while stack[-1] != vertex:
            connected_components[-1].append(stack.pop())
        connected_components[-1].append(stack.pop())

edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'), 
         ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'), 
         ('D', 'F'), ('F', 'B'), ('E', 'F')]
vertices = set(v for v in itertools.chain(*edges))
indices = dict((v, -1) for v in vertices)
lowlinks = indices.copy()
connected_components = []

index = 0
stack = []
for v in vertices:
    if indices[v] < 0:
        strong_connect(v)

print(connected_components)
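
However, I find the use of global variables here distasteful. You could hide this away in its own module, but I prefer the idea of creating a callable class. After looking more closely at Tarjan's original pseudocode (which, by the way, confirms that the "filtered" version is correct), I wrote this. It includes a simple Graph class and runs a couple of basic tests: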

from itertools import chain
from collections import defaultdict

class Graph(object):
    def __init__(self, edges, vertices=()):
        edges = list(list(x) for x in edges)
        self.edges = edges
        self.vertices = set(chain(*edges)).union(vertices)
        self.tails = defaultdict(list)
        for head, tail in self.edges:
            self.tails[head].append(tail)

    @classmethod
    def from_dict(cls, edge_dict):
        return cls((k, v) for k, vs in edge_dict.items() for v in vs)

class _StrongCC(object):
    def strong_connect(self, head):
        lowlink, count, stack = self.lowlink, self.count, self.stack
        lowlink[head] = count[head] = self.counter = self.counter + 1
        stack.append(head)

        for tail in self.graph.tails[head]:
            if tail not in count:
                self.strong_connect(tail)
                lowlink[head] = min(lowlink[head], lowlink[tail])
            elif count[tail] < count[head]:
                if tail in self.stack:
                    lowlink[head] = min(lowlink[head], count[tail])

        if lowlink[head] == count[head]:
            component = []
            while stack and count[stack[-1]] >= count[head]:
                component.append(stack.pop())
            self.connected_components.append(component)

    def __call__(self, graph):
        self.graph = graph
        self.counter = 0
        self.count = dict()
        self.lowlink = dict()
        self.stack = []
        self.connected_components = []

        for v in self.graph.vertices:
            if v not in self.count:
                self.strong_connect(v)

        return self.connected_components

strongly_connected_components = _StrongCC()

if __name__ == '__main__':
    edges = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'),
             ('E', 'A'), ('A', 'E'), ('C', 'A'), ('C', 'E'),
             ('D', 'F'), ('F', 'B'), ('E', 'F')]
    print(strongly_connected_components(Graph(edges)))
    edge_dict = {'a':['b', 'c', 'd'],
                 'b':['c', 'a'],
                 'c':['d', 'e'],
                 'd':['e'],
                 'e':['c']}
    print(strongly_connected_components(Graph.from_dict(edge_dict)))
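One way to sanity-check results like these is to compare them against a brute-force partition by mutual reachability. The sketch below is only a testing aid under that assumption, not part of Tarjan's algorithm; `reachable` and `brute_force_sccs` are made-up helper names, and the graph is given as a plain dict of adjacency lists.

```python
def reachable(adjacency, start):
    """Return the set of vertices reachable from start (start included)."""
    seen = {start}
    todo = [start]
    while todo:
        v = todo.pop()
        for w in adjacency.get(v, ()):
            if w not in seen:
                seen.add(w)
                todo.append(w)
    return seen

def brute_force_sccs(adjacency):
    """Group vertices by mutual reachability; O(V * (V + E)), for tests only."""
    vertices = set(adjacency)
    for heads in adjacency.values():
        vertices.update(heads)
    reach = {v: reachable(adjacency, v) for v in vertices}
    # v and w share a component exactly when each can reach the other.
    return {frozenset(w for w in reach[v] if v in reach[w]) for v in vertices}

edge_dict = {'a': ['b', 'c', 'd'],
             'b': ['c', 'a'],
             'c': ['d', 'e'],
             'd': ['e'],
             'e': ['c']}
print(sorted(sorted(c) for c in brute_force_sccs(edge_dict)))
# [['a', 'b'], ['c', 'd', 'e']]
```

Because the order of components and of vertices within a component is not fixed, comparing sets of frozensets is more robust than comparing lists, for example `{frozenset(c) for c in strongly_connected_components(Graph.from_dict(edge_dict))} == brute_force_sccs(edge_dict)`.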