Python 中的深度优先搜索(包括循环)

fal*_*lco 2 python graph-theory depth-first-search

所以我在 StackOverflow 中看到了以下关于 Python 中 DFS 算法的帖子(非常有帮助):

这个python代码是否使用深度优先搜索(DFS)来查找所有路径?

我还有一个需要分析的图(以找到两个节点之间的每条可能的路径),但我还需要在那里包括循环。例如,如果我有这样的图表:

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2','End']}
Run Code Online (Sandbox Code Playgroud)

我想要以下输出:

Start, 1, 2, 3, End
Start, 1, 2, End
Start, 1, 2, 3, 2, End
Start, 1, 2, 3, 2, 3, End
Run Code Online (Sandbox Code Playgroud)

是否有任何可能的方法来更改以下代码以执行此操作?

def find_all_paths(graph, start, end, path=[]):
        path = path + [start]
        if start == end:
            return [path]
        if not graph.has_key(start):
            return []
        paths = []
        for node in graph[start]:
            if node not in path:
                paths += find_all_paths(graph, node, end, path)
        return paths

print find_all_paths(graph, 'Start', 'End')
Run Code Online (Sandbox Code Playgroud)

Reu*_*ani 6

这不是您想要用简单的深度优先搜索(DFS)来做的事情。

DFS,深度优先,如果遇到一个循环就会被阻塞。循环是无限深的

如果您想在包含循环时输出(可能使用生成器)到每个节点的无限路径,您应该使用广度优先搜索(BFS)。广度优先意味着循环不会阻止它到达其他路径。

这里的妥协是广度优先搜索消耗更多内存,在运行期间保持更多列表处于活动状态。如果这是一个问题,您应该使用DFS 的迭代深化解决方案

简单的 BFS 解决方案:

使用一个Queue(BFS 使用队列很容易实现,DFs 使用堆栈很容易实现,正如你在这里看到的):

#!/usr/bin/env python

import Queue

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2','End']}

expand_queue = Queue.Queue()

def BFS_generator(graph, start, end, path):
    # initialize generator
    expand_queue.put((graph, start, end, path))

    while not expand_queue.empty():
        graph, current, end, path = expand_queue.get()

        if current == end:
            # route done - yield result
            yield path + [current]

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # put all neighbors last in line to expand
                expand_queue.put((graph, neighbor, end, path + [current]))

gen = BFS_generator(graph, "Start", "End", [])

# get only 10 first paths
for _ in xrange(10):
    print next(gen)
Run Code Online (Sandbox Code Playgroud)

输出:

['Start', '1', '2', 'End']
['Start', '1', '2', '3', 'End']
['Start', '1', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', '2', 'End']
['Start', '1', '2', '3', '2', '3', '2', '3', '2', '3', '2', '3', 'End']
Run Code Online (Sandbox Code Playgroud)

将解决方案推广到 DFS 和 BFS,迭代深化:

您还可以拥有使用 aQueue或 a的更通用的代码,Stack然后您可以使用 BFS(队列)或 DFS(堆栈)轻松获得迭代解决方案。取决于你需要什么。

所以首先创建一个Stack类(出于接口目的,list拥有我们需要的一切):

class Stack():

    def __init__(self):
        self.stack = []

    def get(self):
        return self.stack.pop(0)

    def put(self, item):
        self.stack.insert(0, item)

    def empty(self):
        return len(self.stack) == 0
Run Code Online (Sandbox Code Playgroud)

既然您同时拥有Queue和 堆栈,算法的简单概括,使其具有迭代性和数据结构不可知性,如果我们更改所使用的数据结构,将为我们提供两种解决方案:

def iterative_search(data_structure, graph, start, end, limit=None):
    # initialize generator
    data_structure.put((graph, start, end,[]))

    while not data_structure.empty():
        graph, current, end, path = data_structure.get()

        if limit and len(path) > limit:
            continue

        if current == end:
            # route done - yield result
            yield tuple(path + [current])

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # store all neighbors according to data structure
                data_structure.put(
                    (graph, neighbor, end, path + [current])
                )
Run Code Online (Sandbox Code Playgroud)

把整个事情放在一起:

我们可以看到他们选择了不同的路线(我做graph了一些更改以使其更有趣):

import Queue

class Stack():

    def __init__(self):
        self.stack = []

    def get(self):
        return self.stack.pop(0)

    def put(self, item):
        self.stack.insert(0, item)

    def empty(self):
        return len(self.stack) == 0

graph = {'Start': ['1'],
             '1': ['2'],
             '2': ['3','End'],
             '3': ['2', '4','End'],
             '4': ['3']}


def iterative_search(data_structure, graph, start, end, limit=None):
    # initialize generator
    data_structure.put((graph, start, end,[]))

    while not data_structure.empty():
        graph, current, end, path = data_structure.get()

        # make solution depth limited
        # makes it iterative - for DFS to use all paths
        if limit and len(path) > limit:
            continue

        if current == end:
            # route done - yield result
            yield tuple(path + [current])

        if current in graph:
            # skip neighbor-less nodes
            for neighbor in graph[current]:
                # store all neighbors according to data structure
                data_structure.put(
                    (graph, neighbor, end, path + [current])
                )
Run Code Online (Sandbox Code Playgroud)

现在我们有了迭代 DFS 和 BFS 的通用函数,我们可以比较它们提供的解决方案:

import os

# bfs - using queue
gen = iterative_search(Queue.Queue(), graph, "Start", "End")
print "BFS"
# get only 10 first paths
bfs_path_set = set()
while len(bfs_path_set) < 10:
    bfs_path_set.add(next(gen))

print os.linesep.join(map(str, bfs_path_set))

print "Iterative DFS"
# dfs  - using stack
gen = iterative_search(Stack(), graph, "Start", "End", limit=5)

# get only 10 first paths
dfs_path_set = set()
limit = 1
while len(dfs_path_set) < 10:
    try:
        dfs_path_set.add(next(gen))
    except StopIteration:
        limit += 1
        print "depth limit reached, increasing depth limit to %d" % limit
        gen = iterative_search(
            Stack(), graph, "Start", "End", limit=limit
        )

print os.linesep.join(map(str, dfs_path_set))

print "difference BFS - DFS: %s" % str(bfs_path_set - dfs_path_set)
print "difference DFS - BFS: %s" % str(dfs_path_set - bfs_path_set)
Run Code Online (Sandbox Code Playgroud)

输出:

BFS
('Start', '1', '2', '3', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', 'End')
('Start', '1', '2', '3', '4', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', 'End')
('Start', '1', '2', 'End')
Iterative DFS
limit reached, increasing limit to 2
limit reached, increasing limit to 3
limit reached, increasing limit to 4
limit reached, increasing limit to 5
limit reached, increasing limit to 6
limit reached, increasing limit to 7
limit reached, increasing limit to 8
('Start', '1', '2', '3', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', '4', '3', '4', '3', 'End')
('Start', '1', '2', '3', '2', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', '2', 'End')
('Start', '1', '2', '3', '4', '3', 'End')
('Start', '1', '2', '3', 'End')
('Start', '1', '2', '3', '4', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', '3', 'End')
('Start', '1', '2', '3', '2', 'End')
('Start', '1', '2', 'End')
difference BFS - DFS: set([('Start', '1', '2', '3', '2', '3', '2', '3', 'End')])
difference DFS - BFS: set([('Start', '1', '2', '3', '4', '3', '4', '3', 'End')])
Run Code Online (Sandbox Code Playgroud)

笔记:

解决方案的差异:您可以看到解决方案的路径选择有所不同。您可以通过将解决方案的长度设置为11而不是来验证它们是否最终获得了所有路径10(这将使两个解决方案集相同 - 因为它们将消耗所有 9 长度的解决方案)。

内存消耗:需要注意的是,这种 DFS 的实现并不是最佳的,因为它确实存储了沿途节点的所有邻居。它通常应该比 BFS 更好,在消耗它们之前存储更多的邻居,但应该存在使用回溯和递归的更优化解决方案。