Oz1*_*123 5 python algorithm generator depth-first-search
我正在研究一个项目,我需要为文本处理编写一些规则.在完成这个项目几天并实施一些规则后,我意识到我需要确定规则的顺序.没问题,我们有拓扑排序帮忙.但后来我意识到我不能指望图表总是满满的.所以我想出了这个想法,给定一个带有一组依赖关系(或单个依赖关系)的规则,我需要检查依赖关系的依赖关系.听起来很熟悉?是.该主题与图的深度优先搜索非常相似.
我不是数学家,也不研究CS因此,图论对我来说是一个新领域.尽管如此,我实施了一些工作(见下文)(我怀疑效率低下).
这是我的搜索和产量算法.如果您在下面的示例中运行它,您将看到它多次访问某些节点.因此,推测效率低下.
关于输入的一个词.我写的规则基本上是python类,它们有一个类属性depends.我被批评没有使用inspect.getmro- 但这会使事情变得非常复杂,因为这个类需要相互继承(参见这里的例子)
def _yield_name_dep(rules_deps):
global recursion_counter
recursion_counter = recursion_counter +1
# yield all rules by their named and dependencies
for rule, dep in rules_deps.items():
if not dep:
yield rule, dep
continue
else:
yield rule, dep
for ii in dep:
i = getattr(rules, ii)
instance = i()
if instance.depends:
new_dep={str(instance): instance.depends}
for dep in _yield_name_dep(new_dep):
yield dep
else:
yield str(instance), instance.depends
Run Code Online (Sandbox Code Playgroud)
好了,既然你盯着代码,这里有一些你可以测试的输入:
demo_class_content ="""
class A(object):
depends = ('B')
def __str__(self):
return self.__class__.__name__
class B(object):
depends = ('C','F')
def __str__(self):
return self.__class__.__name__
class C(object):
depends = ('D', 'E')
def __str__(self):
return self.__class__.__name__
class D(object):
depends = None
def __str__(self):
return self.__class__.__name__
class F(object):
depends = ('E')
def __str__(self):
return self.__class__.__name__
class E(object):
depends = None
def __str__(self):
return self.__class__.__name__
"""
with open('demo_classes.py', 'w') as clsdemo:
clsdemo.write(demo_class_content)
import demo_classes as rules
rule_start={'A': ('B')}
def _yield_name_dep(rules_deps):
# yield all rules by their named and dependencies
for rule, dep in rules_deps.items():
if not dep:
yield rule, dep
continue
else:
yield rule, dep
for ii in dep:
i = getattr(rules, ii)
instance = i()
if instance.depends:
new_dep={str(instance): instance.depends}
for dep in _yield_name_dep(new_dep):
yield dep
else:
yield str(instance), instance.depends
if __name__ == '__main__':
# this is yielding nodes visited multiple times,
# list(_yield_name_dep(rule_start))
# hence, my work around was to use set() ...
rule_dependencies = list(set(_yield_name_dep(rule_start)))
print rule_dependencies
Run Code Online (Sandbox Code Playgroud)
为了省去运行代码的麻烦,上面函数的输出是:
>>> print list(_yield_name_dep(rule_wd))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E'), ('E', None)]
>>> print list(set(_yield_name_dep(rule_wd)))
[('B', ('C', 'F')), ('E', None), ('D', None), ('F', 'E'), ('C', ('D', 'E')), ('A', 'B')]
Run Code Online (Sandbox Code Playgroud)
在我提出更好的解决方案的同时,上述问题仍然存在.所以随意批评我的解决方案:
visited = []
def _yield_name_dep_wvisited(rules_deps, visited):
# yield all rules by their name and dependencies
for rule, dep in rules_deps.items():
if not dep and rule not in visited:
yield rule, dep
visited.append(rule)
continue
elif rule not in visited:
yield rule, dep
visited.append(rule)
for ii in dep:
i = getattr(grules, ii)
instance = i()
if instance.depends:
new_dep={str(instance): instance.depends}
for dep in _yield_name_dep_wvisited(new_dep, visited):
if dep not in visited:
yield dep
elif str(instance) not in visited:
visited.append(str(instance))
yield str(instance), instance.depends
Run Code Online (Sandbox Code Playgroud)
以上的输出是:
>>>list(_yield_name_dep_wvisited(rule_wd, visited))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E')]
Run Code Online (Sandbox Code Playgroud)
因此,您现在可以看到节点E只被访问过一次.
根据 Gareth 和 Stackoverflow 其他用户的反馈,我得出了以下结论。它更清晰,也更普遍:
def _dfs(start_nodes, rules, visited):
"""
Depth First Search
start_nodes - Dictionary of Rule with dependencies (as Tuples):
start_nodes = {'A': ('B','C')}
rules - Dictionary of Rules with dependencies (as Tuples):
e.g.
rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'),
'D':(), 'E':(), 'F':()}
The above rules describe the following DAG:
A
/ \
B C
/ \ / \
D E F
usage:
>>> rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'),
'D':(), 'E':(), 'F':()}
>>> visited = []
>>> list(_dfs({'A': ('B','C')}, rules, visited))
[('A', ('B', 'C')), ('B', ('D', 'E')), ('D', ()), ('E', ()),
('C', ('E', 'F')), ('F', ())]
"""
for rule, dep in start_nodes.items():
if rule not in visited:
yield rule, dep
visited.append(rule)
for ii in dep:
new_dep={ ii : rules[ii]}
for dep in _dfs(new_dep, rules, visited):
if dep not in visited:
yield dep
Run Code Online (Sandbox Code Playgroud)