Oha*_*rry 5 python abstract-syntax-tree
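I want to index all methods, and the connections between them, across an entire application (ultimately a directory containing subdirectories and files). I am using ast: I loop over the directories down to the individual files and load each one into an ast object with ast.parse(self.file_content).
The index I am trying to create is the set of these connections.
Here is my code, in case it is relevant.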
    def scan(self):
        '''
        scans a file line by line while keeping context of location and classes
        indexes a file buffer content into a ast, Abstract Syntax Trees, https://en.wikipedia.org/wiki/AST.
        Then, iterate over the elements, find relevant ones to index, and index them into the db.
        '''
        parsed_result = ast.parse(self.file_content)
        for element in parsed_result.body:
            results = self.index_element(element)

    def index_element(self, element, class_name=None):
        '''
        if element is relevant, meaning method -> index
        if element is Class -> recursively call it self
        :param element:
        :param class_name:
        :return: [{insert_result: <db_insert_result>, 'structured_data': <method> object}, ...]
        '''
        # find classes
        # find methods inside classes
        # find hanging functions

        # validation on element type
        if self.should_index_element(element):
            if self.is_class_definition(element):
                class_element = element
                indexed_items = []
                for inner_element in class_element.body:
                    # recursive call
                    results = self.index_element(inner_element, class_name=class_element.name)
                    indexed_items += results

                return indexed_items
            else:
                structured_data = self.structure_method_into_an_object(element, class_name=class_name)
                result_graph = self.dal_client.methods_graph.insert_or_update(structured_data)
                return "WhatEver"

        return "WhatEver"
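My question is whether this is possible using ast and, if so, how. As I understand it, I currently cannot, because I load one file at a time into an ast object and it knows nothing about methods defined in other files.
Here is an example of two files I would like to link: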
sample_a.py
    # module name matches the file name sample_b.py shown below
    from sample_b import SampleClassB

    sample_b = SampleClassB()

    class SampleClassA(object):

        def __init__(self):
            self.a = 1

        def test_call_to_another_function(self):
            return sample_b.test()
sample_b.py
    class SampleClassB(object):

        def __init__(self):
            self.b = 1

        def test(self):
            return True
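You can traverse the ast.AST tree and, at each recursive call, do one of four things:

1. If the tree is a class definition, store the class name and its associated methods, then apply Connections.walk to each method, storing the class name and method name in the scope.
2. If the tree is an import statement, load the module and run Connections.walk on it recursively.
3. If the tree is an attribute lookup and Connections.walk is inside a method, check whether the attribute name is a method of any class loaded so far. If so, add an edge to edges linking the current scope to the newly discovered method.
4. Otherwise, continue traversing the tree.

    import ast, itertools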
    import re, importlib

    class Connections:
        def __init__(self):
            # _classes maps class name -> list of its non-dunder method nodes
            # edges holds (caller_scope, [class_name, method_name]) pairs
            self._classes, self.edges = {}, []

        def walk(self, tree, scope=None):
            t_obj = None
            if isinstance(tree, ast.ClassDef):
                self._classes[tree.name] = [i for i in tree.body if isinstance(i, ast.FunctionDef) and not re.findall('__[a-z]+__', i.name)]
                # walk each method with [class name, method name] as its scope
                _ = [self.walk(i, [tree.name, i.name]) for i in self._classes[tree.name]]
                t_obj = [i for i in tree.body if i not in self._classes[tree.name]]
            elif isinstance(tree, (ast.Import, ast.ImportFrom)):
                # load the imported module's source; if one statement imports
                # several modules, only the last one is kept for traversal
                for p in [tree.module] if hasattr(tree, 'module') else [i.name for i in tree.names]:
                    with open(importlib.import_module(p).__file__) as f:
                        t_obj = ast.parse(f.read())
            elif isinstance(tree, ast.Attribute) and scope is not None:
                # attribute lookup inside a method: match by method name only
                if (c:=[a for a, b in self._classes.items() if any(i.name == tree.attr for i in b)]):
                    self.edges.append((scope, [c[0], tree.attr]))
                t_obj = tree.value
            # otherwise keep traversing: lists item by item, nodes field by field
            if isinstance(t_obj:=(tree if t_obj is None else t_obj), list):
                for i in t_obj:
                    self.walk(i, scope = scope)
            else:
                for i in getattr(t_obj, '_fields', []):
                    self.walk(getattr(t_obj, i), scope=scope)

    with open('sample_a.py') as f:
        c = Connections()
        c.walk(ast.parse(f.read()))

    print(c.edges)
Output:
[(['SampleClassA', 'test_call_to_another_function'], ['SampleClassB', 'test'])]
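Since the question is about a whole directory, here is a minimal sketch of a two-pass variant that reads every .py file under a root instead of following import statements. It is not the approach above, only an illustration under stated assumptions: the helper names collect_methods and collect_edges are hypothetical, and calls are matched by method name alone, so two classes sharing a method name will both be linked.

```python
import ast
import tempfile
from pathlib import Path


def collect_methods(root):
    """Pass 1: map every class name found under `root` to its method names."""
    classes = {}
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                classes.setdefault(node.name, set()).update(
                    item.name for item in node.body
                    if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
                )
    return classes


def collect_edges(root, classes):
    """Pass 2: link each method to known methods it calls via attribute access."""
    edges = []
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for cls in (n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)):
            for func in (f for f in cls.body if isinstance(f, ast.FunctionDef)):
                for node in ast.walk(func):
                    if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
                        # name-based heuristic: any class owning this method name
                        for owner, methods in classes.items():
                            if node.func.attr in methods:
                                edges.append(((cls.name, func.name), (owner, node.func.attr)))
    return edges


# Demo on the two sample files from the question, written to a temp directory.
with tempfile.TemporaryDirectory() as root:
    Path(root, "sample_a.py").write_text(
        "from sample_b import SampleClassB\n"
        "sample_b = SampleClassB()\n"
        "class SampleClassA(object):\n"
        "    def __init__(self):\n"
        "        self.a = 1\n"
        "    def test_call_to_another_function(self):\n"
        "        return sample_b.test()\n"
    )
    Path(root, "sample_b.py").write_text(
        "class SampleClassB(object):\n"
        "    def __init__(self):\n"
        "        self.b = 1\n"
        "    def test(self):\n"
        "        return True\n"
    )
    demo_edges = collect_edges(root, collect_methods(root))

print(demo_edges)
```

Because nothing is imported, this variant does not execute module-level code and does not depend on the order in which files are visited, at the cost of ignoring which names are actually imported where.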
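Important note: depending on the complexity of the files you run Connections.walk on, a RecursionError may occur. To avoid this, here is a Gist containing an iterative version of Connections.walk.
Creating a graph from edges: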
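To illustrate how a traversal can avoid RecursionError, here is a minimal sketch that replaces recursion with an explicit stack. It is not the code from the Gist; the function name iter_with_scope is hypothetical, and it only yields nodes together with their enclosing (class, method) scope rather than reproducing all of Connections.walk.

```python
import ast


def iter_with_scope(tree):
    """Yield (node, scope) pairs depth-first without recursion.

    scope is (class_name, method_name) for nodes inside a method defined
    directly in a class body, and None elsewhere.
    """
    stack = [(tree, None)]
    while stack:
        node, scope = stack.pop()
        yield node, scope
        # reversed() so that children are popped in source order
        for child in reversed(list(ast.iter_child_nodes(node))):
            child_scope = scope
            if isinstance(node, ast.ClassDef) and isinstance(child, ast.FunctionDef):
                child_scope = (node.name, child.name)
            stack.append((child, child_scope))


source = (
    "class A:\n"
    "    def f(self):\n"
    "        return b.g()\n"
)
lookups = [
    (node.attr, scope)
    for node, scope in iter_with_scope(ast.parse(source))
    if isinstance(node, ast.Attribute)
]
print(lookups)
```

The stack lives on the heap, so its depth is bounded by memory rather than by the interpreter's recursion limit.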
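    import networkx as nx
    import matplotlib.pyplot as plt

    # reuses itertools and c (the Connections instance) from the code above
    g, labels, c1 = nx.DiGraph(), {}, itertools.count(1)
    for m1, m2 in c.edges:
        # give every distinct "Class.method" name a numeric node id
        if (p1:='.'.join(m1)) not in labels:
            labels[p1] = next(c1)
        if (p2:='.'.join(m2)) not in labels:
            labels[p2] = next(c1)
        g.add_node(labels[p1])
        g.add_node(labels[p2])
        g.add_edge(labels[p1], labels[p2])

    # pos was undefined in the original snippet; a spring layout is one option
    pos = nx.spring_layout(g)
    nx.draw(g, pos, labels={b:a for a, b in labels.items()}, with_labels = True)
    plt.show()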
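Output: a directed graph with an edge from SampleClassA.test_call_to_another_function to SampleClassB.test.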
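If a plot is not needed, the same edges can be grouped into a plain adjacency mapping. This is a small sketch, assuming edges has the shape printed earlier (pairs of [class, method] lists); the helper name to_adjacency is hypothetical.

```python
from collections import defaultdict


def to_adjacency(edges):
    """Group (caller, callee) pairs into {"Class.method": [callees, ...]}."""
    adjacency = defaultdict(set)
    for caller, callee in edges:
        adjacency[".".join(caller)].add(".".join(callee))
    # sort callees so the result is deterministic
    return {caller: sorted(callees) for caller, callees in adjacency.items()}


edges = [(['SampleClassA', 'test_call_to_another_function'], ['SampleClassB', 'test'])]
print(to_adjacency(edges))
```

A mapping like this is easy to serialize or to insert into a database, which is closer to the indexing goal described in the question.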