How do I generate a tree from the output of a dependency parser?

VIV*_*VEK 5 python dictionary nlp nltk stanford-nlp

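I am trying to build a tree (a nested dictionary) from the output of a dependency parser. The sentence is "I shot an elephant in my sleep". I was able to get the output described in the linked question: How do I do dependency parsing in NLTK?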

nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)
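The relation lines above are plain strings, while the tree-building code works on (relation, governor, dependent) tuples. Here is a minimal sketch of that conversion, assuming every line has the form `rel(gov-i, dep-j)`. The names `parse_relations` and `add_root` are hypothetical helpers, not part of NLTK or the Stanford tools, and the root is assumed to be any governor that never appears as a dependent.

```python
import re

# Matches lines such as "nsubj(shot-2, I-1)".
RELATION_RE = re.compile(r"^(\S+?)\((.+)-(\d+), (.+)-(\d+)\)$")

def parse_relations(lines):
    """Turn 'rel(gov-i, dep-j)' strings into (rel, gov, dep) tuples."""
    triples = []
    for line in lines:
        match = RELATION_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a relation line
        rel, gov, _gov_idx, dep, _dep_idx = match.groups()
        triples.append((rel, gov, dep))
    return triples

def add_root(triples):
    """Prepend a ('ROOT', 'ROOT', word) entry for each governor that is never a dependent."""
    dependents = {dep for _rel, _gov, dep in triples}
    root_words = []
    for _rel, gov, _dep in triples:
        if gov not in dependents and gov not in root_words:
            root_words.append(gov)
    return [("ROOT", "ROOT", word) for word in root_words] + triples

parser_output = """nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)"""

list_of_tuples = add_root(parse_relations(parser_output.splitlines()))
print(list_of_tuples)
```

This sketch drops the token indices, so a sentence that repeats a word would map both occurrences to the same key. Keeping the `word-index` form as the key would avoid that.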

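To convert this list of tuples into a nested dictionary, I used the approach from the following question: How to convert python list of tuples into tree?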

def build_tree(list_of_tuples):
    # Map each dependent word to ((relation, governor), children).
    all_nodes = {n[2]: ((n[0], n[1]), {}) for n in list_of_tuples}
    root = {}
    print(all_nodes)
    for rel, gov, dep in list_of_tuples:
        if gov != 'ROOT':  # compare strings by value, not by identity
            all_nodes[gov][1][dep] = all_nodes[dep]
        else:
            root[dep] = all_nodes[dep]
    return root

The output is as follows:

{'shot': (('ROOT', 'ROOT'),
  {'I': (('nsubj', 'shot'), {}),
   'elephant': (('dobj', 'shot'), {'an': (('det', 'elephant'), {})}),
   'sleep': (('nmod', 'shot'),
    {'in': (('case', 'sleep'), {}), 'my': (('nmod:poss', 'sleep'), {})})})}

To find the root-to-leaf path, I used the following question: Return root to certain leaf from a Nested Dictionary Tree

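[Building the tree and finding the path are two separate things.] The second goal is to find the path from the root to a leaf node, as is done in Return root to certain leaf from a Nested Dictionary Tree. However, I want the root-to-leaf path expressed as dependency relations. So, for example, when I call recurse_category(categories, 'an'), where categories is the nested tree structure and 'an' is a word in the tree, I should get ROOT-nsubj-dobj (the dependency relations up to the root) as output.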
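One possible way to collect the relations along the path, working directly on the `{word: ((relation, governor), children)}` structure shown in the output above, is a short depth-first search. This is only a sketch: `relation_path` is a hypothetical name, and it assumes each word occurs at most once in the tree.

```python
def relation_path(tree, target):
    """Return the relations from the root down to `target`, or None if absent."""
    for word, ((rel, _gov), children) in tree.items():
        if word == target:
            return [rel]
        sub_path = relation_path(children, target)
        if sub_path is not None:
            return [rel] + sub_path
    return None

# The tree printed by build_tree for the example sentence.
tree = {'shot': (('ROOT', 'ROOT'),
  {'I': (('nsubj', 'shot'), {}),
   'elephant': (('dobj', 'shot'), {'an': (('det', 'elephant'), {})}),
   'sleep': (('nmod', 'shot'),
    {'in': (('case', 'sleep'), {}), 'my': (('nmod:poss', 'sleep'), {})})})}

print('-'.join(relation_path(tree, 'an')))  # ROOT-dobj-det
```

On this tree, 'an' sits under 'elephant' (dobj), which sits under 'shot' (ROOT), so the search yields ROOT, dobj, det for that word.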

amy*_*amy 0

This converts the output into nested-dictionary form. If I manage to find the path as well, I will let you know. Maybe this helps.

list_of_tuples = [('ROOT', 'ROOT', 'shot'), ('nsubj', 'shot', 'I'),
                  ('det', 'elephant', 'an'), ('dobj', 'shot', 'elephant'),
                  ('case', 'sleep', 'in'), ('nmod:poss', 'sleep', 'my'),
                  ('nmod', 'shot', 'sleep')]

# First pass: create one node per dependent word.
nodes = {}
for rel, parent, child in list_of_tuples:
    nodes[child] = {'Name': child, 'Relationship': rel}

# Second pass: attach each node to its parent; root nodes go into the forest.
forest = []
for rel, parent, child in list_of_tuples:
    node = nodes[child]
    if parent == 'ROOT':  # this should be the root node
        forest.append(node)
    else:
        parent_node = nodes[parent]
        parent_node.setdefault('children', []).append(node)

print(forest)

The output is a nested dictionary (wrapped in a list):

[{'Name': 'shot', 'Relationship': 'ROOT', 'children': [{'Name': 'I', 'Relationship': 'nsubj'}, {'Name': 'elephant', 'Relationship': 'dobj', 'children': [{'Name': 'an', 'Relationship': 'det'}]}, {'Name': 'sleep', 'Relationship': 'nmod', 'children': [{'Name': 'in', 'Relationship': 'case'}, {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]

The following function can help you find the root-to-leaf path:

def recurse_category(categories, to_find):
    # Depth-first search; returns (found, list of relations from root to to_find).
    for category in categories:
        if category['Name'] == to_find:
            return True, [category['Relationship']]
        if 'children' in category:
            found, path = recurse_category(category['children'], to_find)
            if found:
                return True, [category['Relationship']] + path
    return False, []
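As a quick check, here is a self-contained sketch that runs the function above on the forest printed earlier and joins the relations with hyphens. The hyphen-joined format is an assumption based on the wording of the question.

```python
def recurse_category(categories, to_find):
    # Depth-first search; returns (found, list of relations from root to to_find).
    for category in categories:
        if category['Name'] == to_find:
            return True, [category['Relationship']]
        if 'children' in category:
            found, path = recurse_category(category['children'], to_find)
            if found:
                return True, [category['Relationship']] + path
    return False, []

# The forest produced for "I shot an elephant in my sleep".
forest = [{'Name': 'shot', 'Relationship': 'ROOT', 'children': [
    {'Name': 'I', 'Relationship': 'nsubj'},
    {'Name': 'elephant', 'Relationship': 'dobj', 'children': [
        {'Name': 'an', 'Relationship': 'det'}]},
    {'Name': 'sleep', 'Relationship': 'nmod', 'children': [
        {'Name': 'in', 'Relationship': 'case'},
        {'Name': 'my', 'Relationship': 'nmod:poss'}]}]}]

found, path = recurse_category(forest, 'an')
if found:
    print('-'.join(path))  # ROOT-dobj-det

# A word that is not in the tree yields (False, []).
print(recurse_category(forest, 'tiger'))
```

The returned list includes the relation of the target word itself as its last element. Drop that last element if only the ancestors' relations are wanted.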