Cen*_*tAu 6 python tree nltk parse-tree
I'm using nltk's Tree data structure to work with parse-tree strings.
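from nltk.tree import Tree

# In NLTK 3 a bracketed string is parsed with Tree.fromstring();
# passing the string directly to Tree() no longer works.
parsed = Tree.fromstring('(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (RB so) (JJ nice))) (. .)))')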
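However, the data structure seems limited. Is it possible to get a node by its string value and then navigate up or down the tree from there?
For example, suppose you want to get the node whose string value is "nice" and then see what its parent, children, and so on are. Can this be done with nltk's Tree?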
Jes*_*sme 12
For NLTK 3.0, you want to use the ParentedTree subclass.
http://www.nltk.org/api/nltk.html#nltk.tree.ParentedTree
Using the sample tree you gave, create a ParentedTree and search for the node you want:
from nltk.tree import ParentedTree

ptree = ParentedTree.fromstring('(ROOT (S (NP (PRP It)) \
(VP (VBZ is) (ADJP (RB so) (JJ nice))) (. .)))')

leaf_values = ptree.leaves()

if 'nice' in leaf_values:
    # Index of the first leaf equal to 'nice', counted left to right
    leaf_index = leaf_values.index('nice')
    # Tuple of child indices leading from the root to that leaf
    tree_location = ptree.leaf_treeposition(leaf_index)
    print(tree_location)
    print(ptree[tree_location])
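To go from that leaf to its parent and neighbours, one option is to drop the last index of the leaf's tree position, which should give the preterminal subtree wrapping the word, and then use the navigation methods that ParentedTree provides. The following is only a sketch under that assumption; the variable names (leaf_position, preterminal, and so on) are mine, not from the question.

```python
from nltk.tree import ParentedTree

ptree = ParentedTree.fromstring(
    '(ROOT (S (NP (PRP It)) (VP (VBZ is) (ADJP (RB so) (JJ nice))) (. .)))'
)

# Path of child indices from the root down to the leaf 'nice'
leaf_index = ptree.leaves().index('nice')
leaf_position = ptree.leaf_treeposition(leaf_index)

# A leaf is a plain string and has no parent(); step up one level
# to the subtree that directly contains it, here (JJ nice).
preterminal = ptree[leaf_position[:-1]]

parent = preterminal.parent()              # (ADJP (RB so) (JJ nice))
left_sibling = preterminal.left_sibling()  # (RB so)

print(preterminal.label())
print(parent.label())
print(left_sibling)
print(list(parent))  # children of the parent node
```

Note that leaves().index() only finds the first occurrence of a word, so a sentence with repeated words would need a loop over all matching indices.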
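You can traverse the tree directly to get subtrees. The parent() method finds the parent of a given subtree.
Here is an example with a deeper tree that visits children and prints their parents: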
from nltk.tree import ParentedTree

ptree = ParentedTree.fromstring('(ROOT (S (NP (JJ Congressional) \
(NNS representatives)) (VP (VBP are) (VP (VBN motivated) \
(PP (IN by) (NP (NP (ADJ shiny) (NNS money))))))) (. .))')

def traverse(t):
    try:
        t.label()
    except AttributeError:
        # Leaves are plain strings with no label(); nothing to do
        return
    else:
        if t.height() == 2:  # preterminal: a node directly above a leaf
            print(t.parent())
            return

        for child in t:
            traverse(child)

traverse(ptree)
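The same parent() method can also be followed repeatedly to walk from a node all the way up to the root. Below is a rough sketch of that idea; ancestor_labels is a hypothetical helper written for illustration, not part of NLTK, and it relies on parent() returning None at the root.

```python
from nltk.tree import ParentedTree

ptree = ParentedTree.fromstring(
    '(ROOT (S (NP (JJ Congressional) (NNS representatives)) '
    '(VP (VBP are) (VP (VBN motivated) '
    '(PP (IN by) (NP (NP (ADJ shiny) (NNS money))))))) (. .))'
)

def ancestor_labels(node):
    """Collect the labels of all ancestors of node, nearest first.

    Hypothetical helper: climbs with parent() until the root is passed.
    """
    labels = []
    current = node.parent()
    while current is not None:
        labels.append(current.label())
        current = current.parent()
    return labels

# Find the preterminal (height 2) subtree whose single leaf is 'money'
money_node = next(
    t for t in ptree.subtrees(lambda t: t.height() == 2)
    if t.leaves() == ['money']
)

path_to_root = ancestor_labels(money_node)
print(money_node.label(), path_to_root)
```

Going downward needs no extra API: iterating over a subtree yields its children, and subtrees() walks every node beneath it.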