art*_*ras 4 nlp parse-tree stanford-nlp
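This may be a silly question, but how do I iterate over a parse tree that an NLP parser (such as Stanford NLP) produces as output? It is all nested brackets, which is neither an array nor a dictionary nor any other collection type I have used.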
(ROOT\n (S\n (PP (IN As)\n (NP (DT an) (NN accountant)))\n (NP (PRP I))\n (VP (VBP want)\n (S\n (VP (TO to)\n (VP (VB make)\n (NP (DT a) (NN payment))))))))
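This particular output format of the Stanford Parser is called a "bracketed parse (tree)". It is meant to be read as a graph with ROOT at the top, phrase and part-of-speech labels (S, NP, VP, IN, ...) on the inner nodes, and the words as leaves (in this case you can treat it as a directed acyclic graph (DAG), since it is unidirectional and non-cyclic).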
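To make that structure concrete, here is a minimal sketch in plain Python that turns a bracketed parse into nested `(label, children)` tuples and walks it depth-first. The helper names `parse_bracketed`, `walk`, and `leaves` are hypothetical and not part of any library, and the sketch assumes well-formed input with balanced brackets.

```python
import re


def parse_bracketed(text):
    """Parse a bracketed parse string into nested (label, children) tuples.

    Assumes balanced brackets; words (leaves) are kept as plain strings.
    """
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested phrase
            else:
                children.append(tokens[pos])   # a word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read_node()


def walk(node, depth=0):
    """Yield (depth, label_or_word) pairs, parent before children."""
    if isinstance(node, str):
        yield depth, node
        return
    label, children = node
    yield depth, label
    for child in children:
        yield from walk(child, depth + 1)


def leaves(node):
    """Collect the words of the sentence from left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node[1] for word in leaves(child)]


text = ("(ROOT (S (PP (IN As) (NP (DT an) (NN accountant))) (NP (PRP I)) "
        "(VP (VBP want) (S (VP (TO to) (VP (VB make) (NP (DT a) (NN payment))))))))")
tree = parse_bracketed(text)

# Print one node per line, indented by its depth in the tree.
for depth, item in walk(tree):
    print("  " * depth + item)
```

Because every node is just a label plus a list of children, a recursive function is the natural way to iterate; a library such as NLTK wraps the same idea in a ready-made class.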
There are libraries that can read bracketed parses, for example NLTK's nltk.tree.Tree (http://www.nltk.org/howto/tree.html):
>>> from nltk.tree import Tree
>>> output = '(ROOT (S (PP (IN As) (NP (DT an) (NN accountant))) (NP (PRP I)) (VP (VBP want) (S (VP (TO to) (VP (VB make) (NP (DT a) (NN payment))))))))'
>>> parsetree = Tree.fromstring(output)
>>> print(parsetree)
(ROOT
  (S
    (PP (IN As) (NP (DT an) (NN accountant)))
    (NP (PRP I))
    (VP
      (VBP want)
      (S (VP (TO to) (VP (VB make) (NP (DT a) (NN payment))))))))
>>> parsetree.pretty_print()
                           ROOT
                            |
                            S
      ______________________|________
     |                  |            VP
     |                  |    ________|____
     |                  |   |             S
     |                  |   |             |
     |                  |   |             VP
     |                  |   |     ________|___
     PP                 |   |    |            VP
  ___|___               |   |    |    ________|___
 |       NP             NP  |    |   |            NP
 |    ___|______        |   |    |   |         ___|_____
 IN  DT         NN     PRP VBP   TO  VB       DT        NN
 |   |          |       |   |    |   |        |         |
 As  an     accountant  I  want  to make      a      payment
>>> parsetree.leaves()
['As', 'an', 'accountant', 'I', 'want', 'to', 'make', 'a', 'payment']
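Beyond `leaves()`, the actual iteration asked about in the question can be sketched with a few other `Tree` methods. This is a minimal example, assuming NLTK 3 is installed; the variable names `noun_phrases`, `tagged`, and `child_labels` are only illustrative.

```python
from nltk.tree import Tree

output = ('(ROOT (S (PP (IN As) (NP (DT an) (NN accountant))) (NP (PRP I)) '
          '(VP (VBP want) (S (VP (TO to) (VP (VB make) (NP (DT a) (NN payment))))))))')
parsetree = Tree.fromstring(output)

# Visit every subtree, parent before children, and show what it covers.
for subtree in parsetree.subtrees():
    print(subtree.label(), subtree.leaves())

# Keep only the subtrees with a given label, here the noun phrases.
noun_phrases = [
    " ".join(subtree.leaves())
    for subtree in parsetree.subtrees(lambda t: t.label() == "NP")
]

# (word, tag) pairs for the whole sentence.
tagged = parsetree.pos()

# A Tree behaves like a list of its children, so indexing and plain
# for-loops work; children are either Tree objects or plain strings (words).
child_labels = [
    child.label() if isinstance(child, Tree) else child
    for child in parsetree[0]
]
print(noun_phrases)
print(tagged)
print(child_labels)
```

Since a `Tree` is a list subclass, a hand-written recursive function over `for child in tree` is also possible when `subtrees()` does not give the traversal order you need.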