Can I find the subject from a spaCy dependency tree using NLTK in Python?

use*_*923 0 python nlp spacy

I want to find the subject of a sentence using spaCy. The code below works fine and prints the dependency tree.

import spacy
from nltk import Tree

en_nlp = spacy.load('en_core_web_sm')  # spaCy 3 removed the 'en' shortcut; use a full model name

doc = en_nlp("The quick brown fox jumps over the lazy dog.")

def to_nltk_tree(node):
    if node.n_lefts + node.n_rights > 0:
        return Tree(node.orth_, [to_nltk_tree(child) for child in node.children])
    else:
        return node.orth_


for sent in doc.sents:
    to_nltk_tree(sent.root).pretty_print()


Starting from this dependency tree code, can I find the subject of the sentence?

小智 12

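I'm not sure whether you want to write code against the NLTK parse tree (see "How to identify the subject of a sentence?"). However, spaCy already marks the subject for you: it is the token whose word.dep_ attribute carries the 'nsubj' label.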

import spacy
from nltk import Tree

en_nlp = spacy.load('en_core_web_sm')  # spaCy 3 removed the 'en' shortcut; use a full model name

doc = en_nlp("The quick brown fox jumps over the lazy dog.")

sentence = next(doc.sents)
# Print each token together with its dependency label
for word in sentence:
    print("%s:%s" % (word, word.dep_))

The:det
quick:amod
brown:amod
fox:nsubj
jumps:ROOT
over:prep
the:det
lazy:amod
dog:pobj
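Rather than reading the labels by eye, the subject can be picked out by filtering on `dep_`. The following is only a minimal sketch: `find_subjects` is a hypothetical helper name, and including `nsubjpass` (passive subjects) alongside `nsubj` is an assumption about which labels you care about. It relies only on the `text`, `dep_` and `head` attributes of spaCy tokens.

```python
def find_subjects(tokens, subject_deps=("nsubj", "nsubjpass")):
    """Return (subject, governing word) pairs for every subject token.

    `tokens` is any iterable of spaCy tokens, e.g. a Doc or a sentence Span.
    `subject_deps` is an assumed set of labels; adjust it to your needs.
    """
    return [
        (tok.text, tok.head.text)  # tok.head is the word the subject attaches to
        for tok in tokens
        if tok.dep_ in subject_deps
    ]

# Usage with the `doc` parsed above:
#     print(find_subjects(doc))
```

The governing word is returned with each subject so that it stays clear which verb a given subject belongs to.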

Keep in mind that there can be more complicated cases, where a sentence contains more than one subject.

>>> doc2 = en_nlp(u'When we study hard, we usually do well.')
>>> sentence2 = next(doc2.sents)
>>> for word in sentence2:
...     print("%s:%s" % (word, word.dep_))
... 
When:advmod
we:nsubj
study:advcl
hard:advmod
,:punct
we:nsubj
usually:advmod
do:ROOT
well:advmod
.:punct
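When a sentence has several clauses, as in the output above, it may help to group each subject under the word it attaches to and to single out the subject of the main clause. This is a sketch under assumptions, not a definitive method: `subjects_by_head` and `main_clause_subjects` are hypothetical helper names, and treating the subject whose head is labelled `ROOT` as "the" subject of the sentence is a simplifying assumption.

```python
from collections import defaultdict

SUBJECT_DEPS = ("nsubj", "nsubjpass")  # assumed set of subject labels


def subjects_by_head(tokens, subject_deps=SUBJECT_DEPS):
    """Group subject tokens by the word that governs them.

    Keys are (position, text) of the governing token so that two
    identical words at different positions are kept apart.
    """
    grouped = defaultdict(list)
    for tok in tokens:
        if tok.dep_ in subject_deps:
            grouped[(tok.head.i, tok.head.text)].append(tok.text)
    return dict(grouped)


def main_clause_subjects(tokens, subject_deps=SUBJECT_DEPS):
    """Return the subjects attached directly to the ROOT of the parse."""
    return [
        tok.text
        for tok in tokens
        if tok.dep_ in subject_deps and tok.head.dep_ == "ROOT"
    ]

# Usage with `sentence2` from the session above:
#     print(subjects_by_head(sentence2))
#     print(main_clause_subjects(sentence2))
```

For the second example sentence, this should keep the "we" governed by "study" apart from the "we" governed by "do", provided the parser attaches them as the printed labels suggest.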