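Is there a way to find proper nouns using NLTK WordNet? That is, can I tag possessive nouns using NLTK WordNet?
I am trying to write a keyword extraction program using the Stanford POS tagger and NER. For keyword extraction, I am only interested in proper nouns. Here is the basic approach.
Sample code: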
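import re

from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tag import StanfordNERTagger, StanfordPOSTagger

docText = "Jack Frost works for Boeing Company. He manages 5 aircraft and their crew in London"
# Split on runs of non-word characters (raw string avoids an invalid escape)
words = re.split(r"\W+", docText)
stops = set(stopwords.words("english"))
# Remove stop words and very short tokens from the list
words = [w for w in words if w not in stops and len(w) > 2]
# Stemming
pstem = PorterStemmer()
words = [pstem.stem(w) for w in words]
# Penn Treebank noun tags: common (NN, NNS) and proper (NNP, NNPS)
nounsWeWant = {'NN', 'NNS', 'NNP', 'NNPS'}
finalWords = []
# Stanford taggers; the model files must be locatable by NLTK
stn = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz')
stp = StanfordPOSTagger('english-bidirectional-distsim.tagger')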
for …
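
To make the goal concrete, here is a minimal sketch of the filtering step I have in mind once the tokens are tagged. It assumes the tagger returns a list of (word, tag) tuples with Penn Treebank tags, which is the shape NLTK taggers produce from `tag()`. The helper name `proper_noun_phrases` and the tagged sample are hypothetical: the tags below were written by hand for illustration and are not real tagger output.

```python
from itertools import groupby

# Penn Treebank tags for singular and plural proper nouns
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def proper_noun_phrases(tagged_tokens):
    """Join runs of adjacent proper-noun tokens into single phrases.

    tagged_tokens: list of (word, tag) tuples.
    Hypothetical helper, not part of NLTK.
    """
    phrases = []
    runs = groupby(tagged_tokens, key=lambda pair: pair[1] in PROPER_NOUN_TAGS)
    for is_proper, run in runs:
        if is_proper:
            phrases.append(" ".join(word for word, _tag in run))
    return phrases


# Hand-tagged sample for illustration only (an assumption, NOT tagger output)
tagged = [
    ("Jack", "NNP"), ("Frost", "NNP"), ("works", "VBZ"), ("for", "IN"),
    ("Boeing", "NNP"), ("Company", "NNP"), (".", "."),
    ("He", "PRP"), ("manages", "VBZ"), ("5", "CD"), ("aircraft", "NN"),
    ("and", "CC"), ("their", "PRP$"), ("crew", "NN"), ("in", "IN"),
    ("London", "NNP"),
]

print(proper_noun_phrases(tagged))
# → ['Jack Frost', 'Boeing Company', 'London']
```

Grouping adjacent proper nouns keeps multi-word names such as "Jack Frost" together instead of returning them as separate keywords. Note that this sketch works on unstemmed, unfiltered tokens; taggers generally lean on capitalization and sentence context, so tagging before stop-word removal and stemming may give more reliable NNP/NNPS tags.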