How to get synonyms from NLTK WordNet in Python

use*_*113 27 python nltk wordnet

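WordNet is great, but I'm having a hard time getting synonyms in nltk. If you search the WordNet web interface for a word like "small", it shows all of the synonyms.

Basically I just need to know the following: wn.synsets('word')[i].option(), where option can be hypernyms or antonyms. But what is the option for getting synonyms?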

小智 41

If you want the synonyms within a synset (that is, the lemmas that make up the set), you can get them with lemma_names():

>>> from nltk.corpus import wordnet as wn
>>> for ss in wn.synsets('small'):
...     print(ss.name(), ss.lemma_names())
...
small.n.01 ['small']
small.n.02 ['small']
small.a.01 ['small', 'little']
minor.s.10 ['minor', 'modest', 'small', 'small-scale', 'pocket-size', 'pocket-sized']
little.s.03 ['little', 'small']
small.s.04 ['small']
humble.s.01 ['humble', 'low', 'lowly', 'modest', 'small']
...

  • The OP really should mark this answer as correct. (6 upvotes)

aba*_*ert 9

You've already got the synonyms. That's what a Synset is.

>>> wn.synsets('small')
[Synset('small.n.01'),
 Synset('small.n.02'),
 Synset('small.a.01'),
 Synset('minor.s.10'),
 Synset('little.s.03'),
 Synset('small.s.04'),
 Synset('humble.s.01'),
 Synset('little.s.07'),
 Synset('little.s.05'),
 Synset('small.s.08'),
 Synset('modest.s.02'),
 Synset('belittled.s.01'),
 Synset('small.r.01')]

That's the same list of top-level entries that the web interface gave you.

If you also want the "similar to" list, that's not the same thing as the synonyms. For that, you call similar_tos() on each Synset.

So, to show the same information as the website, start with something like this:

from nltk.corpus import wordnet as wn

for ss in wn.synsets('small'):
    print(ss)
    # Indent each "similar to" synset under its parent synset
    for sim in ss.similar_tos():
        print('    {}'.format(sim))

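Of course, the website also prints the part of speech (sim.pos()), the list of lemmas (sim.lemma_names()), the definition (sim.definition()), and the examples (sim.examples()) for each synset at both levels. It also groups them by part of speech, adds links to other things you can follow, and so on. But this should be enough to get you started.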
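
To make the paragraph above concrete, here is a minimal sketch that renders those extra fields for each synset and its "similar to" entries. It assumes NLTK 3.x, where pos(), lemma_names(), definition(), and examples() are methods. The helper names format_synset, describe, and describe_word are made up for illustration, and the layout is only a rough approximation of what the website shows.

```python
def format_synset(ss, indent=""):
    """Render one synset on a single line: name, POS, lemmas, definition, examples."""
    lemmas = ", ".join(ss.lemma_names())
    examples = "; ".join(ss.examples())
    line = "{}{} ({}) [{}]: {}".format(
        indent, ss.name(), ss.pos(), lemmas, ss.definition()
    )
    if examples:
        line += ' e.g. "{}"'.format(examples)
    return line


def describe(synsets):
    """Return one line per synset, each followed by its indented similar-to synsets."""
    lines = []
    for ss in synsets:
        lines.append(format_synset(ss))
        for sim in ss.similar_tos():
            lines.append(format_synset(sim, indent="    "))
    return lines


def describe_word(word):
    """Hypothetical convenience wrapper around wn.synsets()."""
    # Deferred import: assumes NLTK 3.x with the WordNet corpus downloaded.
    from nltk.corpus import wordnet as wn
    return describe(wn.synsets(word))
```

Something like print('\n'.join(describe_word('small'))) would then list each sense of "small" with its similar-to synsets indented beneath it, provided the corpus has been fetched with nltk.download('wordnet').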

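  • This post's suggestion that `wn.synsets('word')` returns the synonyms of "word" is completely wrong. Instead, the function returns a list of the different semantic concepts of "word". The synonyms of a concept (i.e. of a synset) can be obtained with `wn.synsets('word')[i].lemmas()`. (17 upvotes)
  • @charbugs, I agree: this answer is wrong. For example, "wiz" is a synonym of one sense of "whiz", i.e. it is a word with a different spelling but the same meaning. If the answer we're commenting on were correct, the output of `wn.synsets('whiz')` would include "wiz", but it doesn't. However, the output of `for synset in wn.synsets('whiz'): print(synset.lemma_names())` *does* include "wiz". (2 upvotes)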
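
The distinction made in these comments can be checked directly: a word's synonyms are the lemma names inside each synset, not the synsets themselves. The sketch below reports which senses of a word list a given lemma. The helper names senses_containing and senses_of are hypothetical, and the code assumes NLTK 3.x with the WordNet corpus installed.

```python
def senses_containing(synsets, lemma):
    """Return the names of the synsets that list `lemma` among their lemma names."""
    return [ss.name() for ss in synsets if lemma in ss.lemma_names()]


def senses_of(word, lemma):
    """Hypothetical wrapper: which senses of `word` have `lemma` as a synonym?"""
    # Deferred import: assumes NLTK 3.x with the WordNet corpus downloaded.
    from nltk.corpus import wordnet as wn
    return senses_containing(wn.synsets(word), lemma)
```

Going by the comment above, senses_of('whiz', 'wiz') should come back non-empty, even though none of the synsets returned for "whiz" is named after "wiz".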

Kas*_*mvd 9

You can use wordnet.synsets together with the lemmas of each synset to get all the synonyms:

For example:

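from itertools import chain
from nltk.corpus import wordnet

text = 'change'  # the word to look up
synonyms = wordnet.synsets(text)  # one Synset per sense of the word
# Flatten the lemma names of every synset into a single de-duplicated set
lemmas = set(chain.from_iterable(word.lemma_names() for word in synonyms))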

Demo:

>>> synonyms = wordnet.synsets('change')
>>> set(chain.from_iterable([word.lemma_names() for word in synonyms]))
set([u'interchange', u'convert', u'variety', u'vary', u'exchange', u'modify', u'alteration', u'switch', u'commute', u'shift', u'modification', u'deepen', u'transfer', u'alter', u'change'])
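
For Python 3, a variant of the same idea might look like the sketch below. The function names collect_synonyms and synonyms_of, the choice to drop the query word itself, and the conversion of WordNet's underscores in multiword lemmas into spaces are all assumptions made for illustration, not part of the answer above.

```python
def collect_synonyms(synsets, exclude=None):
    """Gather the lemma names of all given synsets into a sorted, de-duplicated list."""
    names = {name for ss in synsets for name in ss.lemma_names()}
    if exclude is not None:
        # Drop the query word itself (case-insensitive comparison).
        names = {n for n in names if n.lower() != exclude.lower()}
    # WordNet joins multiword lemmas with underscores, e.g. "ice_cream".
    return sorted(n.replace("_", " ") for n in names)


def synonyms_of(word):
    """Hypothetical wrapper: all synonyms of `word` across every sense."""
    # Deferred import: assumes NLTK 3.x with the WordNet corpus downloaded.
    from nltk.corpus import wordnet
    return collect_synonyms(wordnet.synsets(word), exclude=word)
```

Note that this pools synonyms from every sense of the word, so the result mixes meanings. Filtering the synsets first (for instance by part of speech) before passing them to collect_synonyms would narrow it down.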