from spacy.en import English
from numpy import dot
from numpy.linalg import norm
parser = English()
# you can access known words from the parser's vocabulary
nasa = parser.vocab['NASA']
# cosine similarity
cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))
# gather all known words, take only the lowercased versions
allWords = list({w for w in parser.vocab if w.has_repvec and w.orth_.islower() and w.lower_ != "nasa"})
# sort by similarity to NASA
allWords.sort(key=lambda w: cosine(w.repvec, nasa.repvec))
allWords.reverse()
print("Top 10 most similar words to NASA:")
for word in allWords[:10]:
    print(word.orth_)
I tried to run the example above, but I get the following error:
Traceback (most recent call last):
  File "C:\Users\bulusu.kiran\Documents\WORK\nlp\wordVectors1.py", line 8, in <module>
    nasa = parser.vocab['NASA']
  File "spacy/vocab.pyx", line 330, in spacy.vocab.Vocab.__getitem__ (spacy/vocab.cpp:7708)
    orth = id_or_string
TypeError: an integer is required
The example is taken from: Intro to NLP with spaCy
What is causing this error?
小智 6
Which version of Python are you using? This may be the result of a Unicode error; I got it working in Python 2.7 by replacing
nasa = parser.vocab['NASA']
with
nasa = parser.vocab[u'NASA']
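In Python 2, 'NASA' is a byte string while u'NASA' is a unicode string. The traceback suggests that the lookup treats any key that is not unicode text as an integer ID, which would explain the "an integer is required" message. The sketch below is a toy illustration of that kind of dispatch, not spaCy's actual code; `ToyVocab` and its methods are hypothetical names. It is written for Python 3, where `"NASA"` plays the role of `u'NASA'` and `b"NASA"` plays the role of the Python 2 byte string.

```python
class ToyVocab:
    """Hypothetical stand-in for a vocabulary; NOT spaCy's implementation."""

    def __init__(self):
        self._ids = {}     # text -> integer ID
        self._words = []   # integer ID -> text

    def add(self, word):
        """Register a word and return its integer ID."""
        if word not in self._ids:
            self._ids[word] = len(self._words)
            self._words.append(word)
        return self._ids[word]

    def __getitem__(self, id_or_string):
        # Text keys are resolved to an ID first.
        if isinstance(id_or_string, str):
            return self._words[self.add(id_or_string)]
        # Anything else must already be an integer ID.
        if isinstance(id_or_string, int):
            return self._words[id_or_string]
        raise TypeError("an integer is required")


vocab = ToyVocab()
print(vocab["NASA"])      # text key: works
try:
    vocab[b"NASA"]        # byte-string key: rejected
except TypeError as err:
    print("TypeError:", err)
```
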
Then you will get this error:
AttributeError: 'spacy.lexeme.Lexeme' object has no attribute 'has_repvec'
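There is a similar issue in the spaCy repo, but both problems can be fixed by replacing has_repvec with has_vector and repvec with vector. I will also comment on that GitHub thread.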
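If the same script has to run against both the older and the newer attribute names, one option is a small helper that tries each pair in turn. This is only a sketch: `lexeme_vector`, `OldLexeme` and `NewLexeme` are hypothetical names used for illustration, and the two toy classes merely imitate the attribute names mentioned above.

```python
def lexeme_vector(lexeme):
    """Return the lexeme's vector under either naming scheme, or None."""
    for flag, attr in (("has_vector", "vector"), ("has_repvec", "repvec")):
        # getattr with a default avoids the AttributeError shown above.
        if getattr(lexeme, flag, False):
            return getattr(lexeme, attr)
    return None


class OldLexeme:
    """Toy object using the older attribute names (illustration only)."""
    has_repvec = True
    repvec = [0.1, 0.2]


class NewLexeme:
    """Toy object using the newer attribute names (illustration only)."""
    has_vector = True
    vector = [0.3, 0.4]


print(lexeme_vector(OldLexeme()))
print(lexeme_vector(NewLexeme()))
print(lexeme_vector(object()))
```
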
The complete, updated code I used:
import spacy
from numpy import dot
from numpy.linalg import norm
parser = spacy.load('en')
nasa = parser.vocab[u'NASA']
# cosine similarity
cosine = lambda v1, v2: dot(v1, v2) / (norm(v1) * norm(v2))
# gather all known words, take only the lowercased versions
allWords = list({w for w in parser.vocab if w.has_vector and w.orth_.islower() and w.lower_ != "nasa"})
# sort by similarity to NASA
allWords.sort(key=lambda w: cosine(w.vector, nasa.vector))
allWords.reverse()
print("Top 10 most similar words to NASA:")
for word in allWords[:10]:
    print(word.orth_)
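The ranking step above sorts the whole vocabulary and then reverses it. As a possible alternative, `heapq.nlargest` picks the top entries directly. The sketch below uses plain Python lists and made-up toy vectors instead of spaCy lexemes, so every name and number in it is an assumption for illustration. It also returns 0.0 for zero-length vectors rather than dividing by zero.

```python
import heapq
import math


def cosine(v1, v2):
    """Cosine similarity of two equal-length sequences; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0


def most_similar(target, candidates, k=10):
    """Return the k (word, vector) pairs whose vectors are closest to target."""
    return heapq.nlargest(k, candidates, key=lambda item: cosine(item[1], target))


# Toy data, invented for this example only.
toy_vectors = {
    "space": [0.9, 0.1, 0.0],
    "rocket": [0.8, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
    "empty": [0.0, 0.0, 0.0],
}
target = [1.0, 0.2, 0.0]

top = most_similar(target, toy_vectors.items(), k=2)
print([word for word, _ in top])
```
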
Hope this helps!