NLP example from my assignment document crashes

Question

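I'm new to NLP and I'm trying out the example code from my assignment document, but it gives me errors.

For example:

"ModelsWarning: [W007] The model you're using has no word vectors loaded, so the result of the Token.similarity method will be based on the tagger, parser and NER, which may not give useful similarity judgements. This may happen if you're using one of the small models, e.g. `en_core_web_sm`, which don't ship with word vectors and only use context-sensitive tensors. You can always add your own word vectors, or use one of the larger models instead if available."

The second part, which I typed in exactly as it appears in the document, gives this error:

"can only concatenate str (not "numpy.float64") to str"

I'm probably just doing something silly, but I'd appreciate some insight into why this happens.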

import spacy
nlp = spacy.load('en')

tokens = nlp('cat apple monkey banana')

for token1 in tokens:
    for token2 in tokens:
        print(token1.text, token2.text, token1.similarity(token2))


print("\nWorking With Sentences\n")

sentence_to_compare = 'Why is my cat on the car'

sentences = ["Where did my dog go",
             'hello, where is my car',
             'I\'ve lost my car in my car',
             'i\'d like my boat back',
             'I will name my dog Diana'
             ]

model_sentences = nlp(sentence_to_compare)

for sentence in sentences:
    similarity = nlp(sentence).similarity(model_sentences)
    print(sentence + "-" + similarity)

Answer 1

小智 7

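spaCy's small models (en_core_web_sm, en) do not give the best results for the similarity method because they do not ship with word vectors. That is why you get the warning in the console. So I think you should use en_core_web_lg instead of the small model.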
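
Note that the warning is not what stops the script. The TypeError comes from the last line of the question's code, where `sentence + "-" + similarity` tries to join a str with the number returned by `similarity()`. Below is a minimal sketch of one way around it: format the score instead of concatenating it. The helper names `format_similarity` and `compare_sentences` are made up for illustration, and `nlp` is assumed to be whatever pipeline you have loaded.

```python
def format_similarity(sentence, similarity):
    """Build the output line with string formatting instead of `+`.

    `similarity` may be a plain float or a numpy.float64; both accept
    the `.3f` format specifier, so no explicit str() call is needed.
    """
    return f"{sentence} - {similarity:.3f}"


def compare_sentences(nlp, reference, sentences):
    """Return one formatted line per sentence, compared against `reference`.

    `nlp` is assumed to be a callable that turns text into an object
    with a `.similarity(other)` method, e.g. a loaded spaCy pipeline.
    """
    reference_doc = nlp(reference)
    return [
        format_similarity(sentence, nlp(sentence).similarity(reference_doc))
        for sentence in sentences
    ]


# Stand-in score, only to show the formatting without needing a model.
print(format_similarity("Where did my dog go", 0.7312))
```

With spaCy this would be called as `compare_sentences(spacy.load('en_core_web_lg'), sentence_to_compare, sentences)`, assuming that model has been installed first (for example with `python -m spacy download en_core_web_lg`). Wrapping the score in `str(similarity)` in the original print line would also work.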
