Word Mover's Distance in Python

Question

Ski*_*ish 3 python text nlp information-retrieval python-3.x

I am trying to compute the similarity of two texts using WMD (Word Mover's Distance). I tried the following code in Python 3, using gensim:

import gensim

word2vec_model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
word2vec_model.init_sims(replace=True)  # normalizes vectors
distance = word2vec_model.wmdistance("string 1", "string 2")  # Compute WMD as normal.

However, I don't think this gives me the correct value. How should I do this in Python?

Answer 1

Hir*_*san 5

Split the strings into tokens first:

distance = word2vec_model.wmdistance("string 1".split(), "string 2".split())
# result: 0.4114476676950455

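The arguments need to be lists of strings (one token per element). If you pass a plain string, it is iterated character by character, so the distance is computed over single characters rather than over words.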
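To make the difference concrete, here is a minimal sketch. The helper names `tokenize` and `wmd` are made up for illustration and are not part of gensim, and `model` is assumed to be an already loaded `KeyedVectors` object such as `word2vec_model` from the question. Plain whitespace splitting is only the simplest possible tokenizer; real text will probably also need punctuation handling.

```python
def tokenize(text):
    """Split a text on whitespace (a deliberately simple, hypothetical tokenizer)."""
    return text.split()


def wmd(model, text_a, text_b):
    """Tokenize both texts, then pass the token lists to model.wmdistance.

    `model` is assumed to be a loaded gensim KeyedVectors object.
    """
    return model.wmdistance(tokenize(text_a), tokenize(text_b))


# What gets iterated over in each case:
print(list("string 1"))      # a raw string yields single characters
print(tokenize("string 1"))  # a token list yields whole words
```

With the model from the question this would be called as `wmd(word2vec_model, "string 1", "string 2")`.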
