Tag: word2vec

from gensim.models import word2vec, Phrases
documents = ["the mayor of new york was there", "human computer interaction and machine learning has now become a trending research area","human computer interaction is interesting","human computer interaction is a pretty interesting subject", "human computer interaction is a great and new subject", "machine learning can be useful sometimes","new york mayor was present", "I love machine learning because it is a new subject area", "human computer interaction helps people to get user friendly applications"]
sentence_stream …


python gensim word2vec

Author

2017-09-27

5
votes

1
answer

1401
views

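Installing Faiss on Google Colaboratory

I am trying to follow the instructions for the MUSE project.

It requires PyTorch and Faiss. PyTorch was easy to install, but I ran into problems installing Faiss.

The MUSE instructions tell me to use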

conda install faiss-cpu -c pytorch


But Google Colab does not support conda (when I tried !pip install conda, it did not work).

When I ran !pip install faiss, Faiss did not work either.

Is there a way to install Faiss or conda?
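One route that avoids conda, sketched here as an assumption and not a confirmed fix: to my knowledge the PyPI package is published under the name faiss-cpu, which is a different name from the plain faiss the question tried. In a Colab cell that would typically be !pip install faiss-cpu. The helper names below are hypothetical, and nothing is installed unless install=True is passed.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_install_command(package="faiss-cpu"):
    """Build a pip command bound to the interpreter that is currently running."""
    # Assumption: "faiss-cpu" is the PyPI distribution name for CPU-only Faiss.
    return [sys.executable, "-m", "pip", "install", package]


def ensure_faiss(install=False):
    """Return True if the faiss module is importable, optionally installing it first."""
    if importlib.util.find_spec("faiss") is None and install:
        subprocess.check_call(pip_install_command())
        # Make the freshly installed package visible to the import system.
        importlib.invalidate_caches()
    return importlib.util.find_spec("faiss") is not None


if __name__ == "__main__":
    print("faiss importable:", ensure_faiss(install=False))
```

Calling pip through sys.executable keeps the install tied to the same Python environment the notebook kernel uses, which is a common source of "installed but cannot import" confusion.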

python pip conda word2vec google-colaboratory

Kor*_*ich


5
votes

3
answers

2025
views

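Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I am working on node2vec. The code runs fine on a small dataset, but as soon as I run the same code on a large dataset, it crashes.

Error: Process finished with exit code 134 (interrupted by signal 6: SIGABRT).

The line that raises the error is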

model = Word2Vec(walks, size=args.dimensions, window=args.window_size, min_count=0, sg=1, workers=args.workers,
                 iter=args.iter)


I am using PyCharm and Python 3.5.

Any idea what is going on? I could not find any post that solves my problem.
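The post does not establish the cause of the abort. If memory pressure from holding every walk in a Python list is a factor (an assumption, not a diagnosis), one thing to try is writing the walks to disk and streaming them back. The sketch below is a restartable iterable over a text file with one walk per line; the class name and the file layout are hypothetical.

```python
class WalkCorpus:
    """Restartable iterable over random walks stored one per line in a text file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Re-open the file on every pass so that each training epoch can re-read it.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                tokens = line.split()  # node ids stay as strings
                if tokens:  # skip blank lines
                    yield tokens
```

An instance could then be passed where the question passes walks, since gensim's Word2Vec accepts any iterable of token lists that can be iterated more than once.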

python pycharm gensim word2vec

Mr *_*ohi


5
votes

1
answer

20k
views

gensim Word2Vec: continuing training on an existing model - AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

I am trying to continue training an existing model:

model = gensim.models.Word2Vec.load('model/corpus.zhwiki.word.model')
more_sentences = [['Advanced', 'users', 'can', 'load', 'a', 'model', 'and', 'continue', 'training', 'it', 'with', 'more', 'sentences']]    
model.build_vocab(more_sentences, update=True)
model.train(more_sentences, total_examples=model.corpus_count, epochs=model.iter)


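But the last line raises an error:

AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

Some posts say this is caused by using an earlier version of gensim, so I tried adding the following after loading the existing model and before calling train():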

model.compute_loss = False


After that, the AttributeError is gone, but model.train() returns 0 and the model is not trained on the new sentences.

How can I fix this?
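One thing worth checking, offered as a guess and not a diagnosis: if none of the new tokens are already in the loaded model's vocabulary and each occurs fewer than min_count times, there may be nothing left to train on, which would be consistent with a result of 0. The pure-Python sketch below estimates that. The function name, its arguments, and the simplified keep rule are assumptions, and it ignores gensim's subsampling of frequent words.

```python
from collections import Counter


def count_trainable_tokens(sentences, known_vocab, min_count=5):
    """Estimate how many tokens would survive vocabulary filtering.

    Simplified rule (an assumption): a token is kept if it is already in the
    existing vocabulary, or if it occurs at least min_count times in the new data.
    """
    counts = Counter(token for sentence in sentences for token in sentence)
    kept = {t for t, c in counts.items() if t in known_vocab or c >= min_count}
    return sum(1 for sentence in sentences for token in sentence if token in kept)


more_sentences = [['Advanced', 'users', 'can', 'load', 'a', 'model', 'and',
                   'continue', 'training', 'it', 'with', 'more', 'sentences']]

# With no overlap with the existing vocabulary, every token is below min_count.
print(count_trainable_tokens(more_sentences, known_vocab=set()))  # 0
```

With a real model, known_vocab could be taken from the loaded vocabulary (model.wv.vocab in gensim 3.x, model.wv.key_to_index in gensim 4.x).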

python nlp gensim word2vec

did*_*isy

2019-10-24

5
votes

2
answers

2519
views

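Merging layers in Keras (dot product)

I have been following a Towards Data Science tutorial on word2vec and the skip-gram model, but I ran into a problem I cannot solve, despite hours of searching and many unsuccessful attempts.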

https://towardsdatascience.com/understanding-feature-engineering-part-4-deep-learning-methods-for-text-data-96c44370bbfa

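The step where it shows how to build the skip-gram model architecture appears to be deprecated, because it uses the Merge layer from keras.layers.

I have seen a lot of discussion about this, and most answers say that you now need to use Keras's functional API to merge layers. The problem is that I am a beginner with Keras and do not know how to convert my code from Sequential to functional. Here is the code the author used (which I copied):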

from keras.layers import Merge
from keras.layers.core import Dense, Reshape
from keras.layers.embeddings import Embedding
from keras.models import Sequential

# build skip-gram architecture
word_model = Sequential()
word_model.add(Embedding(vocab_size, embed_size,
                         embeddings_initializer="glorot_uniform",
                         input_length=1))
word_model.add(Reshape((embed_size, )))

context_model = Sequential()
context_model.add(Embedding(vocab_size, embed_size,
                  embeddings_initializer="glorot_uniform",
                  input_length=1))
context_model.add(Reshape((embed_size,)))

model = Sequential()
model.add(Merge([word_model, context_model], mode="dot"))
model.add(Dense(1, kernel_initializer="glorot_uniform", activation="sigmoid"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")

# view model summary …
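For comparison, here is a minimal functional-API sketch of the same two-branch architecture, assuming TensorFlow 2.x with tensorflow.keras is installed. It swaps Merge(mode="dot") for the Dot layer and drops input_length in favor of explicit Input layers; the function name build_skipgram and the input names are hypothetical, and this is one possible conversion, not the tutorial author's own.

```python
def build_skipgram(vocab_size, embed_size):
    """Functional-API sketch of the two-branch skip-gram model from the question."""
    # Imported lazily so this file can be loaded even where TensorFlow is absent.
    from tensorflow.keras.layers import Dense, Dot, Embedding, Input, Reshape
    from tensorflow.keras.models import Model

    # One integer id per example for the target word and for the context word.
    word_in = Input(shape=(1,), name="word")
    context_in = Input(shape=(1,), name="context")

    word_vec = Embedding(vocab_size, embed_size,
                         embeddings_initializer="glorot_uniform")(word_in)
    word_vec = Reshape((embed_size,))(word_vec)

    context_vec = Embedding(vocab_size, embed_size,
                            embeddings_initializer="glorot_uniform")(context_in)
    context_vec = Reshape((embed_size,))(context_vec)

    # Dot(axes=1) takes the dot product of the two embedding vectors.
    score = Dot(axes=1)([word_vec, context_vec])
    output = Dense(1, kernel_initializer="glorot_uniform",
                   activation="sigmoid")(score)

    model = Model(inputs=[word_in, context_in], outputs=output)
    model.compile(loss="mean_squared_error", optimizer="rmsprop")
    return model
```

Calling build_skipgram(vocab_size, embed_size).summary() would print the layer table; training then takes a list of two id arrays, one per input.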


python word2vec keras tensorflow word-embedding

Luc*_*edo

2018-09-28

5
votes

1
answer

10k
views

Can't get attribute 'Word2VecKeyedVectors' on <module 'gensim.models.keyedvectors'>

I trained and saved a gensim word2vec model:

W2V_MODEL_FN = r"C:\Users\models\w2v.model"

model = Word2Vec(X, size=150, window=3, min_count=2, workers=10)
model.train(X, total_examples=len(X), epochs=50)
model.save(W2V_MODEL_FN)


And then:

w2v_model = Word2Vec.load(W2V_MODEL_FN)


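It works perfectly in one environment, but in another I get the error:

{AttributeError}Can't get attribute 'Word2VecKeyedVectors' on <module 'gensim.models.keyedvectors' from 'C:\Users\Anaconda3_New\envs\ISP_env\lib\site-packages\gensim\models\keyedvectors.py'>

So I guess this could be a package version issue?

But I cannot figure out what it is. Any ideas?

Thanks!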
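Since the same file loads in one environment and fails in the other, comparing the installed gensim versions is a cheap first check. A minimal sketch using only the standard library (Python 3.8 or later); the function name is hypothetical.

```python
from importlib import metadata


def installed_version(package="gensim"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this in both environments and compare the two results.
    print("gensim:", installed_version("gensim"))
```

If the two versions differ, saving and loading the model with the same gensim version in both places is the simplest way to rule the mismatch in or out.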

python nlp gensim word2vec

ore*_*isp


5
votes

1
answer

3471
views