How do I fix a RuntimeError when running gensim's LDA model?

Question

B61*_*612 1 model runtime-error runtime lda gensim

I am getting a runtime error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  0%|          | 0/29 [00:48<?, ?it/s]


when I try to run this code:

import gensim
from tqdm import tqdm

def topic_model_coherence_generator(corpus, texts, dictionary, start_topic_count=2, end_topic_count=10, step=1, cpus=1):
    # Note: cpus is accepted but never used below.
    models = []
    coherence_scores = []
    for topic_nums in tqdm(range(start_topic_count, end_topic_count + 1, step)):
        # Train one LDA model per candidate topic count, using the arguments passed in
        lda_model = gensim.models.LdaModel(corpus=corpus, id2word=dictionary, chunksize=1740, alpha='auto', eta='auto',
                                           random_state=42, iterations=500, num_topics=topic_nums, passes=20, eval_every=None)

        # Score the trained model with c_v coherence
        cv_coherence_model_lda = gensim.models.CoherenceModel(model=lda_model, corpus=corpus,
                                                              texts=texts, dictionary=dictionary,
                                                              coherence='c_v')

        coherence_score = cv_coherence_model_lda.get_coherence()
        coherence_scores.append(coherence_score)
        models.append(lda_model)
    return models, coherence_scores

lda_models, coherence_scores = topic_model_coherence_generator(corpus=bow_corpus,
                                                               texts=norm_corpus_bigrams,
                                                               dictionary= dictionary,
                                                               start_topic_count=2,
                                                               end_topic_count=30,
                                                               step=1, cpus=16)


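What I want is to find the optimal number of topics for my corpus, so that I can extract the topics and interpret the topic-model results. I am a biologist, so I don't know how to solve this. Thank you for your help.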

Answer 1

goj*_*omo 5

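It is good practice, and may be required on Windows, to put your code inside a "main" block before using other code that relies on Python's multiprocessing. See this answer for more details: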


/sf/answers/4232196461/


(…and perhaps the other answer it references).

