Side-by-side Wordclouds in matplotlib

ADJ*_*ADJ 5 python matplotlib word-cloud lda

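I am using the WordCloud package to display the words generated by scikit-learn's LDA (Latent Dirichlet Allocation). For each topic that LDA generates, I get one figure. I would like to plot all the figures in a grid so they can be viewed side by side. Essentially, I have a function that takes an LDA model and the LDA topic I want to visualize as input, and then plots a wordcloud: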

from wordcloud import WordCloud
import matplotlib.pyplot as plt
SEED=0

def topicWordCloud(model, topicNumber, WCmaxWords,WCwidth, WCheight):
    topic = model.components_[topicNumber]
    tupleList = [(tf_feature_names[i],int(topic[i]/topic.sum()*10000)) for i in range(len(topic))]
    wordcloud = WordCloud(width=WCwidth, height=WCheight, max_words=WCmaxWords, random_state=42).generate_from_frequencies(tupleList)
    plt.figure( figsize=(20,10) )
    plt.imshow(wordcloud)
    plt.axis("off")

topicWordCloud(model=lda, topicNumber=2, WCmaxWords=100,WCwidth=800, WCheight=600)

How can I loop over all my topics (n_topics) to visualize all the figures in a grid? I was thinking of something like:

fig = plt.figure()
for i in range(n_topics):
    plt.subplot(2,1,i+1) 
    #something here

tmd*_*son 5

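Return the wordcloud from your function, then call topicWordCloud from within your for loop. Then call imshow on the Axes that you create with fig.add_subplot. For example, something like this: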

def topicWordCloud(model, topicNumber, WCmaxWords, WCwidth, WCheight):
    topic = model.components_[topicNumber]
    # Recent wordcloud releases expect a {word: frequency} dict here; older ones took a list of tuples.
    frequencies = {tf_feature_names[i]: int(topic[i]/topic.sum()*10000) for i in range(len(topic))}
    wordcloud = WordCloud(width=WCwidth, height=WCheight, max_words=WCmaxWords, random_state=42).generate_from_frequencies(frequencies)
    return wordcloud

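n_cols = 2
n_rows = (n_topics + n_cols - 1) // n_cols  # enough rows to hold every topic

fig = plt.figure()
for i in range(n_topics):
    # a fixed (2,1) grid only has room for two subplots, so size the grid from n_topics
    ax = fig.add_subplot(n_rows, n_cols, i+1)
    wordcloud = topicWordCloud(...)

    ax.imshow(wordcloud)
    ax.axis('off')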
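For completeness, here is a minimal sketch of the same idea written around plt.subplots, so the grid is sized from the number of topics instead of being hard-coded. The names grid_shape, topic_frequencies and plot_topic_clouds are hypothetical helpers, not part of the question or of any library, and the sketch assumes a wordcloud release whose generate_from_frequencies accepts a dict mapping words to frequencies.

```python
import math


def grid_shape(n_items, n_cols=2):
    """Return (n_rows, n_cols) with enough cells to hold n_items."""
    if n_items < 1 or n_cols < 1:
        raise ValueError("n_items and n_cols must be positive")
    n_cols = min(n_cols, n_items)  # never use more columns than items
    return math.ceil(n_items / n_cols), n_cols


def topic_frequencies(weights, feature_names):
    """Map each word to its share of the topic's total weight."""
    total = float(sum(weights))
    return {name: w / total for name, w in zip(feature_names, weights)}


def plot_topic_clouds(components, feature_names, n_cols=2, max_words=100,
                      width=800, height=600):
    """Draw one word cloud per topic on a shared grid and return the figure."""
    # Imported here so the helpers above stay usable without plotting libraries.
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud

    n_rows, n_cols = grid_shape(len(components), n_cols)
    # squeeze=False keeps `axes` two-dimensional even for a single row or column.
    fig, axes = plt.subplots(n_rows, n_cols,
                             figsize=(6 * n_cols, 4.5 * n_rows),
                             squeeze=False)
    for ax in axes.flat:
        ax.axis("off")  # also blanks any unused cells in the last row
    for i, (ax, weights) in enumerate(zip(axes.flat, components)):
        cloud = WordCloud(width=width, height=height, max_words=max_words,
                          random_state=42)
        cloud.generate_from_frequencies(
            topic_frequencies(weights, feature_names))
        ax.imshow(cloud, interpolation="bilinear")
        ax.set_title(f"Topic {i}")
    fig.tight_layout()
    return fig
```

With the objects from the question, this would presumably be called as fig = plot_topic_clouds(lda.components_, tf_feature_names), followed by plt.show().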