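I am trying to use BERTopic to analyze the topic distribution of my documents. After running BERTopic, I want to compute, for each document, the probability of each topic. How should I do this?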
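from bertopic import BERTopic

# define model
# (vectorizer_model and df1 are defined earlier; not shown here)
model = BERTopic(verbose=True,
                 vectorizer_model=vectorizer_model,
                 embedding_model='paraphrase-MiniLM-L3-v2',
                 min_topic_size=50,
                 nr_topics=10)
# train model
headline_topics, _ = model.fit_transform(df1.review_processed3)
# examine one of the topics
freq = model.get_topic_info()    # topic overview table (was undefined in the original snippet)
a_topic = freq.iloc[0]["Topic"]  # Select the first row (this may be the outlier topic -1)
model.get_topic(a_topic)         # Show the words and their c-TF-IDF scores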
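Below are the words of one of the topics and their c-TF-IDF scores (image 1).
How can I turn the result into a topic distribution like the one below, so that I can compute the topic distribution scores and identify the dominant topic? (image 2)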
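A minimal sketch of what "identify the dominant topic" could look like once a per-document probability matrix is available. It assumes the model is created with `calculate_probabilities=True`, in which case `fit_transform` should return a second value with one row per document and one column per topic (the outlier topic -1 is not included). The helper name `dominant_topics` and the example numbers are hypothetical and only for illustration.

```python
def dominant_topics(probs):
    """Return (topic_index, probability) of the most likely topic for each document.

    `probs` is any sequence of rows, one row per document and one value per
    topic, e.g. a list of lists or a 2-D NumPy array.
    """
    results = []
    for row in probs:
        row = list(row)
        best = max(range(len(row)), key=row.__getitem__)  # index of the largest value
        results.append((best, row[best]))
    return results


# Hypothetical probabilities for 3 documents over 3 topics (made-up values).
example_probs = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.30, 0.45, 0.25],
]
main_topics = dominant_topics(example_probs)

# Assumed usage with BERTopic (not executed here):
#   model = BERTopic(calculate_probabilities=True, ...)
#   topics, probs = model.fit_transform(docs)
#   main_topics = dominant_topics(probs)
```

Each row of the matrix is the topic distribution of one document, so it can also be put into a table directly (one column per topic) to get something like image 2. Without `calculate_probabilities=True`, the second return value only holds the probability of the assigned topic, not the full distribution.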