Posts by sim*_*mon

Convergence of GSDMM clustering (short text clustering)

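I am using this GSDMM Python implementation to cluster a dataset of text messages. According to the original paper, GSDMM converges quickly (in about 5 iterations). In my runs the number of populated clusters does settle, but a large number of messages are still transferred in every iteration, so many messages keep changing their cluster.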

My output looks like this:

In stage 0: transferred 9511 clusters with 150 clusters populated 
In stage 1: transferred 4974 clusters with 138 clusters populated 
In stage 2: transferred 2533 clusters with 90 clusters populated
….
In stage 34: transferred 1403 clusters with 47 clusters populated 
In stage 35: transferred 1410 clusters with 47 clusters populated 
In stage 36: transferred 1430 clusters with 48 clusters populated 
In stage 37: transferred 1463 clusters with 48 …
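To make "converged" concrete for a log like this, one option is to parse the per-stage counts and look for the point where the number of transfers stops shrinking. The following is only a minimal sketch: `parse_log` and `first_plateau` are hypothetical helpers and not part of the GSDMM implementation, and the `window` and `tol` values are arbitrary assumptions. Since the sampler reassigns messages at random in each iteration, the transfer count may level off at some non-zero value rather than reach zero, so the sketch checks for a plateau instead of waiting for zero transfers.

```python
import re

# Matches complete log lines in the format shown in the question.
LINE_RE = re.compile(
    r"In stage (\d+): transferred (\d+) clusters with (\d+) clusters populated"
)


def parse_log(text):
    """Return a list of (stage, transferred, populated) tuples found in text."""
    return [tuple(int(g) for g in m.groups()) for m in LINE_RE.finditer(text)]


def first_plateau(records, window=3, tol=0.05):
    """Return the stage at which the transfer count first plateaus, else None.

    A plateau (hypothetical criterion) means that over `window` consecutive
    records the relative spread (max - min) / max is at most `tol`.
    """
    for i in range(window - 1, len(records)):
        recent = [moved for _, moved, _ in records[i - window + 1 : i + 1]]
        highest = max(recent)
        if highest == 0 or (highest - min(recent)) / highest <= tol:
            return records[i][0]
    return None


if __name__ == "__main__":
    # Three complete lines copied from the output above.
    sample = """\
In stage 34: transferred 1403 clusters with 47 clusters populated
In stage 35: transferred 1410 clusters with 47 clusters populated
In stage 36: transferred 1430 clusters with 48 clusters populated
"""
    records = parse_log(sample)
    print(records)
    print(first_plateau(records))
```

Truncated lines (such as the last one above) do not match the pattern and are skipped. A plateau found this way only says that the transfer count has stopped changing much; whether the clustering itself is stable would need a separate check, for example comparing cluster assignments between iterations.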

python cluster-analysis topic-modeling convergence
