Removing from a HashSet fails after iteration

wei*_*yin 3 java collections hash iterator hashset

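I have written an agglomerative clustering algorithm in Java and am running into trouble with a remove operation. It always seems to fail once the number of clusters reaches half of the initial number.

In the sample code below, clusters is a Collection<Collection<Integer>>.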

    while (clusters.size() > K) {
        // determine smallest distance between clusters
        Collection<Integer> minclust1 = null;
        Collection<Integer> minclust2 = null;
        double mindist = Double.POSITIVE_INFINITY;

        for (Collection<Integer> cluster1 : clusters) {
            for (Collection<Integer> cluster2 : clusters) {
                if (cluster1 == cluster2) {
                    continue;
                }
                // compute the distance once so the stored minimum matches the compared value
                double dist = getDistance(cluster1, cluster2);
                if (dist < mindist) {
                    minclust1 = cluster1;
                    minclust2 = cluster2;
                    mindist = dist;
                }
            }
        }

        // merge the two clusters
        minclust1.addAll(minclust2);
        clusters.remove(minclust2);
    }

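After a few passes through the loop, clusters.remove(minclust2) eventually returns false, and I don't understand why.

I tested this code by first creating 10 clusters, each containing a single integer from 1 to 10. The distances are random numbers between 0 and 1. Here is the output after adding a few println statements. After the number of clusters, I print the actual clusters, the merge operation, and the result of clusters.remove(minclust2).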

Clustering: 10 clusters
[[3], [1], [10], [5], [9], [7], [2], [4], [6], [8]]
[5] <- [6]
true
Clustering: 9 clusters
[[3], [1], [10], [5, 6], [9], [7], [2], [4], [8]]
[7] <- [8]
true
Clustering: 8 clusters
[[3], [1], [10], [5, 6], [9], [7, 8], [2], [4]]
[10] <- [9]
true
Clustering: 7 clusters
[[3], [1], [10, 9], [5, 6], [7, 8], [2], [4]]
[5, 6] <- [4]
true
Clustering: 6 clusters
[[3], [1], [10, 9], [5, 6, 4], [7, 8], [2]]
[3] <- [2]
true
Clustering: 5 clusters
[[3, 2], [1], [10, 9], [5, 6, 4], [7, 8]]
[10, 9] <- [5, 6, 4]
false
Clustering: 5 clusters
[[3, 2], [1], [10, 9, 5, 6, 4], [5, 6, 4], [7, 8]]
[10, 9, 5, 6, 4] <- [5, 6, 4]
false
Clustering: 5 clusters
[[3, 2], [1], [10, 9, 5, 6, 4, 5, 6, 4], [5, 6, 4], [7, 8]]
[10, 9, 5, 6, 4, 5, 6, 4] <- [5, 6, 4]
false

From there the [10, 9, 5, 6, 4, 5, 6, 4, ...] cluster just keeps growing indefinitely.

Edit: to clarify, I am using a HashSet<Integer> for each cluster in clusters (a HashSet<HashSet<Integer>>).

Tom*_*ine 5

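Ah. When you alter a value that is already in a Set (or is a Map key), it is no longer necessarily stored in the right position, and its hash code will have been cached. You need to remove it, alter it, and then re-insert it.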
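To make that concrete, here is a minimal sketch rather than the asker's actual program. The class name MutatedElementDemo and the helpers cluster and merge are made up for illustration. It assumes a HashSet<HashSet<Integer>> as described in the edit, and mirrors the trace in the question: [5, 6] absorbs [4] while it is still inside the outer set, so the later attempt to remove [5, 6, 4] looks it up under a hash code it was never stored with. The HashSet contract leaves behaviour unspecified once an element is mutated like this; the failed lookup is what OpenJDK's HashMap-backed implementation is expected to do.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MutatedElementDemo {

    // Hypothetical helper: builds a mutable cluster from the given values.
    static Set<Integer> cluster(Integer... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Mirrors the question: clusters are merged in place while they
    // are still elements of the outer HashSet.
    static boolean removeAfterInPlaceMerge() {
        Set<Set<Integer>> clusters = new HashSet<>();
        Set<Integer> a = cluster(5, 6);
        Set<Integer> b = cluster(4);
        Set<Integer> c = cluster(10, 9);
        clusters.addAll(Arrays.asList(a, b, c));

        a.addAll(b);         // a's hashCode changes while a sits in 'clusters'
        clusters.remove(b);  // works: b was never modified
        c.addAll(a);
        return clusters.remove(a); // lookup uses a's new hash code and misses
    }

    // Remove, alter, re-insert: 'into' leaves the outer set before it changes.
    static boolean merge(Set<Set<Integer>> clusters,
                         Set<Integer> into, Set<Integer> from) {
        boolean removed = clusters.remove(from);
        clusters.remove(into);
        into.addAll(from);
        clusters.add(into);  // stored again under its current hash code
        return removed;
    }

    static boolean removeWithReinsert() {
        Set<Set<Integer>> clusters = new HashSet<>();
        Set<Integer> a = cluster(5, 6);
        Set<Integer> b = cluster(4);
        Set<Integer> c = cluster(10, 9);
        clusters.addAll(Arrays.asList(a, b, c));

        merge(clusters, a, b);
        return merge(clusters, c, a);
    }

    public static void main(String[] args) {
        System.out.println("in-place merge, remove succeeded: " + removeAfterInPlaceMerge());
        System.out.println("remove/alter/re-insert, remove succeeded: " + removeWithReinsert());
    }
}
```

In the question's loop the same idea would mean replacing the two merge lines with something like merge(clusters, minclust1, minclust2). Because that happens after both for loops have finished, it does not modify the set during iteration. The earlier removals in the trace succeed because the clusters being removed there had never absorbed another cluster.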