Finding the optimal number of clusters using the elbow method and K-means clustering

Question

M S*_*ava 2 python k-means google-colaboratory

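I am writing a program in which I need to apply K-means clustering to a dataset of more than 200 arrays, each with 300 elements. Could someone point me to code, with an explanation, for: 1. finding k with the elbow method, and 2. applying K-means and getting the array of centroids?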

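I searched for the above myself but did not find anything that explains the code clearly. P.S. I am working in Google Colab, so if there is an approach specific to it, please suggest it.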

I tried the code below, but I keep getting the following error:

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

TypeError: float() argument must be a string or a number, not 'list'


The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)

<ipython-input-70-68e300fd4bf8> in <module>()
     24 
     25 # step 1: find optimal k (number of clusters)
---> 26 find_best_k()
     27 

3 frames

/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: setting an array element with a sequence.
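
A note on the traceback above: this ValueError typically means that NumPy could not build a rectangular numeric array, for example because the rows have different lengths or because a value that should be a number is itself a list. The sketch below is only an illustration under that assumption; `to_feature_matrix`, `good`, and `ragged` are hypothetical names, not part of the original code, which is not shown here.

```python
import numpy as np

def to_feature_matrix(rows):
    """Return rows as a 2-D float array, or raise a descriptive ValueError."""
    # Collect the distinct row lengths; a rectangular dataset has exactly one.
    lengths = sorted({len(row) for row in rows})
    if len(lengths) != 1:
        raise ValueError(f"rows have differing lengths: {lengths}")
    X = np.asarray(rows, dtype=float)
    if X.ndim != 2:
        raise ValueError(f"expected a 2-D array, got shape {X.shape}")
    return X

good = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # hypothetical well-formed data
ragged = [[1.0, 2.0, 3.0], [4.0, 5.0]]      # hypothetical malformed data

X = to_feature_matrix(good)
print(X.shape)

# Converting ragged input directly to a float array fails inside NumPy.
try:
    np.asarray(ragged, dtype=float)
except ValueError as exc:
    print("NumPy rejects ragged input:", exc)
```

Checking the shape of the data before clustering makes the failure point explicit instead of surfacing it several frames deep.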

Answer 1

len*_*ova 5

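As a complement to Roohollah's answer: note that the elbow method for finding the optimal number of clusters for K-means is purely visual, and the result can be ambiguous. You may therefore want to combine it with silhouette analysis, as described in the following articles: Choosing the Appropriate Number of Clusters (RealPython), Silhouette Method, including an implementation example in Python (TowardsDataScience), Silhouette analysis example (Scikit-learn), Silhouette (Wikipedia).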
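
As a rough illustration of combining the two, here is a minimal sketch using scikit-learn. It assumes the data is already a 2-D float array; the synthetic `make_blobs` data (210 samples, 300 features, 3 clusters), the helper name `scan_k`, and the range of k values are all assumptions made for the example, not taken from the question.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

def scan_k(X, k_values, seed=0):
    """Fit K-means for each k and record inertia and mean silhouette score."""
    inertias, silhouettes = [], []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        inertias.append(model.inertia_)  # within-cluster sum of squares (elbow curve)
        silhouettes.append(silhouette_score(X, model.labels_))
    return inertias, silhouettes

# Synthetic stand-in for the real dataset (assumption: 3 well-separated clusters).
X, _ = make_blobs(n_samples=210, n_features=300, centers=3,
                  cluster_std=1.0, random_state=0)

k_values = list(range(2, 8))  # the silhouette score needs at least 2 clusters
inertias, silhouettes = scan_k(X, k_values)
for k, inertia, sil in zip(k_values, inertias, silhouettes):
    print(f"k={k}  inertia={inertia:.1f}  silhouette={sil:.3f}")

# Pick the k with the highest mean silhouette, then refit to get the centroids.
best_k = k_values[int(np.argmax(silhouettes))]
final = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
centroids = final.cluster_centers_  # array of shape (best_k, n_features)
print(best_k, centroids.shape)
```

Plotting `inertias` against `k_values` gives the elbow curve; the silhouette scores provide a numeric check when the elbow is not obvious. This runs unchanged in Google Colab, where scikit-learn is preinstalled.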
