Interpreting the K-Means cluster_centers_ output

Question

Joh*_*tud 2 k-means python-3.x unsupervised-learning

I am having trouble interpreting the contents of the cluster_centers_ array.

Consider the following MWE:

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
import numpy as np

# Load the data
iris = load_iris()
X, y = iris.data, iris.target

# shuffle the data
shuffle = np.random.permutation(np.arange(X.shape[0]))
X = X[shuffle]

# scale X using the global mean and std over all values (not per feature)
X = (X - X.mean()) / X.std()

# set up the K-means model with 2 clusters
km = KMeans(n_clusters=2, n_init=10)

# fit the model to the data
km.fit(X)

# cluster centers, one row per cluster
km.cluster_centers_

array([[ 1.43706001, -0.29278015,  0.75703227, -0.89603057],
       [ 0.78079175, -0.04797174, -0.96467783, -1.60799713]])

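It is not clear to me how to use the values in the array above to identify the cluster centers. I asked K-Means for 2 clusters, but it returned 8 values, and these cannot be x, y coordinates for all 4 features.

If I plot 1.43706001, -0.29278015, the result is intuitive: the point sits in the middle of one of the predicted clusters.

So if that is the case, and my second cluster center is at 0.78079175, -0.04797174, what are the values in columns 2 and 3?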

Answer 1

Kat*_*ova 6

From the documentation, cluster_centers_ is an ndarray of shape (n_clusters, n_features).
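The documented shape can be checked directly. The following is a minimal sketch on the unscaled iris data; `random_state=0` is an assumption added here only to make the run repeatable and is not part of the question's code.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Iris has 150 samples and 4 features
X = load_iris().data

# random_state is an added assumption, used only for repeatability
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

centers = km.cluster_centers_
print(X.shape)        # (150, 4)
print(centers.shape)  # (2, 4): one row per cluster, one column per feature
```

The 8 values are therefore 2 rows of 4 coordinates each, not 4 separate (x, y) pairs.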

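The iris dataset has 4 features (X.shape = (150, 4)), and you asked KMeans for two centroids in that 4-dimensional feature space. cluster_centers_ is exactly that: each row of the array holds the coordinates of one centroid in R^4. The first two values in a row are only the centroid's coordinates along the first two features; the values in columns 2 and 3 are its coordinates along the third and fourth features.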
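One way to see this is to compare each row with the per-feature mean of the points assigned to that cluster. This is a sketch under assumptions: `random_state=0` and `tol=0` are choices made here, not in the question. With `tol=0` the fit should stop only once the labels no longer change, so each centroid is expected to match the mean of its members.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data

# tol=0 and random_state=0 are assumptions for this sketch
km = KMeans(n_clusters=2, n_init=10, tol=0, random_state=0).fit(X)

matches = []
for k, center in enumerate(km.cluster_centers_):
    # all samples that K-means assigned to cluster k
    members = X[km.labels_ == k]
    # mean of those samples along each of the 4 features
    member_mean = members.mean(axis=0)
    matches.append(np.allclose(center, member_mean))
    # pair every coordinate with the feature it belongs to
    for name, value in zip(iris.feature_names, center):
        print(f"cluster {k}: {name} = {value:.3f}")

all_match = all(matches)
print(all_match)
```

Plotting only columns 0 and 1 of a row gives the projection of that centroid onto the first two features, which is why the first pair of values looked like a sensible cluster center in a 2-D plot.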
