Posts by Jes*_*ini

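How do I use the silhouette score with k-means clustering from the sklearn library?

I would like to use the silhouette score in my script to automatically determine the number of clusters for k-means clustering from sklearn.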

import numpy as np
import pandas as pd
import csv
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

filename = "CSV_BIG.csv"

# Read the CSV file with the Pandas lib.
path_dir = ".\\"
dataframe = pd.read_csv(path_dir + filename, encoding = "utf-8", sep = ';' ) # "ISO-8859-1")
df = dataframe.copy(deep=True)

#Use silhouette score
range_n_clusters = list (range(2,10))
print ("Number of clusters from 2 to 9: \n", range_n_clusters)

for n_clusters in range_n_clusters:
    clusterer = KMeans (n_clusters=n_clusters).fit(?)
    preds = clusterer.predict(?)
    centers …
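
For reference, here is a minimal sketch of the kind of selection loop the question is aiming at. It is not the asker's full script: the synthetic `X` from `make_blobs` stands in for the numeric feature columns of the CSV (an assumption, since the file's columns are not shown), and the helper name `best_k_by_silhouette` is hypothetical. In the loop above, the `?` placeholders would presumably be that same numeric feature matrix.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Assumption: synthetic 2-D data replaces the numeric columns of the CSV.
# With a real DataFrame, something like df.select_dtypes(include="number")
# could supply the feature matrix instead.
X, _ = make_blobs(
    n_samples=400,
    centers=[[-10, -10], [-10, 10], [10, -10], [10, 10]],
    cluster_std=1.0,
    random_state=0,
)


def best_k_by_silhouette(X, k_values, random_state=0):
    """Fit KMeans for each k and return the k with the highest mean silhouette score."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(X)  # fit and assign cluster labels in one step
        scores[k] = silhouette_score(X, labels)  # mean silhouette over all samples
    best_k = max(scores, key=scores.get)
    return best_k, scores


# The silhouette score needs at least 2 clusters, so the range starts at 2.
best_k, scores = best_k_by_silhouette(X, range(2, 10))
for k, score in scores.items():
    print(f"n_clusters={k}: silhouette={score:.3f}")
print("Selected n_clusters:", best_k)
```

The score lies between -1 and 1, and higher values indicate tighter, better separated clusters, so picking the maximum is one common heuristic rather than a guaranteed answer.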


machine-learning k-means python-2.7 scikit-learn silhouette

Jes*_*ini

2019-05-28

8
Votes

1
Answers

10k
Views