Scikit Learn - How to plot probabilities

Question

Kay*_*Kay 4 python matplotlib scikit-learn

I want to plot the probabilities predicted by my model.

plt.scatter(y_test, prediction[:,0])
plt.xlabel("True Values")
plt.ylabel("Predictions")
plt.show()


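However, I get a chart like the one above. It makes some sense, but I would like to visualize the probability distribution better. Is there a way to do this when my actual classes are 0 or 1 and the predictions lie between 0 and 1?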

Answer 1

Ven*_*lam 7

The predicted probabilities can be used to visualize model performance. The true labels can be shown using color.

Try this example:

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=1, shuffle=False)

lr = LogisticRegression(random_state=0, solver='lbfgs', max_iter=10)
lr.fit(X, y)

# Column 1 of predict_proba holds the probability of the positive class (label 1)
prediction = lr.predict_proba(X)[:, 1]

# One histogram per true class, so that color encodes the true label
plt.figure(figsize=(15, 7))
plt.hist(prediction[y == 0], bins=50, label='Negatives')
plt.hist(prediction[y == 1], bins=50, label='Positives', alpha=0.7, color='r')
plt.xlabel('Probability of being Positive Class', fontsize=25)
plt.ylabel('Number of records in each bucket', fontsize=25)
plt.legend(fontsize=15)
plt.tick_params(axis='both', labelsize=25, pad=5)
plt.show()

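To keep the true class (0 or 1) on one axis and the predicted probability on the other, as the question asks, one option is a jittered strip plot. The following is only a sketch under assumptions that are not part of the answer: the train/test split, the default LogisticRegression settings, the jitter width, the seeds, and the helper name plot_probability_strip are all illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def plot_probability_strip(y_true, proba, ax=None, jitter_width=0.15, seed=0):
    """Scatter predicted probabilities (x) against jittered true classes (y).

    Hypothetical helper for illustration; not part of scikit-learn or Matplotlib.
    """
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 4))
    # Small vertical noise so points sharing a label do not sit on one line.
    rng = np.random.default_rng(seed)
    jitter = rng.uniform(-jitter_width, jitter_width, size=len(y_true))
    ax.scatter(proba, y_true + jitter, c=y_true, cmap='coolwarm', s=12, alpha=0.6)
    ax.axvline(0.5, color='gray', linestyle='--', label='0.5 threshold')
    ax.set_yticks([0, 1])
    ax.set_xlabel('Predicted probability of class 1')
    ax.set_ylabel('True class')
    ax.legend()
    return ax


# Synthetic data similar to the example above; the split is an assumption.
X, y = make_classification(n_samples=1000, n_features=4,
                           n_informative=2, n_redundant=0,
                           random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

lr = LogisticRegression().fit(X_train, y_train)
proba = lr.predict_proba(X_test)[:, 1]  # P(class 1) for the held-out rows

ax = plot_probability_strip(y_test, proba)
```

Call plt.show() afterwards to display the figure. Points of class 1 that fall left of the dashed line, and points of class 0 that fall right of it, are the cases a 0.5 cutoff would misclassify.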

Answer 2

ml4*_*294 1

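You can split the predictions by their true label and then plot two histograms, one per class, for example with the following (this should work at least if the true labels and the predictions, here y_test and prediction, are NumPy arrays):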

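import matplotlib.pyplot as plt

# Assumes y_test and prediction are 1-D NumPy arrays of equal length,
# with prediction holding the predicted probability of one class.
arr_true_0_indices = (y_test == 0.0)
arr_true_1_indices = (y_test == 1.0)

# Predictions grouped by true class
arr_pred_0 = prediction[arr_true_0_indices]
arr_pred_1 = prediction[arr_true_1_indices]

# density=True replaces the normed=True argument, which Matplotlib 3.1 removed
plt.hist(arr_pred_0, bins=40, label='True class 0', density=True, histtype='step')
plt.hist(arr_pred_1, bins=40, label='True class 1', density=True, histtype='step')
plt.xlabel('Network output')
plt.ylabel('Arbitrary units / probability')
plt.legend(loc='best')
plt.show()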


This should produce two overlaid step histograms, one per true class.
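
On the normalization option in the histogram calls above (spelled density=True in current Matplotlib): it rescales each class so that the area under its histogram is 1, which makes classes of different sizes comparable but means the bar heights are densities rather than probabilities. A small NumPy check, using made-up scores drawn from a Beta distribution purely for illustration, might look like this:

```python
import numpy as np

# Hypothetical predicted probabilities for one class (illustration only).
rng = np.random.default_rng(0)
scores = rng.beta(2, 5, size=500)

# plt.hist(..., density=True) relies on the same normalization as np.histogram.
density, edges = np.histogram(scores, bins=40, density=True)
widths = np.diff(edges)

# Bar height times bin width, summed over bins, gives the total area.
area = np.sum(density * widths)
print("area under histogram:", area)

# With narrow bins, individual bar heights can be larger than 1.
print("tallest bar:", density.max())
```

This is why the y-axis in the answer is labeled in arbitrary units: a bar of height 2 does not mean a probability of 2, only that the scores are concentrated in that bin.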
