sto*_*ock 5 python pandas scikit-learn logistic-regression
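I am using logistic regression for prediction. My predictions are 0's and 1's. I trained my model on the given data, and also on the important features only, i.e. X_important_train (see the screenshot). I get a score of about 70%, but when I call roc_auc_score(X, y) or roc_auc_score(X_important_train, y_train) I get a ValueError: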
ValueError: multiclass-multioutput format is not supported
Code:
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score
# Standardize features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
# Create the model, train it on the training sets, and check the score
model = LogisticRegression()
model.fit(X, y)
model.score(X, y)
model.fit(X_important_train, y_train)
model.score(X_important_train, y_train)
roc_auc_score(X_important_train, y_train)
Screenshot: (image not available)
First of all, the roc_auc_score function expects input arguments with the same shape.
sklearn.metrics.roc_auc_score(y_true, y_score, average='macro', sample_weight=None)
Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format.
y_true : array, shape = [n_samples] or [n_samples, n_classes]
True binary labels in binary label indicators.
y_score : array, shape = [n_samples] or [n_samples, n_classes]
Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).
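So the inputs are the true labels and the predicted scores, not the training features and labels that you used in the example you posted. In more detail: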
model.fit(X_important_train, y_train)
model.score(X_important_train, y_train)
# this is wrong here
roc_auc_score(X_important_train, y_train)
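To see where the message comes from, it can help to check how scikit-learn classifies whatever is passed as the first argument. The sketch below uses small made-up arrays (`X_demo` and `y_demo` are hypothetical stand-ins for `X_important_train` and `y_train`) and assumes the feature matrix holds integer values with more than two distinct levels, which is the case that yields this particular target type.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for X_important_train and y_train
X_demo = np.array([[1, 2], [3, 1], [2, 3], [1, 1]])  # 2-D integer feature matrix
y_demo = np.array([0, 1, 1, 0])                      # binary labels

# roc_auc_score treats its first argument as y_true, so the feature
# matrix is inspected as if it were a set of labels.
features_type = type_of_target(X_demo)
labels_type = type_of_target(y_demo)

try:
    roc_auc_score(X_demo, y_demo)  # same argument order as in the question
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(features_type, labels_type, error_message)
```

A 2-D matrix with several distinct integer values is read as a multiclass-multioutput target, which the metric rejects before it ever looks at the second argument.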
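You should do something like this instead:
# Predict on held-out data, then compare with its true labels (y_true)
y_pred = model.predict(X_test_data)
roc_auc_score(y_true, y_pred)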
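Since the documentation quoted above says y_score may be a probability estimate of the positive class, one possible variation is to pass `predict_proba` output rather than hard 0/1 predictions. This is a minimal sketch, assuming synthetic data from `make_classification` and a held-out split; the data and the variable names are illustrative and do not come from the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data, used only for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Probability of the positive class (column 1 corresponds to model.classes_[1])
y_score = model.predict_proba(X_test)[:, 1]

# Argument order: true labels first, scores second
auc_from_scores = roc_auc_score(y_test, y_score)

# Hard 0/1 predictions are accepted too, but they reduce the ROC curve
# to a single operating point.
auc_from_labels = roc_auc_score(y_test, model.predict(X_test))

print(auc_from_scores, auc_from_labels)
```

Both calls pass 1-D arrays of the same length, with the true labels first, which is the shape and order the function signature asks for.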