我想计算分类器的AUC,精度和准确度.我正在监督学习:
这是我的工作代码.此代码适用于二进制类,但不适用于多类.请假设您有一个包含二进制类的数据框:
sample_features_dataframe = self._get_sample_features_dataframe()
labeled_sample_features_dataframe = retrieve_labeled_sample_dataframe(sample_features_dataframe)
labeled_sample_features_dataframe, binary_class_series, multi_class_series = self._prepare_dataframe_for_learning(labeled_sample_features_dataframe)
k = 10
k_folds = StratifiedKFold(binary_class_series, k)
for train_indexes, test_indexes in k_folds:
train_set_dataframe = labeled_sample_features_dataframe.loc[train_indexes.tolist()]
test_set_dataframe = labeled_sample_features_dataframe.loc[test_indexes.tolist()]
train_class = binary_class_series[train_indexes]
test_class = binary_class_series[test_indexes]
selected_classifier = RandomForestClassifier(n_estimators=100)
selected_classifier.fit(train_set_dataframe, train_class)
predictions = selected_classifier.predict(test_set_dataframe)
predictions_proba = selected_classifier.predict_proba(test_set_dataframe)
roc += roc_auc_score(test_class, predictions_proba[:,1])
accuracy += accuracy_score(test_class, predictions)
recall += recall_score(test_class, predictions)
precision += precision_score(test_class, predictions)
Run Code Online (Sandbox Code Playgroud)
最后我将结果分成K当然是为了获得平均AUC,精度等.这段代码工作正常.但是,我不能为多类计算相同:
train_class = multi_class_series[train_indexes]
test_class = multi_class_series[test_indexes]
selected_classifier = RandomForestClassifier(n_estimators=100)
selected_classifier.fit(train_set_dataframe, train_class)
predictions = selected_classifier.predict(test_set_dataframe)
predictions_proba …Run Code Online (Sandbox Code Playgroud)