How do I get the ROC AUC score for multi-class classification in sklearn?
# this works
roc_auc_score([0,1,1], [1,1,1])
# this fails
from sklearn.metrics import roc_auc_score
ytest = [0,1,2,3,2,2,1,0,1]
ypreds = [1,2,1,3,2,2,0,1,1]
roc_auc_score(ytest, ypreds,average='macro',multi_class='ovo')
# AxisError: axis 1 is out of bounds for array of dimension 1
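I looked at the official documentation, but that did not resolve the problem.
I have several ROC curves of tidymodels/parsnip model performance that I would like to show together in one plot for visual comparison: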
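One plausible reading of the AxisError above: with `multi_class='ovo'`, `roc_auc_score` expects a score matrix of shape (n_samples, n_classes) holding per-class probabilities, not a 1-D array of predicted labels. The following is a minimal sketch under that assumption. The synthetic dataset and the `LogisticRegression` classifier are hypothetical stand-ins, not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical 4-class data standing in for the asker's ytest / ypreds.
X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Class probabilities, shape (n_samples, n_classes); each row sums to 1.
y_proba = clf.predict_proba(X_test)

# Pass the probability matrix, not the predicted labels.
score = roc_auc_score(y_test, y_proba, average='macro', multi_class='ovo')
print(score)
```

If only hard label predictions are available, there is no ranking of samples per class, so a multi-class AUC computed from them would carry little information even where it can be calculated.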
roc1 <- structure(list(.threshold = c(-Inf, 0.188422381048697, 0.23446542423272,
0.241282102642437, 0.259726705912688, 0.29097010004365, 0.309897370938121,
0.33607659920306, 0.348797482584728, 0.371543061749991, 0.37849110465008,
0.403024193339376, 0.408074451522232, 0.425203432699806, 0.43288528993523,
0.437168077386449, 0.441435377101706, 0.454812465942723, 0.46890082819098,
0.469324015885685, 0.471191285258535, 0.473285736958109, 0.484067175067965,
0.501634453233048, 0.502895404815678, 0.505260074955513, 0.509400496728661,
0.512826032440735, 0.514474796037162, 0.520894854910534, 0.52482313756493,
0.544137627333669, 0.546168394598085, 0.555557692971751, 0.562118235565918,
0.564565992908277, 0.572138872116962, 0.5792082477202, 0.611888118194463,
0.621908020887883, 0.623655143605973, 0.629887735979754, 0.632025630132792,
0.636193886667259, 0.638203230744601, 0.646775289308722, 0.655148011873394,
0.658581199234482, 0.658707835285112, 0.66292920495746, 0.6753497980617,
0.691520083977918, 0.702288194696498, 0.704440842146043, 0.724494989785773,
0.735933141947951, 0.756427437462373, 0.785412673453098, 0.831367501773009,
0.831554130258554, 0.840204698487284, 0.845340108802608, 0.876022993703215,
Inf), specificity = c(0, 0, 0.032258064516129, 0.0645161290322581, …
I took the ROC code directly from here: http://scikit-learn.org/stable/auto_examples/plot_roc.html
As you can see, I hard-coded the number of classes as 46 in the for loop, but even if I set it as low as 2 I still get the error.
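from sklearn.metrics import roc_curve, auc

# Compute ROC curve and ROC area for each class
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(46):
    # Indexing with [:, i] requires 2-D arrays (one column per class)
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])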
The error is:
Traceback (most recent call last):
  File "C:\Users\app\Documents\Python Scripts\gbc_classifier_test.py", line 150, in <module>
    fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_pred[:, i])
IndexError: too many indices
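y_pred is as you can see here:
Calling array.shape() raises an error saying that a tuple is not callable.
y_test is just a one-dimensional array like y_pred, except that it holds the true classes for my problem.
I don't understand: what has too many indices?
I have a logistic regression model (using R) as follows: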
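The traceback is consistent with `y_test` and `y_pred` being 1-D: `y_test[:, i]` asks for a column of a 2-D array. (`shape` is an attribute, so `y_pred.shape` without parentheses shows the dimensions.) A minimal sketch of one common workaround, assuming per-class scores are available: binarize the labels with `label_binarize` and compute one curve per column. The labels and scores below are hypothetical illustration data, not the asker's.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

# Hypothetical 1-D true labels and per-class scores for 3 classes.
y_test = np.array([0, 1, 2, 2, 1, 0])
y_score = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],
])

classes = [0, 1, 2]
# One indicator column per class, shape (n_samples, n_classes).
y_test_bin = label_binarize(y_test, classes=classes)

fpr, tpr, roc_auc = {}, {}, {}
for i in range(len(classes)):
    # Each column is now a binary one-vs-rest problem.
    fpr[i], tpr[i], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

print(y_test_bin.shape, roc_auc)
```

With a fitted classifier, the score matrix would typically come from `predict_proba` or `decision_function` rather than from `predict`, which returns labels only.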
fit6 <- glm(formula = survived ~ ascore + gini + failed, data=records, family = binomial)
summary(fit6)
I am using the pROC package to plot ROC curves and compute the AUC for six models, fit1 through fit6.
This is how I have plotted a single ROC curve:
prob6=predict(fit6,type=c("response"))
records$prob6 = prob6
g6 <- roc(survived~prob6, data=records)
plot(g6)
But is there a way to combine the ROC curves of all six models in one plot and show the AUC of each, and if possible the confidence intervals as well?
I want to add an ROC curve to a ggplot chart, but it returns an error.
library(ggplot2)
library(plotROC)
set.seed(2529)
D.ex <- rbinom(200, size = 1, prob = .5)
M1 <- rnorm(200, mean = D.ex, sd = .65)
M2 <- rnorm(200, mean = D.ex, sd = 1.5)
test <- data.frame(D = D.ex, D.str = c("Healthy", "Ill")[D.ex + 1],
M1 = M1, M2 = M2, stringsAsFactors = FALSE)
plot<-ggplot(longtest, aes(d = D, m = M1 )) + geom_roc() + style_roc()
plot
That works, but if I add another ROC line it returns an error:
plot<-ggplot(longtest, aes(d = D, m = M1 )) + geom_roc() + style_roc()
plot+ggplot(test, aes(d = …
I have a table like the following:
id State
1 True
2 False
3 True
4 False
5 False
6 True
7 True
8 False
Before displaying the rows, I need a running count of the True (Yes) and False (No) values. The result should look like the following table:
id State Yes No
1 True 1 0
2 False 1 1
3 True 2 1
4 False 2 2
5 False 2 3
6 True 3 3
7 True 4 3
8 False 4 4
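Up to and including row 6, there are 3 False and 3 True values. Any ideas?
I want to evaluate my classification model with ROC curves. I am struggling to compute multi-class ROC curves for a cross-validated dataset. Because of the cross-validation, there is no split into training and test sets. Below you can see the code I have tried so far.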
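The question does not say which language or database is in use, so the following is only a sketch in Python of the running count described above. The `rows` list reproduces the table from the question; the variable names are illustrative.

```python
from itertools import accumulate

# Rows from the table above: (id, State).
rows = [(1, True), (2, False), (3, True), (4, False),
        (5, False), (6, True), (7, True), (8, False)]

states = [state for _, state in rows]
# Running totals: each entry counts matching rows up to and including it.
yes = list(accumulate(int(s) for s in states))
no = list(accumulate(int(not s) for s in states))

for (row_id, state), y, n in zip(rows, yes, no):
    print(row_id, state, y, n)
```

In SQL the same idea is usually expressed as a cumulative sum ordered by id, but the exact syntax depends on the database.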
scaler = StandardScaler(with_mean=False)
enc = LabelEncoder()
y = enc.fit_transform(labels)
vec = DictVectorizer()
feat_sel = SelectKBest(mutual_info_classif, k=200)
n_classes = 3
# Pipeline for computing of ROC curves
clf = OneVsRestClassifier(LogisticRegression(solver='newton-cg', multi_class='multinomial'))
clf = clf.label_binarizer_
pipe = Pipeline([('vectorizer', vec),
                 ('scaler', scaler),
                 ('Logreg', clf),
                 ('mutual_info', feat_sel)])
y_pred = model_selection.cross_val_predict(pipe, instances, y, cv=10)
fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(y[:, i], y_pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
# Plot …
I am using the R package randomForest to build a model that classifies samples into three groups.
model = randomForest(formula = condition ~ ., data = train, ntree = 2000,
mtry = bestm, importance = TRUE, proximity = TRUE)
Type of random forest: classification
Number of trees: 2000
No. of variables tried at each split: 3
OOB estimate of error rate: 5.71%
Confusion matrix:
lethal mock resistant class.error
lethal 20 1 0 0.04761905
mock 1 37 0 0.02631579
resistant 2 0 9 0.18181818
I have tried several libraries. With ROCR, for example, you cannot handle three classes, only two. See:
pred=prediction(predictions,train$condition)
Error in prediction(predictions, train$condition) :
Number …
Both
pROC::auc(0:1, 1:0)
pROC::auc(0:1, 0:1)
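give an AUC of 1.
With more experimentation, it seems that max(AUC, 1-AUC) is always returned. Is there an option to change this? I could not find a GitHub repository where I could report the issue.
I am using the randomForest package in R to build a model that classifies cases as disease (1) or no disease (0):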
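To illustrate the relationship described above, here is a small sketch in Python using scikit-learn's `roc_auc_score` rather than pROC. It shows that reversing the direction of the scores turns an AUC of a into 1 - a, which is what a routine that picks the comparison direction automatically would hide. The labels and scores are hypothetical. pROC's `roc` documents a `direction` argument; whether fixing it changes the behavior reported here is an assumption, not something the question confirms.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]          # hypothetical scores
reversed_scores = [-s for s in scores]  # same data, opposite direction

auc_forward = roc_auc_score(y_true, scores)
auc_reversed = roc_auc_score(y_true, reversed_scores)

# The two values are complementary: they sum to 1.
print(auc_forward, auc_reversed, max(auc_forward, auc_reversed))
```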
classify_BV_100t <- randomForest(bv.disease~., data=RF_input_BV_clean, ntree = 100, localImp = TRUE)
print(classify_BV_100t)
Call:
randomForest(formula = bv.disease ~ ., data = RF_input_BV_clean, ntree = 100, localImp = TRUE)
Type of random forest: classification
Number of trees: 100
No. of variables tried at each split: 53
OOB estimate of error rate: 8.04%
Confusion matrix:
0 1 class.error
0 510 7 0.01353965
1 39 16 0.70909091
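My confusion matrix shows that the model is good at classifying 0 (no disease) but very poor at classifying 1 (disease).
But when I plot the ROC curve, it gives the impression that the model is quite good.
Here are the two different ways I plotted the ROC: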
library(pROC)
rf.roc<-roc(RF_input_BV_clean$bv.disease, classify_BV_100t$votes[,2])
plot(rf.roc)
auc(rf.roc)
(How to compute ROC and AUC under ROC after training using caret in R?)
library(ROCR) …