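I am trying to get the ROC curve for my best caret model on the test set. I came across MLeval, which looks like a handy package (the output is very thorough, and a few lines of code give all the metrics and plots I need). There is a good example here: https://stackoverflow.com/a/59134729/12875646
I am trying the code below and can get the metrics/plots I need for the training set, but I keep getting an error when I try to do the same on the test set.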
library(caret)
library(MLeval)
data(GermanCredit)
Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]
ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE, savePredictions = TRUE, summaryFunction = twoClassSummary) # twoClassSummary is needed for metric = "ROC" to be computed
mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own +
CreditHistory.Critical, data=training, method="glm", family="binomial",
trControl = ctrl, tuneLength = 5, metric = "ROC")
pred <- predict(mod_fit, newdata=testing)
confusionMatrix(data=pred, testing$Class)
test = evalm(mod_fit) # ROC curve from the cross-validated predictions on the training set
test1 <- evalm(pred) # my attempt to get the ROC curve for the test set (I understand this should be the final curve to report), but it keeps failing with the error below
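Error in evalm(pred) : Please provide a data frame or Caret train object.
According to the package documentation, the first argument can be a data frame containing the probabilities and the observed labels. Do you know how to prepare this data frame using caret? https://www.rdocumentation.org/packages/MLeval/versions/0.1/topics/evalm
Thank you
Update:
This should be the correct script. It runs fine, except for displaying multiple ROC curves on one plot: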
library(caret)
library(MLeval)
data(GermanCredit)
Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]
ctrl <- trainControl(method = "repeatedcv", number = 10, classProbs = TRUE, savePredictions = TRUE, summaryFunction = twoClassSummary) # twoClassSummary is needed for metric = "ROC" to be computed
mod_fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own +
CreditHistory.Critical, data=training, method="glm", family="binomial",
trControl = ctrl, tuneLength = 5, metric = "ROC")
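pred <- predict(mod_fit, newdata=testing, type="prob") # class probabilities for the test set
confusionMatrix(data=predict(mod_fit, newdata=testing), testing$Class) # the confusion matrix needs class labels, not probabilities
test = evalm(mod_fit) # ROC curve from the cross-validated predictions on the training set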
m1 = data.frame(pred, testing$Class)
test1 <- evalm(m1)
#Train and eval a second model:
mod_fit2 <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own,
data=training, method="glm", family="binomial",
trControl = ctrl, tuneLength = 5, metric = "ROC")
pred2 <- predict(mod_fit2, newdata=testing, type="prob")
m2 = data.frame(pred2, testing$Class)
test2 <- evalm(m2)
# Display ROCs for both models in one graph:
compare <- evalm(list(m1, m1), gnames=c('logistic1','logistic2'))
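I took the last step of the code from this source: https://www.r-bloggers.com/how-to-easily-make-a-roc-curve-in-r/
However, it only displays one ROC curve (it works well if I pass the caret train outputs instead).
You can use the following code: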
library(MLeval)
pred <- predict(mod_fit, newdata=testing, type="prob")
test1 <- evalm(data.frame(pred, testing$Class))
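For context on why evalm(pred) failed in the question: predict() on a caret train object returns a factor of class labels by default, whereas type="prob" returns a data frame with one probability column per class, which is the kind of input the error message asks for. A minimal sketch of the difference, assuming the same GermanCredit setup as in the question (the seed, the shortened formula, and the names fit, pred_class, pred_prob and test_df are illustrative only, and method = "none" is used just to skip resampling and keep the example quick):

```r
library(caret)

data(GermanCredit)
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
idx      <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
training <- GermanCredit[idx, ]
testing  <- GermanCredit[-idx, ]

# Fit a single logistic regression without resampling (illustrative shortcut)
fit <- train(Class ~ Age + ForeignWorker + Housing.Own,
             data = training, method = "glm", family = "binomial",
             trControl = trainControl(method = "none", classProbs = TRUE))

pred_class <- predict(fit, newdata = testing)                 # factor of predicted labels
pred_prob  <- predict(fit, newdata = testing, type = "prob")  # data frame, one column per class

class(pred_class)
class(pred_prob)
names(pred_prob)

# Probabilities plus observed labels, built the same way as in the answer
test_df <- data.frame(pred_prob, testing$Class)
str(test_df)
```

A frame like test_df is what gets passed to evalm() in the snippet above, while pred_class is what confusionMatrix() expects.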
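On the single-curve problem in the update: the comparison call passes m1 twice (list(m1, m1)), so the two curves would coincide even if both were drawn; list(m1, m2) is presumably what was intended. Whether evalm overlays a list of plain data frames the same way it does a list of train objects is not something verified here. As a cross-check, the sketch below overlays the two test-set curves with pROC instead, which is a swapped-in package and not part of the original script. The seed, the 5-fold setup, the colours, and the names fit1, fit2, p1, p2, roc1 and roc2 are assumptions made for illustration.

```r
library(caret)
library(pROC)

data(GermanCredit)
set.seed(123)  # arbitrary seed, only for reproducibility of this sketch
idx      <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
training <- GermanCredit[idx, ]
testing  <- GermanCredit[-idx, ]

ctrl <- trainControl(method = "cv", number = 5, classProbs = TRUE)

# Two logistic models that differ in one predictor, as in the question
fit1 <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own +
                CreditHistory.Critical,
              data = training, method = "glm", family = "binomial", trControl = ctrl)
fit2 <- train(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own,
              data = training, method = "glm", family = "binomial", trControl = ctrl)

# Test-set class probabilities (columns Bad and Good)
p1 <- predict(fit1, newdata = testing, type = "prob")
p2 <- predict(fit2, newdata = testing, type = "prob")

# ROC curves with "Good" treated as the positive class
roc1 <- roc(response = testing$Class, predictor = p1$Good,
            levels = c("Bad", "Good"), direction = "<")
roc2 <- roc(response = testing$Class, predictor = p2$Good,
            levels = c("Bad", "Good"), direction = "<")

# Draw the first curve, then add the second to the same plot
plot(roc1, col = "blue")
plot(roc2, col = "red", add = TRUE)
legend("bottomright", legend = c("logistic1", "logistic2"),
       col = c("blue", "red"), lwd = 2)

auc(roc1)
auc(roc2)
```

If the two curves are visibly distinct here but evalm still shows one, that points at how the list of data frames is being handled rather than at the models themselves.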