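I want to compare a standard neural network approach with an extreme learning machine classifier (based on the ROC metric), using the methods "nnet" and "elm" in the R package caret. For nnet everything works, but with method = "elm" I get the following error: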
Error in evalSummaryFunction(y, wts = weights, ctrl = trControl, lev = classLevels, :
train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning messages:
1: In train.default(x, y, weights = w, ...) :
At least one of the class levels are not valid R variables names; This may cause errors if class probabilities are generated because the variables names will be converted to: X1, X2
2: In train.default(x, y, weights = w, ...) :
Class probabilities were requested for a model that does not implement them
I originally hit the first error with method = "nnet" as well, but there I could fix it by converting score to a factor variable. So that is not the problem here.
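For reference, the factor conversion and the level-name warning above can be sketched in base R as follows. This is only an illustration: the vector `score` and the labels `class1`/`class2` are made-up stand-ins, not taken from the German credit data.

```r
# Hypothetical outcome column coded as 1/2, as in the question.
score <- c(1, 2, 2, 1)

# Plain conversion: the levels are "1" and "2", which are not valid R names.
levels(factor(score))

# What the warning says the levels will be converted to.
make.names(levels(factor(score)))

# Assumption: giving the levels syntactically valid labels avoids the renaming.
score_f <- factor(score, levels = c(1, 2), labels = c("class1", "class2"))
levels(score_f)
```

Whether renaming the levels is needed depends on the model; it only addresses the first warning, not the missing class probabilities.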
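I am still fairly new to R, so the mistake may be a small one, but right now I am stuck... Since elmNN seems to be a relatively new implementation, I also could not find any information online on how to use elm in caret.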
library(caret)

gc <- read.table("germanCreditNum.txt")
colnames(gc)[25] <- "score"
gc_inTrain <- createDataPartition(y = gc$score,
                                  ## the outcome data are needed
                                  p = .8,
                                  ## The percentage of data in the
                                  ## training set
                                  list = FALSE)
str(gc_inTrain)
gc_training <- gc[ gc_inTrain,]
gc_testing <- gc[-gc_inTrain,]
nrow(gc_training) ## No of rows
nrow(gc_testing)
gc_training$score <- as.factor(gc_training$score)
gc_ctrl <- trainControl(method = "boot",
                        repeats = 1,
                        classProbs = TRUE,
                        summaryFunction = twoClassSummary)
neuralnetFit <- train(score ~ .,
                      data = gc_training,
                      method = "nnet",
                      trControl = gc_ctrl,
                      metric = "ROC",
                      preProc = c("center", "scale"))
neuralnetFit
plot(neuralnetFit)
nnClasses <- predict(neuralnetFit, newdata = gc_testing)
str(nnClasses)
## start with ELM for German Credit
gc_ctrl2 <- trainControl(classProbs = TRUE, summaryFunction = twoClassSummary)
elmFit <- train(score ~ .,
                data = gc_training,
                method = "elm",
                trControl = gc_ctrl2,
                metric = "ROC",
                preProc = c("center", "scale"))
elmFit
plot(elmFit)
elmClasses <- predict(elmFit, newdata = gc_testing)
str(elmClasses)
elmProbs <- predict(elmFit, newdata = gc_testing, type = "prob")
head(elmProbs)
I don't remember why a probability model was not included for ELM (I probably had a good reason). You can use a custom method to get softmax values:
library(caret)
set.seed(1)
dat <- twoClassSim(100)
elm_fun <- getModelInfo("elm")[[1]]
elm_fun$prob <- function (modelFit, newdata, submodels = NULL) {
  out <- exp(predict(modelFit, newdata))
  t(apply(out, 1, function(x) x/sum(x)))
}
mod <- train(Class ~ ., data = dat,
             method = elm_fun,
             metric = "ROC",
             trControl = trainControl(classProbs = TRUE,
                                      summaryFunction = twoClassSummary))
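
To see what the custom prob function computes, here is a minimal base-R sketch of the same row-wise softmax applied to a small made-up matrix. The name `softmax_rows` and the values in `scores` are hypothetical; the matrix only stands in for the raw output of predict(modelFit, newdata).

```r
# Row-wise softmax: exponentiate each score, then divide by the row total.
softmax_rows <- function(scores) {
  out <- exp(scores)
  t(apply(out, 1, function(x) x / sum(x)))
}

# Hypothetical raw outputs for three samples and two classes.
scores <- matrix(c(0.9, 0.1,
                   0.2, 0.8,
                   0.5, 0.5),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("Class1", "Class2")))

probs <- softmax_rows(scores)
print(probs)
rowSums(probs)  # each row sums to 1
```

The column names carry through the transformation, so each probability column stays associated with its class. Note that these are softmax-normalized scores, which are not necessarily well-calibrated probabilities.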