R caret train: Error in evalSummaryFunction: cannot compute class probabilities for regression

tuc*_*son 4 r r-caret

> cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
+                         summaryFunction = twoClassSummary,
+                         classProbs = TRUE)
> 
> set.seed(35)
> glm.tune.1 <- train(y ~ bool_3,
+                     data = train.batch,
+                     method = "glm",
+                     metric = "ROC",
+                     trControl = cv.ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression


 > str(train.batch)
'data.frame':   128046 obs. of  42 variables:
 $ offer               : int  1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 ...
 $ avgPrice            : num  2.68 2.68 2.68 2.68 2.68 ...
 ...
 $ bool_3              : int  0 0 0 0 0 0 0 1 0 0 ...
 $ y                   : num  0 1 0 0 0 1 1 1 1 0 ...

Since classProbs is set to TRUE in cv.ctrl, I don't understand why this error message appears.

Can anyone advise?

tuc*_*son 6

Apparently the error occurs because my y is not a factor.

The following code works fine:

library(caret)
library(mlbench)
data(Sonar)

ctrl <- trainControl(method = "cv", 
                     summaryFunction = twoClassSummary, 
                     classProbs = TRUE)
set.seed(1)
gbmTune <- train(Class ~ ., data = Sonar,
                 method = "gbm",
                 metric = "ROC",
                 verbose = FALSE,                    
                 trControl = ctrl)

Then convert the outcome to numeric:

Sonar$Class = as.numeric(Sonar$Class)

and the same code throws the error:

> gbmTune <- train(Class ~ ., data = Sonar,
+                  method = "gbm",
+                  metric = "ROC",
+                  verbose = FALSE,                    
+                  trControl = ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression
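The change in behaviour comes down to the type of the outcome column. A small base R illustration, using a made-up four-element factor (`cls` and `num` are hypothetical names, not the real Sonar data):

```r
# A made-up factor standing in for Sonar$Class
cls <- factor(c("M", "R", "R", "M"))
is.factor(cls)   # TRUE: an outcome like this is handled as classification

# as.numeric() drops the factor structure and returns the integer codes
num <- as.numeric(cls)
is.factor(num)   # FALSE: an outcome like this is handled as regression
num              # 1 2 2 1
```

Checking `is.factor()` or `str()` on the outcome before calling `train` shows which kind of outcome is being passed in.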

However, the caret `train` documentation says:

y   a numeric or factor vector containing the outcome for each sample.

  • `train` works for regression (when `y` is numeric) and for classification (when it is a factor). (5 upvotes)
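For the original question, the fix would be to convert the numeric 0/1 outcome to a factor before calling `train`. A minimal sketch, assuming a toy data frame `toy` in place of `train.batch` (the data and the level labels "no"/"yes" are made up for illustration). The labels are words rather than "0"/"1" because `classProbs = TRUE` needs level names that are valid R variable names. The model fit is guarded so the snippet also runs where caret or pROC is not installed:

```r
# Toy stand-in for train.batch (hypothetical data, not the asker's)
set.seed(35)
toy <- data.frame(
  bool_3 = rbinom(200, 1, 0.3),
  y      = rbinom(200, 1, 0.5)   # numeric 0/1, as in the question
)

# Convert the numeric outcome to a factor with valid level names
toy$y <- factor(toy$y, levels = c(0, 1), labels = c("no", "yes"))
str(toy$y)

# With a factor outcome, train() treats the problem as classification
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("pROC", quietly = TRUE)) {
  library(caret)
  cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
                          summaryFunction = twoClassSummary,
                          classProbs = TRUE)
  glm.tune.1 <- train(y ~ bool_3, data = toy,
                      method = "glm", metric = "ROC",
                      trControl = cv.ctrl)
  print(glm.tune.1)
}
```

Note that `twoClassSummary` treats the first factor level as the event of interest, so the order given in `levels` affects how sensitivity and specificity are reported.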