I'm new to R and I'm having trouble with the predict command. I get this error:
Error in `[.data.frame`(newdata, , as.character(object$formula[[2]])) :
undefined columns selected
when I run this command:
model.predict <- predict.boosting(model,newdata=test)
This is my model:
model <- boosting(Y~x1+x2+x3+x4+x5+x6+x7, data=train)
This is the structure of my test data, from str(test):
'data.frame': 343 obs. of 7 variables:
$ x1: Factor w/ 4 levels "Americas","Asia_Pac",..: 4 2 4 2 4 3 3 3 4 1 ...
$ x2: Factor w/ 5 levels "Fifth","First",..: 3 3 2 2 4 2 4 4 1 1 ...
$ x3: Factor w/ 3 levels "Best","Better",..: 2 3 1 1 3 2 2 1 3 3 ...
$ x4: Factor w/ 2 levels "Female","Male": 1 1 2 1 1 2 1 2 2 2 ...
$ x5: int 82 55 47 31 6 53 77 68 76 86 ...
$ x6: num 22.8 14.6 25.5 38.3 7.9 32.8 4.6 34.2 36.7 21.7 ...
$ x7: num 0.679 0.925 0.897 0.684 0.195 ...
And this is the structure of my training data:
$ RecordID: int 1 2 3 4 5 6 7 8 9 10 ...
$ x1 : Factor w/ 4 levels "Americas","Asia_Pac",..: 1 2 2 3 1 1 1 2 2 4 ...
$ x2 : Factor w/ 5 levels "Fifth","First",..: 5 5 3 2 5 5 5 4 3 2 ...
$ x3 : Factor w/ 3 levels "Best","Better",..: 2 3 2 2 3 1 2 3 1 1 ...
$ x4 : Factor w/ 2 levels "Female","Male": 1 2 2 2 1 1 2 2 1 1 ...
$ x5 : int 1 67 75 51 84 33 21 80 48 5 ...
$ x6 : num 21 13.8 30.3 11.9 1.7 13.2 33.9 17 3.4 19.5 ...
$ x7 : num 0.35 0.85 0.73 0.39 0.47 0.13 0.2 0.12 0.64 0.11 ...
$ Y : Factor w/ 2 levels "Green","Yellow": 2 2 1 2 2 2 1 2 2 2 ..
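I think something is wrong with the structure of the test data, but I can't find it, or else I have misunderstood how the predict command is meant to be used. Note that if I run the predict command on the training data, it works. Any suggestions on where to look?
Thanks!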
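predict.boosting() expects newdata to contain the actual labels for the test data, so that it can calculate how well the model performed (the confusion matrix shown further down). Your test data has no Y column, which is why the lookup of the response column in the error message fails, and why the same call works on the training data. The following reproduces the error: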
library(adabag)
data(iris)
iris.adaboost <- boosting(Species~Sepal.Length+Sepal.Width+Petal.Length+
Petal.Width, data=iris, boos=TRUE, mfinal=10)
# make a 'test' dataframe without the classes, as in the question
iris2 <- iris
iris2$Species <- NULL
# replicates the error
irispred=predict.boosting(iris.adaboost, newdata=iris2)
#Error in `[.data.frame`(newdata, , as.character(object$formula[[2]])) :
# undefined columns selected
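One quick way to check for this kind of mismatch is to compare the variables named in the model formula with the columns of the new data. The sketch below uses only base R. The toy `test_df` data frame and the names `model_formula`, `needed`, and `missing_cols` are hypothetical stand-ins for the question's data, not part of the original post.

```r
# Formula from the question; the left-hand side (Y) is the response.
model_formula <- Y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7

# Hypothetical stand-in for the question's test set: predictors only, no Y.
test_df <- data.frame(
  x1 = factor(c("Americas", "Asia_Pac")),
  x2 = factor(c("First", "Fifth")),
  x3 = factor(c("Best", "Better")),
  x4 = factor(c("Female", "Male")),
  x5 = c(82L, 55L),
  x6 = c(22.8, 14.6),
  x7 = c(0.679, 0.925)
)

# all.vars() lists every variable in the formula, response included.
needed <- all.vars(model_formula)

# Any name printed here is used by the formula but absent from the new data.
missing_cols <- setdiff(needed, names(test_df))
print(missing_cols)
```

Here the only missing name is `Y`, which matches the column that the error message fails to select.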
Here is a working example, taken mostly from the help file, which also demonstrates the confusion matrix.
# first create subsets of iris data for training and testing
sub <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
iris3 <- iris[sub,]
iris4 <- iris[-sub,]
iris.adaboost <- boosting(Species ~ ., data=iris3, mfinal=10)
# works
iris.predboosting<- predict.boosting(iris.adaboost, newdata=iris4)
iris.predboosting$confusion
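# Rows are predicted classes, columns are observed classes.
# Counts depend on the random split; iris4 holds 25 rows per species,
# so if every test row were classified correctly the table would read:
#                Observed Class
#Predicted Class setosa versicolor virginica
#     setosa         25          0         0
#     versicolor      0         25         0
#     virginica       0          0        25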
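If the test set genuinely has no labels, one possible workaround is to add a placeholder response column that reuses the factor levels of the training response, so that the column lookup succeeds. This is a sketch under assumptions rather than documented adabag behavior: the names `train_df`, `test_df`, `fit`, and `pred` are hypothetical, the iris split is only for illustration, and newer adabag releases may treat unlabeled newdata differently, so the help page for your installed version is worth checking. The model is fitted only if adabag is installed.

```r
# Hypothetical split of iris: odd rows for training, even rows for testing.
train_df <- iris[seq(1, 150, by = 2), ]

# Drop the response from the test rows to mimic an unlabeled test set.
test_df <- iris[seq(2, 150, by = 2), names(iris) != "Species"]

# Add a dummy response so the column named on the formula's left side exists.
# It reuses the training levels; the values themselves carry no information.
test_df$Species <- factor(
  rep(levels(train_df$Species)[1], nrow(test_df)),
  levels = levels(train_df$Species)
)

str(test_df)

# Only attempt the fit when adabag is available.
if (requireNamespace("adabag", quietly = TRUE)) {
  fit  <- adabag::boosting(Species ~ ., data = train_df, mfinal = 10)
  pred <- adabag::predict.boosting(fit, newdata = test_df)
  # Use the predicted labels; ignore pred$confusion and pred$error here,
  # because they are computed against the dummy Species column.
  print(head(pred$class))
}
```

With a placeholder column, the confusion matrix and error rate in the result are meaningless. Only the predicted classes should be used.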