mtry tuning with caret returns strange values

Question


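I tuned the mtry parameter of the randomForest function using the train function from the caret package. My data X has only 48 columns, but train returns mtry=50 as the best value, which is not a valid value (>48). What is the explanation?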

> dim(X)
[1] 93 48
> fit <- train(level~., data=data.frame(X,level), tuneLength=13) 
> fit$finalModel

Call:
 randomForest(x = x, y = y, mtry = param$mtry) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 50

        OOB estimate of  error rate: 2.15%
Confusion matrix:
     high low class.error
high   81   1  0.01219512
low     1  10  0.09090909


It is even worse if I do not set the tuneLength parameter:

> fit <- train(level~., data=data.frame(X,level)) 
> fit$finalModel 

Call:
 randomForest(x = x, y = y, mtry = param$mtry) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 55

        OOB estimate of  error rate: 2.15%
Confusion matrix:
     high low class.error
high   81   1  0.01219512
low     1  10  0.09090909


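I am not providing the data because it is confidential. There is nothing special about these data, though: every column is either numeric or a factor, and there are no missing values.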

Answer 1

top*_*epo 6

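The apparent discrepancy is most likely[1] between the number of columns in your data set and the number of predictors, which may not be the same if any of the columns are factors. You used the formula method, which expands factors into dummy variables. For example: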

> head(model.matrix(Sepal.Width ~ ., data = iris))
  (Intercept) Sepal.Length Petal.Length Petal.Width Speciesversicolor Speciesvirginica
1           1          5.1          1.4         0.2                 0                0
2           1          4.9          1.4         0.2                 0                0
3           1          4.7          1.3         0.2                 0                0
4           1          4.6          1.5         0.2                 0                0
5           1          5.0          1.4         0.2                 0                0
6           1          5.4          1.7         0.4                 0                0


So iris has 4 predictor columns, but you end up with 5 (non-intercept) predictors.
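To check this kind of mismatch on your own data, a minimal sketch along these lines may help. It uses iris as a stand-in for the confidential data and assumes that the formula interface of train expands factors the same way model.matrix does. The variable names (predictors, n_columns, n_expanded, factor_levels) are illustrative only.

```r
# Count predictor columns before and after dummy-variable expansion.
# iris stands in for the confidential data; Sepal.Width is the outcome.
predictors <- iris[, setdiff(names(iris), "Sepal.Width")]
n_columns <- ncol(predictors)

# model.matrix() performs the dummy-variable expansion shown above
mm <- model.matrix(Sepal.Width ~ ., data = iris)
n_expanded <- ncol(mm) - 1  # drop the "(Intercept)" column

# A factor with k levels contributes k - 1 dummy columns
factor_levels <- sapply(Filter(is.factor, predictors), nlevels)

cat("predictor columns:  ", n_columns, "\n")
cat("expanded predictors:", n_expanded, "\n")
print(factor_levels)
```

If the expanded count exceeds the column count, mtry values above the number of columns are to be expected with the formula method. If each factor should instead count as a single predictor, one option (to my knowledge) is the non-formula interface, e.g. train(x = X, y = level), which passes the columns through without creating dummy variables.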

Max

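[1] This is why you need to provide a reproducible example. Often, when I am getting ready to ask a question, the answer becomes apparent while I take the time to write up a good description of the problem.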
