Ant*_*ine 18 r classification machine-learning data-mining gbm
我在R(gbm包)中使用gbm函数来拟合用于多类分类的随机梯度增强模型.我只是试图分别为每个班级获得每个预测因子的重要性,就像哈斯蒂(Hastie)书(统计学习要素)(第382页)中的这张图片一样.

但是,该函数summary.gbm仅返回预测变量的总体重要性(它们对所有类的平均重要性).
有谁知道如何获得相对重要性值?
Hopefully this function helps you. For the example I used data from the ElemStatLearn package. The function figures out what the classes for a column are, splits the data into these classes, runs the gbm() function on each class and plots the bar plots for these models.
# install.packages("ElemStatLearn"); install.packages("gbm")
library(ElemStatLearn)
library(gbm)
set.seed(137531)
# formula: the formula to pass to gbm()
# data: the data set to use
# column: the class column to use
classPlots <- function (formula, data, column) {
class_column <- as.character(data[,column])
class_values <- names(table(class_column))
class_indexes <- sapply(class_values, function(x) which(class_column == x))
split_data <- lapply(class_indexes, function(x) marketing[x,])
object <- lapply(split_data, function(x) gbm(formula, data = x))
rel.inf <- lapply(object, function(x) summary.gbm(x, plotit=FALSE))
nobjs <- length(class_values)
for( i in 1:nobjs ) {
tmp <- rel.inf[[i]]
tmp.names <- row.names(tmp)
tmp <- tmp$rel.inf
names(tmp) <- tmp.names
barplot(tmp, horiz=TRUE, col='red',
xlab="Relative importance", main=paste0("Class = ", class_values[i]))
}
rel.inf
}
par(mfrow=c(1,2))
classPlots(Income ~ Marital + Age, data = marketing, column = 2)
Run Code Online (Sandbox Code Playgroud)
`
| 归档时间: |
|
| 查看次数: |
10384 次 |
| 最近记录: |