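I am using the party package in R.
I would like to get various statistics (mean, median, etc.) from the individual nodes of the resulting tree, but I can't see how to do this. For example: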
library(party)

# Drop rows where the response (Ozone) is missing
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))
airct
plot(airct)
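This results in a tree with 4 terminal nodes. How do I get the mean air quality for each node?
I can't work out which variable in the nodes holds the air quality. However, here is how you can customize the tree plot: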
innerWeights <- function(node) {
  # Draw a white circle for each inner node
  grid.circle(gp = gpar(fill = "White", col = 1))
  # Label it with the split variable and the node's prediction
  mainlab <- node$psplit$variableName
  label <- paste(mainlab,
                 paste("prediction=", round(node$prediction, 2), sep = ""),
                 sep = "\n")
  grid.text(label = label, gp = gpar(col = "red"))
}
plot(airct, inner_panel = innerWeights)

Edit: to get statistics by node
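library(gridExtra)
library(plyr)  # provides round_any()
innerWeights <- function(node) {
  # Test statistics of the candidate split variables at this node
  dat <- round_any(node$criterion$statistic, 0.01)
  grid.table(t(dat))
}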
plot(airct, inner_panel = innerWeights)
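The plots above show each node's prediction and test statistics, but they do not directly give the per-node mean or median the question asks about. One way this could be done, as a minimal sketch, is to use party's `where()` to look up the terminal node each observation falls into, and then summarize the response within each node with `tapply()`. The sketch assumes that "air quality" means the response variable `Ozone`. The names `node_id` and `node_stats` are introduced here for illustration only.

```r
library(party)

airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
               controls = ctree_control(maxsurrogate = 3))

# For each row of airq, the ID of the terminal node it ends up in
node_id <- where(airct)

# Summarize the response (assumed here to be Ozone) within each terminal node
node_mean   <- tapply(airq$Ozone, node_id, mean)
node_median <- tapply(airq$Ozone, node_id, median)
node_n      <- tapply(airq$Ozone, node_id, length)

# Collect the results in one table, one row per terminal node
node_stats <- data.frame(
  node   = names(node_mean),
  n      = as.vector(node_n),
  mean   = as.vector(node_mean),
  median = as.vector(node_median)
)
print(node_stats)
```

Any other summary function (for example `sd` or `quantile`) can be passed to `tapply()` in the same way, since `node_id` is just a grouping vector aligned with the rows of `airq`.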
