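I'm quite new to R and I'm stuck on what is probably a very silly problem.
I'm calibrating a regression tree with the rpart package in order to do some classification and some prediction.
Thanks to R, the calibration part is easy to do and easy to control.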
# The rpart package is needed
library(rpart)
# Load the big data file used for calibration
my_data <- read.csv("my_file.csv", sep=",", header=TRUE)
# Regression tree calibration
tree <- rpart(Ratio ~ Attribute1 + Attribute2 + Attribute3 +
                Attribute4 + Attribute5,
              method="anova", data=my_data,
              control=rpart.control(minsplit=100, cp=0.0001))
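After calibrating a large decision tree, I would like, for a given data sample, to find the corresponding cluster (the leaf of the tree) for some new data, together with the predicted value.
The predict function seems to be exactly what is needed.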
# Read the validation data
validationData <- read.csv("my_sample.csv", sep=",", header=TRUE)
# Look up the predicted value in the tree for each new row
# (the argument is type=, not class=; an anova tree supports type="vector", not "prob")
predictions <- predict(tree, newdata=validationData, type="vector")
# Dump them to a file
write.table(predictions, file="dump.txt")
…
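One possible way to get the leaf as well as the predicted value is sketched below. It is only an illustration under stated assumptions, not a confirmed rpart feature: it relies on the fact that `predict()` with `type="vector"` returns the `yval` column of the tree's `frame` for the node each row lands in, so a copy of the tree whose `yval` holds the node numbers returns leaf numbers instead. The built-in `mtcars` data and the names `fit`, `new_rows` and `node_tree` are stand-ins chosen for the example; they are not from the original post.

```r
library(rpart)

# Fit a small anova tree on a built-in dataset (stand-in for my_data)
fit <- rpart(mpg ~ wt + hp + disp, data = mtcars, method = "anova",
             control = rpart.control(minsplit = 10, cp = 0.01))

# Stand-in for validationData: any data frame with the same predictor columns
new_rows <- mtcars[1:5, ]

# Copy the tree and overwrite each node's fitted value with its node number.
# The row names of fit$frame are the node numbers shown by print(fit).
node_tree <- fit
node_tree$frame$yval <- as.numeric(rownames(node_tree$frame))

# predict() on the modified copy returns the leaf number each row falls into
leaf_id <- predict(node_tree, newdata = new_rows, type = "vector")

# The ordinary predicted value still comes from the untouched tree
predicted <- predict(fit, newdata = new_rows, type = "vector")

# One row per new observation: its leaf and its predicted value
result <- data.frame(leaf = leaf_id, predicted = predicted)
```

With the tree from the question, the same idea would apply to `tree` and `validationData`, and `result` could be passed to `write.table` in place of the plain predictions. The original tree is left unmodified, so the regular predictions are unaffected.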