Mah*_*lid 3 r classification machine-learning r-caret
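I want to split my training data into 70% training, 15% test, and 15% validation. I am using the createDataPartition() function from the caret package. So far I have split it as follows: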
library(caret)

train <- read.csv("Train.csv")
test <- read.csv("Test.csv")
split <- 0.70
# Row indices for a 70% sample, stratified on age
trainIndex <- createDataPartition(train$age, p = split, list = FALSE)
data_train <- train[trainIndex, ]
data_test <- train[-trainIndex, ]
Is there a way to use createDataPartition() to split the data into training, test, and validation sets, as the following H2O approach does?
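library(h2o)
h2o.init()

data.hex <- h2o.importFile("Train.csv")
# Ratios 0.7 and 0.15 give three frames; the last one receives the remaining 15%
splits <- h2o.splitFrame(data.hex, ratios = c(0.7, 0.15), destination_frames = c("train", "valid", "test"))
train.hex <- splits[[1]]
valid.hex <- splits[[2]]
test.hex <- splits[[3]]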
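One way is to use the sample() function in base R. Each row is assigned to a group independently, so the 70/15/15 proportions are only approximate: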
splitSample <- sample(1:3, size=nrow(data.hex), prob=c(0.7,0.15,0.15), replace = TRUE)
train.hex <- data.hex[splitSample==1,]
valid.hex <- data.hex[splitSample==2,]
test.hex <- data.hex[splitSample==3,]
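To stay with createDataPartition(), one possible approach is to call it twice: first to take 70% for training, then to split the remaining 30% in half. The following is only a sketch under assumptions: the data frame is a synthetic stand-in for Train.csv (only the age column comes from the question), and the names rest, validIndex, and data_valid are introduced here for illustration.

```r
library(caret)

set.seed(42)  # make the random split reproducible

# Hypothetical stand-in for Train.csv: 1000 rows with a numeric `age` column
train <- data.frame(age = round(runif(1000, 18, 80)), y = rnorm(1000))

# Stage 1: take roughly 70% of the rows for training
trainIndex <- createDataPartition(train$age, p = 0.70, list = FALSE)
data_train <- train[trainIndex, ]
rest <- train[-trainIndex, ]

# Stage 2: split the remaining ~30% in half: ~15% validation, ~15% test
validIndex <- createDataPartition(rest$age, p = 0.50, list = FALSE)
data_valid <- rest[validIndex, ]
data_test <- rest[-validIndex, ]

# Share of the original rows that ended up in each subset
sizes <- c(train = nrow(data_train), valid = nrow(data_valid), test = nrow(data_test))
print(round(sizes / nrow(train), 2))
```

Because createDataPartition() samples within groups of the outcome, both stages keep the distribution of age similar across the three subsets, which the plain sample() approach does not attempt. The resulting sizes are close to, but not necessarily exactly, 70/15/15.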