I'm trying to save all the models from an h2o.automl run, using the h2o package. Currently I can save only a single model, with h2o.saveModel(aml@leader, path = "/home/data/user").
How can I save all the models?
Here is my attempt on a sample dataset:
library(h2o)
h2o.init()
prostate.hex <- h2o.importFile(
  path = paste("https://raw.github.com",
               "h2oai/h2o-2/master/smalldata/logreg/prostate.csv", sep = "/"),
  destination_frame = "prostate.hex")
Get the data from GitHub, or import it via readr:
library(readr)
prostate <- read_csv("/home/data/user/prostate.csv")
prostate.hex <- as.h2o(prostate, "prostate.hex")
aml <- h2o.automl(y = …
I'm trying to group the values of one variable within another and sort them in descending order.
mydf
region airport value
MIA    FLL     0.244587909
MIA    PBI     0.824144687
MIA    MIA     0.484907626
NYC    EWR     0.731075565
NYC    LGA     0.708648915
NYC    HPN     0.523991258
LAX    LGB     0.651847818
LAX    LAX     0.423607479
LAX    SNA     0.433837044
LAX    ONT     0.723144957
Other  MCO     0.657586674
Other  SJC     0.084138321
Other  OAK     0.698794154
Other  BOS     0.85765002
Other  BNA     0.018953126
Other  WAS     0.234897245
https://i.stack.imgur.com/G1E2k.jpg
I'm trying to replicate the plot above.
Here is my first attempt:
ggplot(mydf, aes(x = airport, y = value, fill = region)) +
  geom_bar(stat = "identity")
Here is my second attempt:
ggplot(mydf, aes(x = reorder(airport, -value, sum), y = value, fill = region)) +
  geom_bar(stat = "identity")
I'm stuck here. Can I nest reorder, as in reorder(reorder(x, y), y)? I'd rather not have to do each grouping manually. …
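One possible way to get descending bars within each region without nesting reorder() is to set the factor levels of airport from a single order() call on region and value. This is only a sketch under assumptions: it rebuilds mydf from the table in the question, sorts regions alphabetically (the linked figure may use a different region order), and guards the plotting step so the ordering logic still runs if ggplot2 is not installed.

```r
# Rebuild the example data from the table in the question.
mydf <- data.frame(
  region  = rep(c("MIA", "NYC", "LAX", "Other"), times = c(3, 3, 4, 6)),
  airport = c("FLL", "PBI", "MIA", "EWR", "LGA", "HPN", "LGB", "LAX",
              "SNA", "ONT", "MCO", "SJC", "OAK", "BOS", "BNA", "WAS"),
  value   = c(0.244587909, 0.824144687, 0.484907626, 0.731075565,
              0.708648915, 0.523991258, 0.651847818, 0.423607479,
              0.433837044, 0.723144957, 0.657586674, 0.084138321,
              0.698794154, 0.85765002, 0.018953126, 0.234897245),
  stringsAsFactors = FALSE
)

# Sort by region first, then by descending value within each region.
ord <- order(mydf$region, -mydf$value)

# Use that row order as the factor levels, so the x axis follows it.
mydf$airport <- factor(mydf$airport, levels = mydf$airport[ord])

# Plot only if ggplot2 is available (assumption: it is installed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mydf, aes(x = airport, y = value, fill = region)) +
    geom_bar(stat = "identity")
  print(p)
}
```

To place the regions in a different order, region could first be converted to a factor with the desired levels; order() then sorts by those levels instead of alphabetically.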
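I'm trying to run t-tests on multiple variables. Say I want to group by am, and then see whether mpg differs significantly by vs.
Here is an old answer that uses summarize_each, but I'm trying to use across from the dplyr package instead.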
library(tidyverse)
library(broom)
mtcars %>%
  group_by(am) %>%
  summarise_each(funs(
    t.test(.[vs == 0], .[vs == 1])$p.value,
    t.test(.[vs == 0], .[vs == 1])$conf.int[1],
    t.test(.[vs == 0], .[vs == 1])$conf.int[2]
  ),
  vars = mpg)
#> Warning: `summarise_each_()` was deprecated in dplyr 0.7.0.
#> Please use `across()` instead.
#> Warning: `funs()` was deprecated in dplyr 0.8.0.
#> Please use a list of either functions or lambdas:
#>
#> # Simple …
I'm trying to summarise this example dataset using multiple functions, n() and mean(). How can I combine the two in the same workflow?
Here is a toy dataset that mirrors my larger data:
library(tidyverse)
df <- structure(list(group_var = c(70, 72, 73, 70, 70, 71, 70, 71,
71, 70), var1_scr = c(50.5, 25.75, 50.5, 50.5, 50.5, 50.5, 75.25,
75.25, 50.5, 75.25), var2_scr = c(50.5, 50.5, NA, 75.25, 50.5,
50.5, 75.25, 75.25, 100, 75.25), var3_scr = c(NA, NA, 75.25,
NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
df
#> # A tibble: 10 x 4
#> group_var var1_scr var2_scr var3_scr
#> …
Following the example in https://www.tidyverse.org/blog/2020/02/glue-strings-and-tidy-eval/
How do you pass variables to this function?
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.6.2
#> Warning: package 'tidyr' was built under R version 3.6.2
mean_by <- function(data, by, var, prefix = "avg") {
  data %>%
    group_by({{ by }}) %>%
    summarise("{prefix}_{{ var }}" := mean({{ var }}, na.rm = TRUE))
}
mean_by(mtcars, by = cyl, var = mpg, prefix = "avg")
#> # A tibble: 3 x 2
#> cyl avg_mpg
#> <dbl> <dbl>
#> 1 4 26.7
#> …
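If the question is about column names held in character variables (one possible reading, since the excerpt is cut off), a sketch along these lines may help. mean_by_chr, group_col and value_col are hypothetical names introduced here for illustration, not part of the blog post, and the code assumes dplyr 1.0.0 or later with an rlang version that supports glue-style names.

```r
library(dplyr)

# Hypothetical variant of mean_by() that takes column names as strings.
mean_by_chr <- function(data, by, var, prefix = "avg") {
  data %>%
    # all_of() selects the grouping column named by the string in `by`.
    group_by(across(all_of(by))) %>%
    # "{prefix}_{var}" glues two strings; .data[[var]] looks the column up by name.
    summarise("{prefix}_{var}" := mean(.data[[var]], na.rm = TRUE))
}

# Column names stored in variables (assumed use case).
group_col <- "cyl"
value_col <- "mpg"

res <- mean_by_chr(mtcars, by = group_col, var = value_col, prefix = "avg")
print(names(res))
```

The single-brace "{var}" is used in the name because var is already a string here; the double-brace "{{ var }}" form in the original mean_by() is for unquoted column names passed as function arguments.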