I'm looking for a simple way to rotate (transpose) a dplyr tibble summary.
Say I do something like this,
# install.packages(c("dplyr"), dependencies = TRUE)
library(dplyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  )
#> # A tibble: 2 x 6
#> am n Mean_disp Mean_hp Mean_qsec Mean_drat
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 0 19 290.3789 160.2632 18.18316 3.286316
#> 2 1 13 143.5308 126.8462 17.36000 4.050000
However, what I would like to get is output more or less like this,
#> # A tibble: 5 x 2
#> am <dbl> 0 1
#> n <int> 19 13
#> Mean_disp <dbl> 290.3789 143.5308
#> Mean_hp <dbl> 160.2631 126.8462
#> Mean_qsec <dbl> 18.183158 17.36000
#> Mean_drat <dbl> 3.286316 4.050000
I realize I could use t(), but that turns the tibble into a matrix and messes up the formatting.
Maybe gather and then spread:
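To make the t() remark concrete, here is a minimal sketch on a shortened version of the summary above. The names summary_tbl and transposed are only illustrative. It relies on base R's t() returning a plain matrix when given a data frame, so the tibble class and per-column types are lost.

```r
library(dplyr)

# Shortened version of the summary from the question (illustrative names)
summary_tbl <- mtcars %>%
  group_by(am) %>%
  summarise(n = n(), Mean_disp = mean(disp))

# t() coerces the tibble to a matrix before transposing
transposed <- t(summary_tbl)

is.matrix(transposed)    # TRUE: the result is no longer a tibble
rownames(transposed)     # the former column names become row names
```

Because every cell of a matrix shares one type, the integer n column is stored as double alongside the means.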
library(dplyr)
library(tidyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)) %>%
  gather(key = key, value = value, -am) %>%
  spread(key = am, value = value)
# # A tibble: 5 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 Mean_disp 290.378947 143.5308
# 2 Mean_drat 3.286316 4.0500
# 3 Mean_hp 160.263158 126.8462
# 4 Mean_qsec 18.183158 17.3600
# 5 n 19.000000 13.0000
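The same reshaping can probably also be written with pivot_longer() and pivot_wider(), which supersede gather() and spread() in newer tidyr releases. This is a sketch that assumes tidyr 1.0.0 or later is installed; the name rotated is only illustrative.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr >= 1.0.0 for pivot_longer()/pivot_wider()

rotated <- mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  ) %>%
  # Stack every summary column except am into key/value pairs
  pivot_longer(cols = -am, names_to = "key", values_to = "value") %>%
  # Turn each am level into its own column
  pivot_wider(names_from = am, values_from = value)

rotated
```

Unlike spread(), pivot_wider() keeps rows in order of first appearance, so n should come first rather than last. As in the gather/spread version, n ends up as a double because it shares a column with the means.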
Another option: gather before group_by, then take the mean of all selected columns, and spread again (though I'm not sure how to add n()):
mtcars %>%
  select(am, disp, hp, qsec, drat) %>%
  gather(key = key, value = value, -am) %>%
  group_by(am, key) %>%
  summarise(myMean = mean(value)) %>%
  spread(key = am, value = myMean)
# # A tibble: 4 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 disp 290.378947 143.5308
# 2 drat 3.286316 4.0500
# 3 hp 160.263158 126.8462
# 4 qsec 18.183158 17.3600
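One possible way to bring n() into this second approach is to compute the group counts separately in the same long key/value shape and stack them onto the means before spreading. This is only a sketch, not necessarily the neatest route; the names means, counts and result are illustrative.

```r
library(dplyr)
library(tidyr)

# Means in long form: one row per (am, key)
means <- mtcars %>%
  select(am, disp, hp, qsec, drat) %>%
  gather(key = key, value = value, -am) %>%
  group_by(am, key) %>%
  summarise(value = mean(value)) %>%
  ungroup()

# Counts in the same shape, stored as double to match the means
counts <- mtcars %>%
  group_by(am) %>%
  summarise(value = as.numeric(n())) %>%
  mutate(key = "n")

# Stack both pieces, then spread am into columns
result <- bind_rows(means, counts) %>%
  spread(key = am, value = value)

result
```

bind_rows() matches columns by name, so the differing column order of the two pieces does not matter.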