I am using summarise_at() to obtain the mean and standard error of multiple variables by group.
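The output has 1 row for each group and 1 column for each calculated quantity of each variable. I would like a table with 1 row for each variable and 1 column for each calculated quantity: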
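library(dplyr)  # provides %>%, group_by(), summarise_at()

data <- mtcars
# first 16 rows are "control", last 16 are "treat"
data$condition <- as.factor(c(rep("control", 16), rep("treat", 16)))
data %>%
  group_by(condition) %>%
  summarise_at(vars(mpg, cyl, wt),
               funs(mean = mean, se = sd(.)/sqrt(n())))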
# A tibble: 2 x 7
  condition mpg_mean cyl_mean wt_mean mpg_se cyl_se wt_se
  <fct>        <dbl>    <dbl>   <dbl>  <dbl>  <dbl> <dbl>
1 control       18.2     6.5     3.56   1.04  0.387 0.204
2 treat         22.0     5.88    2.87   1.77  0.499 0.257
Here is what I think would be more useful (the numbers are not meaningful):
# MEAN.control, MEAN.treat, SE.control, SE.treat
# mpg 1.5 2.4 .30 .45
# cyl 3.2 1.9 .20 .60
# disp 12.3 17.8 .20 .19
Any ideas? I'm new to the tidyverse, so sorry if this is too basic.
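funs is being deprecated in dplyr. Use list in summarise_at/mutate_at instead. After the summarise step, gather the data into "long" format, separate the "key" column into two columns ("key1" and "key2") by splitting at the _ delimiter, then unite "key2" and "condition" into a new "cond" column (naming the functions MEAN and SE in the list gives "key2" the desired case up front). Finally, spread it back to "wide" format and, if needed, move the "key1" column into the row names.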
library(tidyverse)
data %>%
  group_by(condition) %>%
  summarise_at(vars(mpg, cyl, wt), list(MEAN = ~ mean(.),
                                        SE = ~ sd(.)/sqrt(n()))) %>%
  gather(key, val, -condition) %>%
  separate(key, into = c("key1", "key2")) %>%
  unite(cond, key2, condition, sep = ".") %>%
  spread(cond, val) %>%
  column_to_rownames('key1')
# MEAN.control MEAN.treat SE.control SE.treat
#cyl 6.500000 5.875000 0.3872983 0.4989572
#mpg 18.200000 21.981250 1.0369024 1.7720332
#wt 3.560875 2.873625 0.2044885 0.2571034
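For readers on newer package versions: summarise_at(), gather() and spread() are superseded in dplyr 1.0+ and tidyr 1.0+. The same reshaping can be sketched with across(), pivot_longer() and pivot_wider(). This is only a sketch under assumptions, not part of the answer above: the column names "variable" and "stat" and the object name result are made up for illustration, and splitting on "_" only works because none of the summarised variable names contain an underscore.

```r
library(dplyr)
library(tidyr)
library(tibble)

data <- mtcars
# same grouping as in the question: 16 "control" rows, then 16 "treat" rows
data$condition <- factor(rep(c("control", "treat"), each = 16))

result <- data %>%
  group_by(condition) %>%
  # one column per variable/statistic pair, e.g. mpg_MEAN, mpg_SE
  summarise(across(c(mpg, cyl, wt),
                   list(MEAN = ~ mean(.x), SE = ~ sd(.x) / sqrt(n())),
                   .names = "{.col}_{.fn}"),
            .groups = "drop") %>%
  # long format: one row per condition, variable and statistic
  # ("variable" and "stat" are illustrative names)
  pivot_longer(-condition,
               names_to = c("variable", "stat"),
               names_sep = "_") %>%
  # wide format: one column per statistic/condition pair, e.g. MEAN.control
  pivot_wider(names_from = c(stat, condition),
              values_from = value,
              names_sep = ".") %>%
  # optional: move the variable names into the row names
  column_to_rownames("variable")

result
```

The column order may differ from the gather/spread version; if a particular order matters, a final select() before column_to_rownames() can set it explicitly.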