abi*_*hat 1 r dataframe dplyr tidyverse
I have a data frame with a grouping variable, and I want to sum the other columns by group. This is easy with dplyr.
library(dplyr)
library(magrittr)
data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)
data %>% group_by(group) %>%
summarise_all(sum)
# A tibble: 3 x 3
group n1 n2
<fctr> <int> <int>
1 a 3 5
2 b 3 4
3 c 9 11
But now I want a new total column (ttl below) that contains the sum of n1 and n2 for each group. Like this:
# A tibble: 3 x 4
group n1 n2 ttl
<fctr> <int> <int> <int>
1 a 3 5 8
2 b 3 4 7
3 c 9 11 20
How can I do this with dplyr?
Edit: this is actually just an example; I have many variables.
I tried these two snippets, but neither gives a result with the right dimensions...
# Attempt 1
data %>% group_by(group) %>%
  summarise_all(sum) %>%
  summarise_if(is.numeric, sum)

# Attempt 2
data %>% group_by(group) %>%
  summarise_all(sum) %>%
  mutate_if(is.numeric, .funs = sum)
You can use mutate after summarise:
data %>%
group_by(group) %>%
summarise_all(sum) %>%
mutate(tt1 = n1 + n2)
# A tibble: 3 x 4
# group n1 n2 tt1
# <fctr> <int> <int> <int>
#1 a 3 5 8
#2 b 3 4 7
#3 c 9 11 20
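
For comparison, here is a small sketch of why the two attempts in the question come out with the wrong shape, using the same example data. The variable names collapsed, overwritten and added are only illustrative. summarise_if() sums down each numeric column and so collapses the three group rows into one, and mutate_if() with sum overwrites every value with its column total, whereas mutate() with n1 + n2 adds across columns row by row.

```r
library(dplyr)

data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)

# Per-group sums, as in the question
sums <- data %>% group_by(group) %>% summarise_all(sum)

# Attempt 1: sums down each numeric column, leaving a single row
collapsed <- sums %>% summarise_if(is.numeric, sum)
dim(collapsed)

# Attempt 2: keeps three rows, but each value becomes its column total
overwritten <- sums %>% mutate_if(is.numeric, sum)
overwritten

# mutate() with n1 + n2 works row by row, so each group keeps its own total
added <- sums %>% mutate(tt1 = n1 + n2)
added
```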
If you need to sum all numeric columns, you can use rowSums with select_if (to select the numeric columns) to sum across the columns:
data %>%
group_by(group) %>%
summarise_all(sum) %>%
mutate(tt1 = rowSums(select_if(., is.numeric)))
# A tibble: 3 x 4
# group n1 n2 tt1
# <fctr> <int> <int> <dbl>
#1 a 3 5 8
#2 b 3 4 7
#3 c 9 11 20
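
As a hedged alternative, assuming dplyr 1.0.0 or later is installed (where the scoped verbs such as summarise_all and select_if are superseded by across), the same result can be sketched as follows. This is only one possible spelling of the approach above, not a replacement for it.

```r
library(dplyr)

data <- data.frame(group = c("a", "a", "b", "c", "c"), n1 = 1:5, n2 = 2:6)

result <- data %>%
  group_by(group) %>%
  # sum every non-grouping column within each group
  summarise(across(everything(), sum)) %>%
  # add up all numeric columns row by row
  mutate(tt1 = rowSums(across(where(is.numeric))))

result
```

Because where(is.numeric) picks up every numeric column, this scales to many variables without naming them individually.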