Eli*_*Eli 1 group-by r sum countif summarize
我似乎可以为我的问题找到合适的代码。我想根据不同的条件创建组并汇总(总和、计数或长度)其他列。
我试过 group_by 并用不同的条件进行总结,但还没有找到任何有效的方法。
我有一个类似于这样的表:
data <- data.frame(Name= c('Anna', 'Anna', 'Anna', 'Anna', 'Anna',
'Bella', 'Bella', 'Bella', 'Camilla', 'Camilla'),
Date= c('1.1.2021', '1.1.2021', '2.1.2021', '3.1.2021', '3.1.2021',
'1.1.2021', '5.1.2021', '5.1.2021', '7.1.2021', '8.1.2021'),
Item= c('Apple','Pear', 'Zucini','Apple', 'Broccoli',
'Apple','Pear','Apple','Apple', 'Tomato'),
Category= c('Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Vegetable',
'Fruit', 'Fruit', 'Fruit', 'Fruit', 'Vegetable'),
Weight_kg= c(0.2,0.3,0.5,0.4,1.1,
1,0.5,0.8,1.2,0.5)
)
Run Code Online (Sandbox Code Playgroud)
这将是我想要的输出:
desired_table <- data.frame(Name=c('Anna', 'Bella', 'Camilla'),
Shopping_days=c(3,2,2),
days_fruit=c(2,2,1),
days_vegetables=c(2,0,1),
Total_kg=c(2.5,2.3,1.7),
Fruit_kg=c(0.9,2.3,1.2),
Vegetables=c(1.6,0,0.5))
Run Code Online (Sandbox Code Playgroud)
我已经尝试了许多与此类似的代码变体,但显然不起作用:
data1 <- data %>%
group_by(Name) %>%
summarize(Shopping_days = length(unique(Date)),
days_fruit = length(unique(Date, Category='Fruit')),
days_vegetables = length(unique(Date, Category='Vegetables')),
Total_kg = sum(Weight_kg),
Fruit_kg = sum(Weight_kg, if Category=Fruit),
Vegetables_kg = sum(Weight_kg, if Category=Vegetables))
Run Code Online (Sandbox Code Playgroud)
任何帮助将非常感激。
使用group_by和summarise:
library(dplyr)
data %>%
group_by(Name) %>%
summarise(Shopping_days = n_distinct(Date),
days_fruit = n_distinct(Item[Category == 'Fruit']),
days_vegetables = n_distinct(Item[Category == 'Vegetable']),
Total_kg = sum(Weight_kg),
Fruit_kg = sum(Weight_kg[Category == 'Fruit']),
Vegetables_kg = sum(Weight_kg[Category == 'Vegetable']))
# Name Shopping_days days_fruit days_vegetables Total_kg Fruit_kg Vegetables_kg
# <chr> <int> <int> <int> <dbl> <dbl> <dbl>
#1 Anna 3 2 2 2.5 0.9 1.6
#2 Bella 2 2 0 2.3 2.3 0
#3 Camilla 2 1 1 1.7 1.2 0.5
Run Code Online (Sandbox Code Playgroud)