Joh*_*nor 4 datetime group-by r sum date
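I have a dataset of random (sometimes infrequent) events that I would like to total up per week. Because the events are random, they do not occur at regular intervals, so the other examples I have tried so far do not apply.
The data looks something like this: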
df_date <- data.frame( Name = c("Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim",
"Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue"),
Dates = c("2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
"2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28",
"2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
"2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28"),
Event = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) )
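What I want to do is create a new table containing the sum of events for each week of the calendar year.
In this case, that would produce something like this: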
Name Week Events
Jim 1 3
Sue 1 3
Jim 2 0
Sue x ... x
and so on...
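Update for data spanning multiple years, as requested by the OP:
We can also use isoweek() from lubridate instead of week().
Alternatively, we can add the year, as follows: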
library(dplyr)
library(lubridate)
df_date %>%
as_tibble() %>%
mutate(Week = week(ymd(Dates)),
Year = year(ymd(Dates))) %>%
count(Name, Year, Week)
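As a rough sketch of the isoweek() alternative mentioned above: ISO 8601 weeks start on Monday, and the first days of January can belong to the last week of the previous ISO year, so it is safer to pair isoweek() with isoyear() rather than year(). The small `events` data frame and the `IsoYear`/`IsoWeek` column names are illustrative assumptions, not part of the original answer.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample in the same shape as df_date
events <- data.frame(
  Name  = c("Jim", "Jim", "Sue"),
  Dates = c("2010-1-1", "2010-01-5", "2010-01-5")
)

iso_counts <- events %>%
  mutate(Date    = ymd(Dates),
         IsoYear = isoyear(Date),    # ISO 8601 year that the week belongs to
         IsoWeek = isoweek(Date)) %>% # ISO 8601 week number (Monday start)
  count(Name, IsoYear, IsoWeek)

iso_counts
```

Here 2010-01-01 falls on a Friday, so it is counted in ISO week 53 of ISO year 2009, whereas week() would report week 1.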
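After converting the character Dates column to date format with lubridate's ymd() function, we can use lubridate's week() function to get the week number. Then we can use count(), which is shorthand for group_by(Name, Week) %>% summarise(Count = n()):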
library(dplyr)
library(lubridate)
df_date %>%
as_tibble() %>%
mutate(Week = week(ymd(Dates))) %>%
count(Name, Week)
Name Week n
<chr> <dbl> <int>
1 Jim 1 3
2 Jim 3 2
3 Jim 5 1
4 Jim 6 2
5 Jim 7 1
6 Jim 9 1
7 Sue 1 3
8 Sue 3 2
9 Sue 5 1
10 Sue 6 2
11 Sue 7 1
12 Sue 9 1
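The output above only lists weeks that contain at least one event, while the question's expected table also shows weeks with a count of 0. One possible way to fill those in, offered as a sketch rather than as part of the original answer, is tidyr's complete(). The sample `events` data, the `Events` column name, and the week range 1:3 are assumptions for illustration; for a full year you would widen the range accordingly.

```r
library(dplyr)
library(lubridate)
library(tidyr)

# Hypothetical sample in the same shape as df_date
events <- data.frame(
  Name  = c("Jim", "Jim", "Jim", "Sue"),
  Dates = c("2010-1-1", "2010-1-2", "2010-01-17", "2010-01-5")
)

weekly <- events %>%
  mutate(Week = week(ymd(Dates))) %>%
  count(Name, Week, name = "Events") %>%
  # Add a row for every Name/Week combination; missing counts become 0
  complete(Name, Week = 1:3, fill = list(Events = 0L))

weekly
```

If the Event column can hold values other than 1, count(Name, Week, wt = Event) sums that column instead of counting rows.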