Grouping factor observations in R

Say*_*ari 1 r dplyr

I have a data frame with a structure similar to this:

Year <- c("2000", "2001", "2002" ,"2003", "2004", "2005" ,"2006", "2007", "2008", "2009", "2010", "2011" ,"2012", "2013", "2014", "2015")
Sales <- c(2000,4800,6700,5000,7000,8000,3070,2000,1800,7100,6600,5000,6000,4200,1200,5700)
salesDF <- data.frame(Year, Sales, stringsAsFactors = TRUE)  # keeps Year a factor on R >= 4.0

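The Year column is a factor variable. I want to mutate a new column that groups the observations in the Year column into 5-year intervals, so that the sales trend is ultimately shown in steps of 5 years.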

I want my legend to have the breaks "2000", "2005", "2010", "2015".

How can I achieve this?

Ian*_*ell 6

Here is a simple way to group using cumsum and the modulo operator (%%):

library(dplyr)

salesDF %>% 
  # factor -> character -> numeric recovers the year; refer to Year directly inside mutate()
  mutate(Group = cumsum(as.numeric(as.character(Year)) %% 5 == 0)) %>%
  group_by(Group) %>%
  summarize(Year = first(Year), Mean = mean(Sales), Sum = sum(Sales))
# A tibble: 4 x 4
  Group Year   Mean   Sum
  <int> <fct> <dbl> <dbl>
1     1 2000   5100 25500
2     2 2005   4394 21970
3     3 2010   4600 23000
4     4 2015   5700  5700
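To see why the cumsum and %% combination produces the group numbers, it may help to look at the intermediate vectors. The following is a minimal base R sketch; the names year_num, starts, and group are made up for illustration and are not part of the answer.

```r
# Base R only; this vector repeats the Year values from the question.
Year <- factor(as.character(2000:2015))

# Convert factor -> character -> numeric to recover the real years
# (as.numeric() on a factor alone would return the internal level codes).
year_num <- as.numeric(as.character(Year))

# TRUE wherever a year is a multiple of 5, i.e. at the start of each interval.
starts <- year_num %% 5 == 0

# cumsum() counts TRUE as 1, so the running total steps up at each interval start.
group <- cumsum(starts)

print(data.frame(Year, starts, group))
```

Note that this relies on the rows being sorted by year. If the first year in the data is not a multiple of 5, the earliest rows end up in group 0 rather than group 1.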

Or as a new column, without summarizing:

salesDF %>% 
  # same grouping as above, but mutate() keeps all 16 rows
  mutate(Group = cumsum(as.numeric(as.character(Year)) %% 5 == 0)) %>%
  group_by(Group) %>%
  mutate(Mean = mean(Sales), Sum = sum(Sales))
# A tibble: 16 x 5
# Groups:   Group [4]
   Year  Sales Group  Mean   Sum
   <fct> <dbl> <int> <dbl> <dbl>
 1 2000   2000     1  5100 25500
 2 2001   4800     1  5100 25500
 3 2002   6700     1  5100 25500
...
14 2013   4200     3  4600 23000
15 2014   1200     3  4600 23000
16 2015   5700     4  5700  5700
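Since the question asks for legend breaks labelled "2000", "2005", "2010", "2015", one possible variation (not from the answer above) is to compute the starting year of each interval directly with integer division, so the grouping column carries the label itself and does not depend on row order. This is only a sketch: YearNum, Interval, labelled, and by_interval are hypothetical names, and the data frame is rebuilt from the values in the question.

```r
library(dplyr)

# Same data as in the question, with Year stored as a factor.
salesDF <- data.frame(
  Year  = factor(2000:2015),
  Sales = c(2000, 4800, 6700, 5000, 7000, 8000, 3070, 2000,
            1800, 7100, 6600, 5000, 6000, 4200, 1200, 5700)
)

# Interval (hypothetical column name) holds the first year of each 5-year block:
# %/% is integer division, so 2003 %/% 5 * 5 gives 2000.
labelled <- salesDF %>%
  mutate(
    YearNum  = as.numeric(as.character(Year)),
    Interval = factor(YearNum %/% 5 * 5)
  )

# Summarize per interval; the factor levels can serve as legend labels.
by_interval <- labelled %>%
  group_by(Interval) %>%
  summarize(Mean = mean(Sales), Sum = sum(Sales))

print(by_interval)
```

Because Interval is a factor with levels 2000, 2005, 2010, and 2015, mapping it to a colour or fill aesthetic in a plot should give a legend with those four entries.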