Cumulative sum by group

geo*_*ory 6 r data.table

For the following data set:

d = data.frame(date = as.Date(as.Date('2015-01-01'):as.Date('2015-04-10'), origin = "1970-01-01"),
               group = rep(c('A','B','C','D'), 25), value = sample(1:100))
head(d)
         date group value
1: 2015-01-01     A     4
2: 2015-01-02     B    32
3: 2015-01-03     C    46
4: 2015-01-04     D    40
5: 2015-01-05     A    93
6: 2015-01-06     B    10

...can anyone suggest a more elegant way to calculate the cumulative sum of value by group than this (data.table) approach?

library(data.table)
setDT(d)
d.cast = dcast.data.table(d, group ~ date, value.var = 'value', fun.aggregate = sum)
c.sum = d.cast[, as.list(cumsum(unlist(.SD))), by = group]

...which is pretty clunky, and yields a flat (wide) matrix that needs tidyr::gather or reshape2::melt to get back into long format.

Surely R can do better than this?

MrF*_*ick 8

If you just want the cumulative sum within each group, you can do

transform(d, new=ave(value,group,FUN=cumsum))

with base R.
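To see what ave() does here, a minimal sketch on a tiny made-up data frame may help. The name toy and its values are assumptions for illustration only, chosen so the result is easy to check by hand (the question's data uses sample(), so its totals differ on every run).

```r
# Toy data, made up for illustration (not the question's random data set)
toy <- data.frame(
  group = c("A", "B", "A", "B", "A"),
  value = c(1, 10, 2, 20, 3)
)

# ave() splits value by group, applies cumsum to each piece,
# and returns the results in the original row order
toy$new <- ave(toy$value, toy$group, FUN = cumsum)
print(toy)
```

Group A accumulates 1, 3, 6 and group B accumulates 10, 30, interleaved in row order. Note that cumsum follows the current row order, so the data should already be sorted by date within each group.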


Akh*_*air 5

This should work

library(dplyr)
d %>% 
  group_by(group) %>% 
  arrange(date) %>% 
  mutate(Total = cumsum(value))
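For comparison with the question's dcast attempt, the same grouped running total can probably be written directly in data.table without reshaping. This is a sketch, not something taken from either answer; toy and the column name Total are assumptions for illustration.

```r
library(data.table)

# Toy data, made up for illustration
toy <- data.table(
  date  = as.Date("2015-01-01") + 0:4,
  group = c("A", "B", "A", "B", "A"),
  value = c(1, 10, 2, 20, 3)
)

# Sort by date, then add the per-group running total by reference
setorder(toy, date)
toy[, Total := cumsum(value), by = group]
print(toy)
```

The result stays in long format, one row per date, so no gather or melt step is needed afterwards.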