cumsum by group

Lov*_*ust 6 r cumsum

Suppose the data looks like this:

group1 group2 num
A      sg     1
A      sh     2
A      sg     4
B      at     3
B      al     7

a <- cumsum(data[,"num"]) # 1 3 7 10 17

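I need a cumulative sum by group. I actually have several columns that serve as grouping indicators, and I want the cumulative sum within the subgroups I define.

For example, if I group by group1 only, the output should be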

group1 sum
A      1
A      3
A      7
B      3
B      10

If I group by both variables, group1 and group2, the output is

group1 group2 sum
A      sg     1
A      sh     2
A      sg     5
B      at     3
B      al     7

Qui*_*ber 7

library(data.table)

data <- data.table(group1=c('A','A','A','B','B'),sum=c(1,2,4,3,7))

data[,list(cumsum = cumsum(sum)),by=list(group1)]
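The snippet above covers only the group1 case. For the two-column case from the question, a minimal sketch along the same lines might look like the following. It assumes the question's column names (group1, group2, num); the result column names cumsum1 and cumsum2 are made up for illustration. It uses `:=` to add the running totals by reference, so the original row order stays as it was.

```r
library(data.table)

# Same toy data as in the question.
dt <- data.table(
  group1 = c("A", "A", "A", "B", "B"),
  group2 = c("sg", "sh", "sg", "at", "al"),
  num    = c(1, 2, 4, 3, 7)
)

# Running total within group1 only.
dt[, cumsum1 := cumsum(num), by = group1]

# Running total within each (group1, group2) combination.
# cumsum1 and cumsum2 are illustrative names, not from the original post.
dt[, cumsum2 := cumsum(num), by = .(group1, group2)]

print(dt)
```

Any number of grouping columns can be listed inside `.()`.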


Jas*_*gan 6

Besides data.table, tapply in base R works fine for both cases:

dta <- read.table(text="
group1 group2 num
A      sg     1
A      sh     2
A      sg     4
B      at     3
B      al     7", header=TRUE)

dta$cumsum <- do.call(c, tapply(dta$num, dta$group1, FUN=cumsum))

Computing the cumulative sum over both groups takes some reordering:

dta <- dta[order(dta$group1, dta$group2, dta$num),]

dta$cumsum2 <- do.call(c, tapply(dta$num, 
                                 paste0(dta$group1, dta$group2), 
                                 FUN=cumsum))
dta
      group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
3      A     sg   4      7       5
2      A     sh   2      3       2
5      B     al   7     10       7
4      B     at   3      3       3

If you need the original order:

# order() on the row names restores the original order for any permutation
dta[order(as.numeric(rownames(dta))),]
  group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
2      A     sh   2      3       2
3      A     sg   4      7       5
4      B     at   3      3       3
5      B     al   7     10       7
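As a side note, the tapply approach relies on the rows already being sorted by the grouping columns, because do.call(c, ...) returns the values group by group. One way to skip the sorting step might be base R's ave(), which applies a function within each group and puts the results back in the original row positions. A rough sketch on the same data:

```r
dta <- read.table(text = "
group1 group2 num
A      sg     1
A      sh     2
A      sg     4
B      at     3
B      al     7", header = TRUE)

# ave() splits num by the grouping variable(s), applies FUN to each piece,
# and returns a vector aligned with the original rows.
dta$cumsum  <- ave(dta$num, dta$group1, FUN = cumsum)
dta$cumsum2 <- ave(dta$num, dta$group1, dta$group2, FUN = cumsum)

print(dta)
```

Note that FUN has to be passed by name, since unnamed arguments after the first are treated as grouping variables.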