Aggregating columns generically with data.table

Lau*_*šys 0 r data.table

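I am using the data.table package and trying to aggregate data in a generic way. I have multiple columns that I want to summarize. I create the initial data table with the following script: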

library(data.table)
dt <- data.table(x.1 = rnorm(10, 20, 3), x.2 = rnorm(10, 20, 3), x.3 = rnorm(10, 20, 3),
                 y.1 = rnorm(10, 20, 3), y.2 = rnorm(10, 20, 3), y.3 = rnorm(10, 20, 3),
                 z.1 = rnorm(10, 20, 3), z.2 = rnorm(10, 20, 3), z.3 = rnorm(10, 20, 3))

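What I want to achieve is to aggregate the columns {x.1, x.2, x.3, y.1, y.2, y.3, z.1, z.2, z.3} into the columns {x.total, y.total, z.total} by taking the row-wise sum within each prefix group.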

I can do this with a for loop:

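prefixes <- c('x', 'y', 'z')
initial.colnames <- names(dt)

for (i in seq_len(nrow(dt))) {
  for (pref in prefixes) {
    # names of the columns in this group, e.g. x.1, x.2, x.3
    cols <- grep(paste0('^', pref, '\\.'), initial.colnames, value = TRUE)
    # write the sum for row i into row i only, not into the whole column
    dt[i, (paste0(pref, '.total')) := sum(unlist(.SD)), .SDcols = cols]
  }
}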

However, I would like to do this with a single inline data.table expression, something like this:

dt[, eval(paste0(prefixes, '.total')) := sum(dt[,eval(grep(prefixes, initial.colnames))]), with = F]

But this does not give me the desired result.

Does anyone have an idea how I could do this the right way?

Fra*_*ank 6

Here is one way to aggregate, using melt:

mDT = melt(dt[, r := .I], measure.vars = patterns(prefixes), value.name=prefixes)

mDT[, lapply(.SD, sum), by=r, .SDcols=prefixes]


     r        x        y        z
 1:  1 63.65898 65.41892 56.40470
 2:  2 60.58634 62.71055 48.69771
 3:  3 50.12036 60.06289 66.38637
 4:  4 55.42629 63.38670 56.98914
 5:  5 59.94042 54.28727 49.20218
 6:  6 59.51313 67.53499 59.24097
 7:  7 63.26874 62.23262 60.70875
 8:  8 54.90082 76.09135 58.79787
 9:  9 56.35402 52.11372 60.37903
10: 10 52.77926 55.06044 53.75093

  • Very nice there (3 upvotes)
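
To see what the melt step produces, here is a minimal sketch on a small deterministic table. The table `toy` and its values are assumptions made up for illustration; the patterns are anchored (`"^x\\."`) here, which is slightly stricter than the `patterns(prefixes)` call in the answer.

```r
library(data.table)

# Hypothetical toy table with the same <prefix>.<index> naming scheme
toy <- data.table(x.1 = 1:2, x.2 = 3:4, y.1 = 5:6, y.2 = 7:8)
toy[, r := .I]  # row id, so rows can be regrouped after melting

# Each pattern gathers one family of columns into a single value column
long <- melt(toy, id.vars = "r",
             measure.vars = patterns("^x\\.", "^y\\."),
             value.name = c("x", "y"))

# long has one row per (r, variable) pair; summing by r gives the row totals
totals <- long[, lapply(.SD, sum), by = r, .SDcols = c("x", "y")]
print(totals)
```

The long table has four rows (two original rows times two columns per prefix), and `totals` has one row per original row, so the result can be joined back to the original table on `r` if the totals are needed alongside the source columns.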

akr*_*run 5

We can use Map with Reduce:

dt[,paste0(prefixes, '.total'):= Map(function(i)  Reduce('+',as.list(.SD[,i, with=FALSE])), 
                                split(names(dt), sub('\\..*', '', names(dt))))]
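
The Map/Reduce idea can be checked on a small deterministic table. This is only a sketch: the table `toy`, the `groups` variable, and the `.check` columns are assumptions introduced for illustration, and `rowSums` is used as an independent cross-check rather than as part of the answer above.

```r
library(data.table)

# Hypothetical toy table with the same <prefix>.<index> naming scheme
toy <- data.table(x.1 = 1:2, x.2 = 3:4, y.1 = 5:6, y.2 = 7:8)

# Group the column names by prefix: list(x = c("x.1", "x.2"), y = c("y.1", "y.2"))
groups <- split(names(toy), sub("\\..*", "", names(toy)))

# Map/Reduce: for each group, add its columns together element-wise
toy[, (paste0(names(groups), ".total")) :=
      Map(function(cols) Reduce(`+`, as.list(.SD[, cols, with = FALSE])), groups)]

# Cross-check with rowSums, one prefix at a time
for (p in names(groups)) {
  toy[, (paste0(p, ".check")) := rowSums(.SD), .SDcols = groups[[p]]]
}
print(toy)
```

Naming the new columns from `names(groups)` keeps them aligned with the order that `split` returns (sorted by prefix). Pairing the result with a separately defined `prefixes` vector works only when that vector happens to be in the same order. Note also that `groups` is computed before any total columns are added, so the totals themselves are never picked up as inputs.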