data.table: operations on group-shifted data

Bla*_*sad 1 r data.table

Consider the following data.table:

DT <- data.table(year    = c(2011,2012,2013,2011,2012,2013,2011,2012,2013),
                 level   = c(137,137,137,136,136,136,135,135,135),
                 valueIn = c(13,30,56,11,25,60,8,27,51))

I want the following output:

DT <- data.table(year     = c(2011,2012,2013,2011,2012,2013,2011,2012,2013),
                 level    = c(137,137,137,136,136,136,135,135,135),
                 valueIn  = c(13,30,56, 11,25,60, 8,27,51),
                 valueOut = c(12,27.5,58, 9.5,26,55.5, NA,NA,NA))

In other words, for each year I want to compute (valueIn[level] + valueIn[level-1]) / 2. For example, the first value is calculated as (13 + 11) / 2 = 12.
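To make the arithmetic concrete, here is a minimal base-R sketch for the year 2011 only. The names `v2011` and `out2011` are illustrative and not part of the original question; the values are taken from the table above, ordered by level 137, 136, 135.

```r
# valueIn for year 2011, ordered by level 137, 136, 135 (from the table above)
v2011 <- c(13, 11, 8)

# Pair each value with the one at the next lower level;
# the lowest level has no partner, so it is paired with NA.
partner <- c(v2011[-1], NA)

# Average of each value and its partner
out2011 <- (v2011 + partner) / 2
print(out2011)  # 12.0, 9.5, NA
```

This matches the 2011 rows of the desired `valueOut` column.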

Currently I do this with a for loop, in which I create a subset of the data.table for each level:

levelDtList <- list()
levels <- sort(unique(DT$level))  # visit each level once, in ascending order
for (this.level in levels) {
  levelDt   <- DT[level == this.level]
  if (this.level == min(levels)) {
    valueOut <- NA
  } else {
    levelM1Data <- levelDtList[[this.level - 1]]
    valueOut <- (levelDt$valueIn + levelM1Data$valueIn) / 2
  }
  levelDt$valueOut <- valueOut
  levelDtList[[this.level]] <- levelDt
}
datatable <- rbindlist(levelDtList)

This is ugly and slow, so I am looking for a better, faster, more efficient data.table-based solution.

Jaa*_*aap 5

Use the shift function with type = 'lead' to get the next value within each year, add it to the current value, and divide by 2:

DT[, valueOut := (valueIn + shift(valueIn, type = 'lead'))/2, by = year]

This gives:

   year level valueIn valueOut
1: 2011   137      13     12.0
2: 2012   137      30     27.5
3: 2013   137      56     58.0
4: 2011   136      11      9.5
5: 2012   136      25     26.0
6: 2013   136      60     55.5
7: 2011   135       8       NA
8: 2012   135      27       NA
9: 2013   135      51       NA
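Note that `shift` pairs each row with the next row of its group, so the result above relies on the rows of each year already being in descending `level` order, as they are in the example data. If that ordering is not guaranteed, one possible approach (a sketch, not part of the original answer) is to sort with `setorder` first. The shuffled row order and the name `shuffled` below are hypothetical, used only to illustrate the point; the sketch also assumes every year contains consecutive levels, since `shift` does not check that the next row is actually `level - 1`.

```r
library(data.table)

DT <- data.table(year    = c(2011,2012,2013,2011,2012,2013,2011,2012,2013),
                 level   = c(137,137,137,136,136,136,135,135,135),
                 valueIn = c(13,30,56,11,25,60,8,27,51))

# Hypothetical: the same rows in an arbitrary order
shuffled <- DT[c(9, 1, 5, 3, 7, 2, 8, 4, 6)]

# Sort by year, then by descending level, so that the "next" row
# within a year is the next lower level
setorder(shuffled, year, -level)

# Same expression as in the answer, now independent of the input row order
shuffled[, valueOut := (valueIn + shift(valueIn, type = "lead")) / 2, by = year]
print(shuffled)
```

`setorder` reorders the table by reference, so the result comes back grouped by year rather than in the original row order.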

The same call with all arguments of the shift function specified:

DT[, valueOut := (valueIn + shift(valueIn, n = 1L, fill = NA, type = 'lead'))/2, by = year]
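To see what each argument does, here is a small sketch applying `shift` to a plain vector (the 2011 values from the example). The variable names are illustrative only.

```r
library(data.table)

v <- c(13, 11, 8)

# type = "lead": each position takes the following value; the end is padded with fill
lead1 <- shift(v, n = 1L, fill = NA, type = "lead")   # 11, 8, NA

# type = "lag": each position takes the preceding value; the start is padded with fill
lag1 <- shift(v, n = 1L, fill = NA, type = "lag")     # NA, 13, 11

# fill controls the padding value used where no neighbour exists
lead0 <- shift(v, n = 1L, fill = 0, type = "lead")    # 11, 8, 0
```

In the answer, `fill = NA` is what produces the `NA` values for level 135, the lowest level in each year.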