R: Creating first differences for a mix of grouped and ungrouped columns

Foo*_*Bar 2 r data.table

I have this data.table, which holds some group-specific data as well as some general data:

         group year      flow      agg
   1: 51557094 2010   3.46000 592649.6
   2: 51557133 1999 111.60000 522706.2
   3: 51557133 2000  29.36000 555279.7
   4: 51557133 2003  96.38000 592649.6
   5: 51557193 2004  65.22000 550622.4

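Here flow is specific to each group-year combination, while agg is specific to year only. I want to compute first differences: for flow, within each group and between consecutive years; for agg, without any grouping, simply between consecutive years.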

I would prefer an approach that does not involve dplyr.

Expected output

         group year     dFlow      dAgg
   1: 51557094 2010        NA        NA
   2: 51557133 1999        NA        NA
   3: 51557133 2000    -82.24   32573.5
   4: 51557133 2003        NA        NA
   5: 51557193 2004        NA  -42027.2

akr*_*run 5

You could try

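 library(data.table)
 # ind stores the original row order; the chain sorts by year, takes the
 # differences, restores the order, and drops flow, agg and ind (columns 3:5)
 myDataTable[, ind:= 1:.N][order(year)][seq_len(.N) %in% 1:2, 
            dFlow:=c(NA, diff(flow)) , by = group][,
            # a new block starts whenever the gap between years is not exactly 1
            dAgg:= c(NA, diff(agg)), cumsum(c(TRUE, diff(year)!=1))][
               order(ind)][,3:5 := NULL][]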
  #      group year  dFlow     dAgg
  #1: 51557094 2010     NA       NA
  #2: 51557133 1999     NA       NA
  #3: 51557133 2000 -82.24  32573.5
  #4: 51557133 2003     NA       NA
  #5: 51557193 2004     NA -42027.2
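
The grouping expression cumsum(c(TRUE, diff(year)!=1)) is what keeps dAgg from being computed across a gap in the years. A small base R illustration, using the sorted years from the sample data (the name run_id is only for illustration):

```r
# Years from the sample data, sorted ascending
year <- c(1999L, 2000L, 2003L, 2004L, 2010L)

# diff(year) is 1, 3, 1, 6; a gap other than 1 starts a new run,
# and the leading TRUE starts the first run
run_id <- cumsum(c(TRUE, diff(year) != 1))
run_id
# [1] 1 1 2 2 3
```

Taking c(NA, diff(agg)) within each run therefore gives NA for the first year of every run (1999, 2003, 2010) and a real difference only where the previous year is present.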

Data

df2 <- structure(list(group = c(51557094L, 51557133L, 51557133L, 
51557133L, 
51557193L), year = c(2010L, 1999L, 2000L, 2003L, 2004L),
flow = c(3.46, 
111.6, 29.36, 96.38, 65.22), agg = c(592649.6, 522706.2, 555279.7, 
592649.6, 550622.4)), .Names = c("group", "year", "flow", "agg"
), class = "data.frame", row.names = c("1:", "2:", "3:", "4:", 
"5:"))

myDataTable <- as.data.table(df2)
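
The filter seq_len(.N) %in% 1:2 in the chain limits the dFlow step to the first two rows after sorting, which happens to fit this sample. For other data, a sketch along the following lines might be easier to adapt. It assumes that a difference is wanted only when the previous observation is exactly one year earlier, and that agg has a single value per year. The names dt, lag1_diff, yr and res are hypothetical, not part of the original answer.

```r
library(data.table)

# Same values as the sample data above
dt <- data.table(
  group = c(51557094L, 51557133L, 51557133L, 51557133L, 51557193L),
  year  = c(2010L, 1999L, 2000L, 2003L, 2004L),
  flow  = c(3.46, 111.6, 29.36, 96.38, 65.22),
  agg   = c(592649.6, 522706.2, 555279.7, 592649.6, 550622.4)
)

# Hypothetical helper: difference to the previous element, kept only when the
# previous element is exactly one year earlier; NA otherwise.
# Expects x and year to be sorted by year already.
lag1_diff <- function(x, year) {
  d   <- x - shift(x)
  gap <- year - shift(year)
  d[is.na(gap) | gap != 1L] <- NA
  d
}

dt[, ind := .I]                      # remember the original row order
setorder(dt, group, year)
dt[, dFlow := lag1_diff(flow, year), by = group]

# agg is year-specific, so difference it on one row per year and join back
yr <- unique(dt[, .(year, agg)])
setorder(yr, year)
yr[, dAgg := lag1_diff(agg, year)]
dt[yr, dAgg := i.dAgg, on = "year"]

setorder(dt, ind)                    # restore the original row order
res <- dt[, .(group, year, dFlow, dAgg)]
res
```

Computing dAgg on the per-year table and joining it back avoids differencing agg between two rows that share the same year, which could happen once several groups are observed in one year.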