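This question is similar to "Create a bunch of lagged variables in data.table at once" and "How to create a lag variable within each group?", but as far as I can tell it is not quite the same.
I want to create several lead variables, such as lead1, lead2 and lead3 below, computed within each group defined by groups.
Sample data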
require(data.table)
set.seed(1)
data <- data.table(time =c(1:10,1:8),groups = c(rep(c("a","b"),c(10,8))),
value = rnorm(18))
data
time groups value
1: 1 a -0.62645381
2: 2 a 0.18364332
3: 3 a -0.83562861
4: 4 a 1.59528080
5: 5 a 0.32950777
6: 6 a -0.82046838
7: 7 a 0.48742905
8: 8 a 0.73832471
9: 9 a 0.57578135
10: 10 a -0.30538839
11: 1 b 1.51178117
12: 2 b 0.38984324
13: 3 b -0.62124058
14: 4 b -2.21469989
15: 5 b 1.12493092
16: 6 b -0.04493361
17: 7 b -0.01619026
18: 8 b 0.94383621
The resulting data table should be
time groups value lead1 lead2 lead3
1 1 a -0.62645381 0.18364332 -0.83562861 1.59528080
2 2 a 0.18364332 -0.83562861 1.59528080 0.32950777
3 3 a -0.83562861 1.59528080 0.32950777 -0.82046838
4 4 a 1.59528080 0.32950777 -0.82046838 0.48742905
5 5 a 0.32950777 -0.82046838 0.48742905 0.73832471
6 6 a -0.82046838 0.48742905 0.73832471 0.57578135
7 7 a 0.48742905 0.73832471 0.57578135 -0.30538839
8 8 a 0.73832471 0.57578135 -0.30538839 NA
9 9 a 0.57578135 -0.30538839 NA NA
10 10 a -0.30538839 NA NA NA
11 1 b 1.51178117 0.38984324 -0.62124058 -2.21469989
12 2 b 0.38984324 -0.62124058 -2.21469989 1.12493092
13 3 b -0.62124058 -2.21469989 1.12493092 -0.04493361
14 4 b -2.21469989 1.12493092 -0.04493361 -0.01619026
15 5 b 1.12493092 -0.04493361 -0.01619026 0.94383621
16 6 b -0.04493361 -0.01619026 0.94383621 NA
17 7 b -0.01619026 0.94383621 NA NA
18 8 b 0.94383621 NA NA NA
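Note that my actual dataset is much larger, and I may need more than 3 lead variables.
I'm using data.table version 1.9.4 and I'm not sure when I'll be able to update to the latest version, so a solution that works with this version would be a bonus. Sorry for the extra constraint.
Thanks in advance.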
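The standard data.table approach is to use the built-in shift function (as already mentioned in the linked threads). You'll need the latest stable version on CRAN, v1.9.6 or later.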
library(data.table) # V1.9.6+
data[, paste0("lead", 1L:3L) := shift(value, 1L:3L, type = "lead"), by = groups]
data
# time groups value lead1 lead2 lead3
# 1: 1 a -0.62645381 0.18364332 -0.83562861 1.59528080
# 2: 2 a 0.18364332 -0.83562861 1.59528080 0.32950777
# 3: 3 a -0.83562861 1.59528080 0.32950777 -0.82046838
# 4: 4 a 1.59528080 0.32950777 -0.82046838 0.48742905
# 5: 5 a 0.32950777 -0.82046838 0.48742905 0.73832471
# 6: 6 a -0.82046838 0.48742905 0.73832471 0.57578135
# 7: 7 a 0.48742905 0.73832471 0.57578135 -0.30538839
# 8: 8 a 0.73832471 0.57578135 -0.30538839 NA
# 9: 9 a 0.57578135 -0.30538839 NA NA
# 10: 10 a -0.30538839 NA NA NA
# 11: 1 b 1.51178117 0.38984324 -0.62124058 -2.21469989
# 12: 2 b 0.38984324 -0.62124058 -2.21469989 1.12493092
# 13: 3 b -0.62124058 -2.21469989 1.12493092 -0.04493361
# 14: 4 b -2.21469989 1.12493092 -0.04493361 -0.01619026
# 15: 5 b 1.12493092 -0.04493361 -0.01619026 0.94383621
# 16: 6 b -0.04493361 -0.01619026 0.94383621 NA
# 17: 7 b -0.01619026 0.94383621 NA NA
# 18: 8 b 0.94383621 NA NA NA
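If upgrading is not an option, the same result can probably be obtained without shift. The following is only a minimal sketch: lead_vec and n_leads are hypothetical names introduced here, not part of data.table, and the code has not been checked against v1.9.4 specifically. It assumes every offset is at least 1.

```r
library(data.table)

# Hypothetical helper (not part of data.table): move x forward by n positions
# and pad the tail with NA so the result has the same length as x.
# Assumes n >= 1; n = 0 is not handled.
lead_vec <- function(x, n) {
  c(x[-seq_len(n)], rep(NA, n))[seq_along(x)]
}

# Same sample data as in the question
set.seed(1)
data <- data.table(time = c(1:10, 1:8),
                   groups = rep(c("a", "b"), c(10, 8)),
                   value = rnorm(18))

# Offsets to generate; extend this vector if more lead columns are needed
n_leads <- 1:3

# Build one lead column per offset, computed separately within each group
data[, paste0("lead", n_leads) := lapply(n_leads, function(n) lead_vec(value, n)),
     by = groups]
```

Because the helper is applied inside by = groups, values from group b never leak into the trailing rows of group a. Those rows are filled with NA instead, as in the desired output.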