Using roll = TRUE together with allow.cartesian = TRUE

Cor*_*one 5 join r cartesian-product data.table

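What is the best way to perform a cartesian join with forward rolling, where the roll is applied to each individual series in the joined table rather than to the table as a whole?

This is best explained with an example: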

library(data.table)
A = data.table(x = c(1,2,3,4,5), y = letters[1:5])
B = data.table(x = c(1,2,3,1,4), f = c("Alice","Alice","Alice", "Bob","Bob"), z = 101:105)
setkey(B,x)
C = B[A, roll = TRUE, allow.cartesian=TRUE, rollends = FALSE]

A
B
C[f == "Alice"]
C[f == "Bob"]
C

So we have two starting tables:

> A
   x y
1: 1 a
2: 2 b
3: 3 c
4: 4 d
5: 5 e
> B
   x     f   z
1: 1 Alice 101
2: 1   Bob 104
3: 2 Alice 102
4: 3 Alice 103
5: 4   Bob 105

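I want to join these so that for each x value in A I get both an Alice row and a Bob row, rolling the previous value forward when either one is missing (but without rolling past the end of a series). My current attempt does not quite do this. Here is what I get now: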

> C[f == "Alice"]
   x     f   z y
1: 1 Alice 101 a
2: 2 Alice 102 b
3: 3 Alice 103 c
> C[f == "Bob"]
   x   f   z y
1: 1 Bob 104 a
2: 4 Bob 105 d
> C
   x     f   z y
1: 1 Alice 101 a
2: 1   Bob 104 a
3: 2 Alice 102 b
4: 3 Alice 103 c
5: 4   Bob 105 d
6: 5    NA  NA e

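Because Alice has rows at x = 2 and x = 3, those keys match exactly and Bob's data is not rolled forward. I need the extra rows for Bob, so what I would like to get is: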

> C[f == "Alice"]
   x     f   z y
1: 1 Alice 101 a
2: 2 Alice 102 b
3: 3 Alice 103 c
> C[f == "Bob"]
   x   f   z y
1: 1 Bob 104 a
2: 2 Bob 104 b  # THESE ROWS ARE MISSING
3: 3 Bob 104 c  # THESE ROWS ARE MISSING
4: 4 Bob 105 d
> C
   x     f   z y
1: 1 Alice 101 a
2: 1   Bob 104 a
3: 2 Alice 102 b
4: 2   Bob 104 b  # THESE ROWS ARE MISSING
5: 3 Alice 103 c
6: 3   Bob 104 c  # THESE ROWS ARE MISSING
7: 4   Bob 105 d
8: 5    NA  NA e

edd*_*ddi 4

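Here you go. Key B by f and then x so that the roll is applied within each f, roll B onto every combination of f and x, then join the result to A: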

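# Key by series (f) first, then x, so the roll is applied within each f
setkey(B, f, x)

# Roll B onto every (f, x) combination, re-key by x, then join to A
setkey(B[CJ(unique(f), unique(x)), allow.cartesian = TRUE,
         roll = TRUE, rollends = c(FALSE, FALSE)], x)[A, allow.cartesian = TRUE]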
#   x     f   z y
#1: 1 Alice 101 a
#2: 1   Bob 104 a
#3: 2 Alice 102 b
#4: 2   Bob 104 b
#5: 3 Alice 103 c
#6: 3   Bob 104 c
#7: 4 Alice  NA d
#8: 4   Bob 105 d
#9: 5    NA  NA e

You can filter out the NAs to suit your needs.
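As a rough sketch of that filtering step, the one-liner can be unpacked into intermediate tables and then filtered. This assumes the goal is the desired output from the question: drop the Alice row at x = 4, which has no z, but keep the unmatched x = 5 row that comes from A. The names `grid`, `rolled`, `res` and `filtered` are made up for illustration, and the filter condition is only one guess at what the asker needs.

```r
library(data.table)

# Same starting tables as in the question
A <- data.table(x = c(1, 2, 3, 4, 5), y = letters[1:5])
B <- data.table(x = c(1, 2, 3, 1, 4),
                f = c("Alice", "Alice", "Alice", "Bob", "Bob"),
                z = 101:105)

# Key by series first, then x, so rolling happens within each f
setkey(B, f, x)

# Every (f, x) combination present in B (hypothetical helper table)
grid <- CJ(f = unique(B$f), x = unique(B$x))

# Roll each series forward onto the grid, without rolling past either end
rolled <- B[grid, roll = TRUE, rollends = c(FALSE, FALSE)]

# Re-key by x and join to A; x = 5 has no match, so f and z are NA there
setkey(rolled, x)
res <- rolled[A, allow.cartesian = TRUE]

# Assumed filter: drop rows where a known series has no z,
# but keep rows of A that matched nothing at all (f is NA)
filtered <- res[!is.na(z) | is.na(f)]
print(filtered)
```

If the unmatched rows of A are not wanted either, `res[!is.na(z)]` would be the simpler filter.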