I am performing a left non-equi join with data.table as follows:
OUTPUT <- DT2[DT1, on=.(DOB, FORENAME, SURNAME, POSTCODE, START_DATE <= MONTH, EXPIRY_DATE >= MONTH)]
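OUTPUT contains the correct left join, except that the MONTH column (which is present in DT1) is missing.
Is this a bug in data.table?
Note: START_DATE, EXPIRY_DATE and MONTH are of course all in the same YYYY-MM-DD IDate format. The join result is correct with respect to these non-equi conditions. Only the column is missing, and I need it for further work.
Edit 1: simplified reproducible example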
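A possible workaround, sketched on hypothetical toy data rather than the tables above: in `X[i, on = ...]`, the columns of `i` used in the `on` conditions do not appear under their own names in the result, so copying `MONTH` into an extra column before the join keeps a usable copy. The column name `MONTH_KEEP` and the toy columns `NAME` and `CERT` are assumptions made for illustration.

```r
library(data.table)

# Hypothetical toy data (not the asker's tables)
DT1 <- data.table(ID = 1:2, NAME = c("JOHN", "JACK"),
                  MONTH = as.IDate(c("2016-06-01", "2016-06-01")))
DT2 <- data.table(NAME = "JOHN", CERT = 999,
                  START_DATE = as.IDate("2016-01-01"),
                  EXPIRY_DATE = as.IDate("2016-12-31"))

# Keep an untouched copy of MONTH; it is not a join column,
# so it is carried into the result as an ordinary column of DT1
DT1[, MONTH_KEEP := MONTH]

# Left non-equi join: every row of DT1 is kept, matched or not
OUT <- DT2[DT1, on = .(NAME, START_DATE <= MONTH, EXPIRY_DATE >= MONTH)]

names(OUT)
```

In this sketch the result has no column called `MONTH`, while `MONTH_KEEP` is available for further work.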
DT1 <- structure(list(ID = c(1, 2, 3), FORENAME = c("JOHN", "JACK",
"ROB"), SURNAME = c("JOHNSON", "JACKSON", "ROBINSON"), MONTH = structure(c(16953L,
16953L, 16953L), class = c("IDate", "Date"))), .Names = c("ID",
"FORENAME", "SURNAME", "MONTH"), row.names = c(NA, -3L), class = c("data.table",
"data.frame"))
DT2 <- structure(list(CERT_NUMBER = 999, FORENAME = …

I have two data tables:
library(data.table)
d1 <- data.table(grp = c("a", "c", "b", "a"), val = c(2, 3, 6, 7), y1 = 1:4, y2 = 5:8)
d2 <- data.table(grp = rep(c("a", "b", "c"), 2),
from = rep(c(1, 5), each = 3), to = rep(c(4, 10), each = 3), z = 11:16)
I perform a non-equi join in which the value of 'val' in 'd1' should fall within the range defined by 'from' and 'to' for each group 'grp'.
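To make the intended condition concrete, here is a minimal sketch that repeats the join but names the output columns explicitly in `j`. It assumes the `x.` and `i.` prefixes that data.table provides for referring to the columns of the outer and inner table inside a join; the result name `res` is introduced here for illustration.

```r
library(data.table)

d1 <- data.table(grp = c("a", "c", "b", "a"), val = c(2, 3, 6, 7), y1 = 1:4, y2 = 5:8)
d2 <- data.table(grp = rep(c("a", "b", "c"), 2),
                 from = rep(c(1, 5), each = 3), to = rep(c(4, 10), each = 3), z = 11:16)

# x.val is the original 'val' from d1; i.from / i.to are the range bounds from d2
res <- d1[d2, on = .(grp, val >= from, val <= to), nomatch = 0,
          .(grp, val = x.val, from = i.from, to = i.to, y1, y2, z)]

# Every returned 'val' lies inside its [from, to] range
all(res$val >= res$from & res$val <= res$to)
```

Selecting the columns in `j` avoids having to guess which table's values end up under the names `val` and `val.1`.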
d1[d2, on = .(grp, val >= from, val <= to), nomatch = 0]
# grp val y1 y2 val.1 z
# 1: a 1 1 5 4 11
# 2: c 1 …

Joining the data tables:
X <- data.table(A = 1:4, B = c(1,1,1,1))
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1
Y <- data.table(A = 4)
# A
# 1: 4
with
X[Y, on = .(A == A)]
# A B
# 1: 4 1
returns the expected result. However, I would expect this line:
X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1
to return
A B
1: 1 1 …
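A minimal sketch of one way to get the values of `X` back: in the join result, the column named `A` carries the value from `Y`, which is why 4 is repeated for each matching row. Referring to `x.A` in `j` selects the column from `X` instead. The result name `res` is an assumption made for illustration.

```r
library(data.table)

X <- data.table(A = 1:4, B = c(1, 1, 1, 1))
Y <- data.table(A = 4)

# Rows of X whose A is below Y's A, reporting X's own A via the x. prefix
res <- X[Y, on = .(A < A), .(A = x.A, B)]
res
```

The join condition itself is unchanged; only the columns reported in the result differ.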