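I have two data frames, each with multiple rows per ID. For each row of the first data frame, I need to find the row in the second data frame that has the same ID and the closest date, and add that date and its associated data to the first data frame. This also has to work with NAs in the second data frame. Example data: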
set.seed(42)
df1 <- data.frame(
  ID = sample(1:3, 10, replace = TRUE),
  dateTarget = strptime(paste(sprintf("%02d", sample(1:30, 10, replace = TRUE)),
                              sprintf("%02d", sample(1:12, 10, replace = TRUE)),
                              sprintf("%02d", sample(2013:2015, 10, replace = TRUE)), sep = ""), "%d%m%Y"),
  Value = sample(15:100, 10, replace = TRUE))
# ID has length 10 here and is recycled to match the 20 dates and values
df2 <- data.frame(
  ID = sample(1:3, 10, replace = TRUE),
  dateTarget = strptime(paste(sprintf("%02d", sample(1:30, 20, replace = TRUE)),
                              sprintf("%02d", sample(1:12, 20, replace = TRUE)),
                              sprintf("%02d", sample(2013:2015, 20, replace = TRUE)), sep = ""), "%d%m%Y"),
  ValueMatch = sample(15:100, 20, replace = TRUE))
Something from base is preferred, perhaps a mix of split and the apply family?
The final table would look like this:
  ID dateTarget Value dateMatch ValueMatch
1  3   22-02-15    52  09-03-15         94
2  1   29-12-14    18  06-12-14         88
3  3   08-12-15    98  06-07-15         48
4  2   14-01-13    52  08-04-13         77
5  2   29-07-15    97  01-08-15         68
6  3   30-05-13    91  01-04-13         85
7  1   04-11-13    70  21-02-14         35
8  2   15-06-15    98  01-08-15         68
9  3   17-11-14    68  15-12-14         95
PS: Is there a better way to generate random dates (without using seq.Date)?
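On the PS, one possible approach is to add random whole-day offsets to a start date, which avoids both seq.Date and impossible dates such as 30 February. This is only a sketch: the names `start`, `end`, `n_days` and `random_dates` are made up for illustration, and the date range is assumed from the years used in the example data.

```r
set.seed(42)

# Assumed range, chosen to cover the years 2013-2015 used in the example data
start <- as.Date("2013-01-01")
end   <- as.Date("2015-12-31")

# Number of whole days between the two end points
n_days <- as.integer(end - start)

# Draw 10 day offsets uniformly and add them to the start date;
# every result is a valid Date, so no NA values are produced
random_dates <- start + sample(0:n_days, 10, replace = TRUE)
print(random_dates)
```

Because the offsets are plain integers, the same idea extends to date-times by sampling seconds and adding them to a POSIXct value.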
Mar*_*pov 10
Here is a solution that uses only base packages:
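# Process each ID present in both data frames separately
z <- lapply(intersect(df1$ID, df2$ID), function(id) {
  d1 <- subset(df1, ID == id)
  d2 <- subset(df2, ID == id)
  # For each target date, find the row of d2 with the smallest absolute time difference;
  # which.min() ignores NA, so missing dates in d2 are never chosen
  d1$indices <- sapply(d1$dateTarget, function(d) which.min(abs(d2$dateTarget - d)))
  d2$indices <- seq_len(nrow(d2))
  # Join each row of d1 to its nearest row of d2
  merge(d1, d2, by = c('ID', 'indices'))
})
z2 <- do.call(rbind, z)  # stack the per-ID results
z2$indices <- NULL       # drop the helper column
print(z2)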
# ID dateTarget.x Value dateTarget.y ValueMatch
# 1 3 2015-11-14 47 2015-07-06 48
# 2 3 2015-12-08 98 2015-07-06 48
# 3 3 2015-02-22 52 2015-03-09 94
# 4 3 2014-11-17 68 2014-12-15 95
# 5 3 2013-05-30 91 2013-04-01 85
# 6 1 2013-11-04 70 2014-02-21 35
# 7 1 2014-12-29 18 2014-12-06 88
# 8 2 2013-01-14 52 2013-04-08 77
# 9 2 2015-07-29 97 2015-08-01 68
# 10 2 2015-06-15 98 2015-08-01 68
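The core of the base approach is the `which.min(abs(...))` step. A minimal sketch of that step on its own, using made-up toy dates rather than the question's data (`target`, `candidates` and `nearest_index` are hypothetical names), shows how an NA among the candidate dates is skipped:

```r
# Hypothetical toy data, not taken from the question
target     <- as.Date(c("2015-02-22", "2014-12-29"))
candidates <- as.Date(c("2015-03-09", NA, "2014-12-06", "2015-07-06"))

# Index of the candidate closest to date d.
# which.min() discards NA, so a missing candidate date is never selected.
# Note: if d itself is NA, which.min() returns integer(0) and vapply() fails.
nearest_index <- function(d, pool) {
  which.min(abs(as.numeric(pool - d)))
}

idx <- vapply(target, nearest_index, integer(1), pool = candidates)
print(idx)              # positions of the nearest candidates
print(candidates[idx])  # the matched dates themselves
```

Here the first target is 15 days from the first candidate and the second target is 23 days from the third, so those are the rows picked.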
With data.table, there is a simple and elegant solution:
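library(data.table)
setDT(df1)  # convert both data frames to data.tables by reference
setDT(df2)
# Key df2 by ID and date, and keep a copy of its date, because the join result shows df1's dateTarget
setkey(df2, ID, dateTarget)[, dateMatch := dateTarget]
# Rolling join: for each row of df1, take the df2 row with the same ID and the nearest date
df2[df1, roll = 'nearest']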
ID dateTarget ValueMatch dateMatch Value
1: 3 2015-11-14 48 2015-07-06 47
2: 3 2015-02-22 94 2015-03-09 52
3: 1 2014-12-29 88 2014-12-06 18
4: 3 2015-12-08 48 2015-07-06 98
5: 2 2013-01-14 77 2013-04-08 52
6: 2 2015-07-29 68 2015-08-01 97
7: 3 2013-05-30 85 2013-04-01 91
8: 1 2013-11-04 35 2014-02-21 70
9: 2 2015-06-15 68 2015-08-01 98
10: 3 2014-11-17 95 2014-12-15 68
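To make the rolling join easier to follow, here is a minimal sketch on toy data (all values are hypothetical, not from the question). It assumes a data.table version that accepts `on=` together with `roll=`, and uses that instead of `setkey`. The roll applies to the last join column, so `ID` is matched exactly and `dateTarget` is matched to the nearest value.

```r
library(data.table)

# Hypothetical toy tables, not taken from the question
dt1 <- data.table(
  ID = c(1L, 1L, 2L),
  dateTarget = as.Date(c("2015-01-10", "2015-06-01", "2015-03-15")),
  Value = c(10, 20, 30))
dt2 <- data.table(
  ID = c(1L, 1L, 2L, 2L),
  dateTarget = as.Date(c("2015-01-01", "2015-05-20", "2015-03-01", "2015-04-20")),
  ValueMatch = c(100, 200, 300, 400))

# Keep a copy of dt2's own date: after the join, dateTarget holds dt1's values
dt2[, dateMatch := dateTarget]

# Exact match on ID, nearest match on dateTarget (the last join column)
res <- dt2[dt1, on = .(ID, dateTarget), roll = "nearest"]
print(res)
```

The result has one row per row of `dt1`, in the order of `dt1`, with `dateMatch` and `ValueMatch` taken from the nearest row of `dt2` for the same ID.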