Res*_*ium 5 merge join r date range
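I'm looking for an easy way to join two tables by a date range. One table contains exact dates; the other contains two variables marking the beginning and the ending of a time period. A row of the first table should be joined to a row of the second whenever its date falls within that row's range.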
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
name = c('id1','id2','id3','id4'))
data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
class = c(1,2,3,4))
result <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
name = c('id1','id2','id3','id4'),
class = c(1,2,3,4))
Any help? I found some complicated examples, but they don't even work with my data because of the date formats. I need something like:
select * from data1
left join data2
on data2.beginning <= data1.date and data1.date <= data2.ending
Thanks.
I know the following looks horrible, but this is what I came up with. It is better to use the "sqldf" package (see below).
library(data.table)
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
name = c('id1','id2','id3','id4'))
data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
class = c(1,2,3,4))
# For each date in data1, find the first row of data2 whose range contains it
# (bounds inclusive, as in the question). ISO-formatted date strings sort in date order.
idx <- vapply(data1$date,
              function(d) which(data2$beginning <= d & data2$ending >= d)[1],
              integer(1), USE.NAMES = FALSE)
# Dates with no matching range get NA in the data2 columns.
result <- cbind(data1, data2[idx])
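The comparisons above operate on character strings. That is safe here only because every date is written as YYYY-MM-DD, where text order matches chronological order. The following is a minimal sketch, assuming the same data1 and data2, of converting the columns to Date first so that other input formats can be handled too. The "21/01/2010" value and the names d and in_range are hypothetical, added only for illustration.

```r
library(data.table)

data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1', 'id2', 'id3', 'id4'))
data2 <- data.table(beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                    ending = c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13'),
                    class = c(1, 2, 3, 4))

# Convert the character columns to Date in place.
data1[, date := as.Date(date)]
data2[, c("beginning", "ending") := lapply(.SD, as.Date),
      .SDcols = c("beginning", "ending")]

# A date written in another layout needs an explicit format string
# (hypothetical input, not part of the original data).
d <- as.Date("21/01/2010", format = "%d/%m/%Y")

# Inclusive range test against every row of data2.
in_range <- data2$beginning <= d & d <= data2$ending
print(in_range)
```

Once the columns are real dates, the range test no longer depends on how the input strings happened to be written.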
Using the sqldf package:
library(sqldf)
result = sqldf("select * from data1
left join data2
on data1.date between data2.beginning and data2.ending")
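For completeness, the same left join can probably be written without SQL, using data.table's non-equi join syntax in the `on` argument. This is only a sketch under a few assumptions: a data.table version that supports non-equi joins (1.9.8 or later), and date columns converted to IDate first rather than left as character strings. It is an alternative to, not a restatement of, the approaches above.

```r
library(data.table)

data1 <- data.table(date = as.IDate(c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09')),
                    name = c('id1', 'id2', 'id3', 'id4'))
data2 <- data.table(beginning = as.IDate(c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05')),
                    ending = as.IDate(c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13')),
                    class = c(1, 2, 3, 4))

# data2[data1, ...] keeps every row of data1 (a left join on data1).
# The on= conditions read: data2$beginning <= data1$date and data2$ending >= data1$date.
# The i. prefix picks columns from data1, the x. prefix picks the original columns of data2.
result <- data2[data1,
                on = .(beginning <= date, ending >= date),
                .(date = i.date,
                  beginning = x.beginning,
                  ending = x.ending,
                  name = i.name,
                  class = x.class)]
print(result)
```

Dates in data1 that fall in no range are kept, with NA in the beginning, ending, and class columns, which mirrors the behaviour of the SQL left join.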