Sam*_*amo 20 r time-series data.table
How can I do the following with data.table (it is straightforward with sqldf) and get exactly the same result:
library(data.table)
whatWasMeasured <- data.table(start=as.POSIXct(seq(1, 1000, 100),
                                  origin="1970-01-01 00:00:00"),
                              end=as.POSIXct(seq(10, 1000, 100), origin="1970-01-01 00:00:00"),
                              x=1:10,
                              y=letters[1:10])
measurments <- data.table(time=as.POSIXct(seq(1, 2000, 1),
                              origin="1970-01-01 00:00:00"),
                          temp=runif(2000, 10, 100))
## Alternative short names for data.tables
dt1 <- whatWasMeasured
dt2 <- measurments
## Straightforward with sqldf
library(sqldf)
sqldf("select * from measurments m, whatWasMeasured wwm
where m.time between wwm.start and wwm.end")
Aru*_*run 22
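You can use the foverlaps() function, which implements interval joins efficiently. In your case, we only need to add a dummy column to measurments so that each point in time becomes the interval [time, time].
Note 1: You should install the development version of data.table, v1.9.5, because a bug in foverlaps() has been fixed there. You can find installation instructions here.
Note 2: For convenience, I'll refer to whatWasMeasured as dt1 and measurments as dt2 here.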
require(data.table) ## 1.9.5+
dt2[, dummy := time]
setkey(dt1, start, end)
ans = foverlaps(dt2, dt1, by.x=c("time", "dummy"), nomatch=0L)[, dummy := NULL]
See ?foverlaps for details, and this post for a performance comparison.
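To make the dummy-column trick easier to follow, here is a minimal sketch on toy data. The table names `intervals` and `points`, their columns and their values are hypothetical and not part of the question. It only assumes a data.table version in which foverlaps() is available.

```r
library(data.table)

# Hypothetical toy data: two closed intervals and five point measurements.
intervals <- data.table(start = c(1L, 10L), end = c(3L, 12L), label = c("a", "b"))
points    <- data.table(time = c(2L, 3L, 5L, 10L, 13L),
                        value = c(0.1, 0.2, 0.3, 0.4, 0.5))

# foverlaps() expects an interval on both sides, so each point
# is represented as the zero-width interval [time, time].
points[, dummy := time]

# The lookup table must be keyed on its interval columns (start, then end).
setkey(intervals, start, end)

# Keep only the points that fall inside some interval, then drop the helper column.
res <- foverlaps(points, intervals, by.x = c("time", "dummy"), nomatch = 0L)[, dummy := NULL]
print(res)
```

Only the points at times 2, 3 and 10 fall inside an interval, so `res` should contain three rows. The endpoints are inclusive, which matches the `between` condition in the sqldf query.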