Related questions (0)

Overlap join with start and end positions

Consider the following data.tables. The first defines a set of regions, with a start and end position for each group 'x':

library(data.table)

d1 <- data.table(x = letters[1:5], start = c(1,5,19,30, 7), end = c(3,11,22,39,25))
setkey(d1, x, start)

#    x start end
# 1: a     1   3
# 2: b     5  11
# 3: c    19  22
# 4: d    30  39
# 5: e     7  25

The second data set has the same grouping variable 'x' and a position 'pos' within each group:

d2 <- data.table(x = letters[c(1,1,2,2,3:5)], pos = c(2,3,3,12,20,52,10))
setkey(d2, x, pos)

#    x pos
# 1: a   2
# 2: a   3
# 3: b   3
# 4: b  12
# …
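
The question text is cut off above, so the exact desired output is not known. Assuming the goal is to find which 'pos' values in d2 fall inside a region of the same group 'x' in d1, one possible sketch uses a data.table non-equi join. The result name `hits` and the choice to drop non-matching positions (`nomatch = NULL`) are assumptions made for illustration, not part of the original post:

```r
library(data.table)

# Data as defined in the question
d1 <- data.table(x = letters[1:5],
                 start = c(1, 5, 19, 30, 7),
                 end   = c(3, 11, 22, 39, 25))
d2 <- data.table(x = letters[c(1, 1, 2, 2, 3:5)],
                 pos = c(2, 3, 3, 12, 20, 52, 10))

# Non-equi join: for each row of d2, look up the d1 region in the same
# group x that satisfies start <= pos <= end.
# nomatch = NULL drops positions that fall in no region (an assumption).
# In j, the x. prefix picks columns of d1 and the i. prefix columns of d2.
hits <- d1[d2,
           on = .(x, start <= pos, end >= pos),
           nomatch = NULL,
           .(x, pos = i.pos, start = x.start, end = x.end)]

print(hits)
```

With this data, the positions for groups 'b' and 'd' lie outside their regions and are dropped. `foverlaps()` from the same package is an alternative, provided d2 is first given a start and an end column that both equal 'pos'.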

merge join r data.table

36 votes, 4 answers, 6625 views

How to perform a join over date ranges using data.table?

How can I do the following (straightforward with sqldf) using data.table, and get exactly the same result:

library(data.table)

whatWasMeasured <- data.table(start=as.POSIXct(seq(1, 1000, 100),
    origin="1970-01-01 00:00:00"),
    end=as.POSIXct(seq(10, 1000, 100), origin="1970-01-01 00:00:00"),
    x=1:10,
    y=letters[1:10])

measurments <- data.table(time=as.POSIXct(seq(1, 2000, 1),
    origin="1970-01-01 00:00:00"),
    temp=runif(2000, 10, 100))

## Alternative short names for data.tables
dt1 <- whatWasMeasured
dt2 <- measurments

## Straightforward with sqldf    
library(sqldf)

sqldf("select * from measurments m, whatWasMeasured wwm
where m.time between wwm.start and wwm.end")
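
One way the BETWEEN condition might be expressed in data.table is a non-equi join. This is a minimal sketch, not necessarily the accepted answer: the result name `res` and the explicit column list in `j` are assumptions chosen to mirror the columns returned by `select *`, and row order may differ from what sqldf returns.

```r
library(data.table)

# Data as defined in the question (identifiers kept as spelled there)
whatWasMeasured <- data.table(
  start = as.POSIXct(seq(1, 1000, 100), origin = "1970-01-01 00:00:00"),
  end   = as.POSIXct(seq(10, 1000, 100), origin = "1970-01-01 00:00:00"),
  x = 1:10,
  y = letters[1:10])

measurments <- data.table(
  time = as.POSIXct(seq(1, 2000, 1), origin = "1970-01-01 00:00:00"),
  temp = runif(2000, 10, 100))

# Non-equi join: keep each measurement whose time lies in [start, end].
# nomatch = NULL drops intervals with no measurement, like an inner join.
# In a non-equi join the join columns would otherwise carry the interval
# bounds, so the columns are named explicitly: x.time is the measurement
# time; i.start, i.end and i.x come from whatWasMeasured.
res <- measurments[whatWasMeasured,
                   on = .(time >= start, time <= end),
                   nomatch = NULL,
                   .(time = x.time, temp,
                     start = i.start, end = i.end, x = i.x, y)]

print(nrow(res))
```

Each of the ten intervals spans ten one-second timestamps inclusive, so the join should return the same set of rows as the sqldf query, up to row ordering.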

r time-series data.table

20 votes, 1 answer, 4199 views
