How do I merge data frames of unequal length in R by time with a buffer interval?

ltk*_*isp 5 time merge r intervals dataframe

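I am trying to merge two data frames on a time column they have in common. However, the recorded times may differ between the two. I would like to merge them by time, but with a buffer interval of 30 minutes.

Conceptually, the data frames are set up like this: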

Data_cam <- data.frame(Start_haul=c(("31-10-2015  07:13:00"),("31-10-2015  22:40:00"),("01-11-2015  06:48:00"),("01-11-2015  16:13:00")), 
              VesselID=c('XBBX','XBBX','XAAX','XAAX'),
              Species=("TOR"), Discard=c(0.28,0.96,2.92,0)) 

Data_sif <- data.frame(Start_haul=c(("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  23:05:00"),("31-10-2015  23:05:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  16:11:00")),             VesselID=c('XBBX','XBBX','XBBX','XBBX','XBBX','XAAX','XAAX','XAAX','XAAX'),Species=("TOR"), Size_class=c("1","2","3","4","5","1","2","4","5"),  Landing_kg=c(10.5,20.5,5.6,400,2,120,250,10.3,2.1))

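This means that the first three rows in Data_sif match the first row in Data_cam, and I would like to add the "Discard" value from the first row of Data_cam to those first three rows of Data_sif. Likewise, rows 4 and 5 in Data_sif match the second row in Data_cam, so I want to add that row's "Discard" value there, and so on for all rows. The "Discard" value should be repeated for every "Size_class" row that shares the common timestamp.

The desired output looks like this: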

Data_combined <- data.frame(Start_haul=c(("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  23:05:00"),("31-10-2015  23:05:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  16:11:00")),             VesselID=c('XBBX','XBBX','XBBX','XBBX','XBBX','XAAX','XAAX','XAAX','XAAX'),Species=("TOR"), Size_class=c("1","2","3","4","5","1","2","4","5"),  Landing_kg=c(10.5,20.5,5.6,400,2,120,250,10.3,2.1), 
Discard=c(0.28,0.28,0.28,0.96,0.96,2.92,2.92,2.92,0))

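In the final implementation I want to add more columns, including position data, but for simplicity I would like to start by merging the Discard column.

I have tried the approaches from older posts, but could not get them to work on my data.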

Val*_*Val 1

Here is a solution using lubridate and dplyr. It is a bit cumbersome, but it works:

library(lubridate)
library(dplyr)


Data_cam <- data.frame(Start_haul=c(("31-10-2015  07:13:00"),("31-10-2015  22:40:00"),("01-11-2015  06:48:00"),("01-11-2015  16:13:00")), 
                       VesselID=c('XBBX','XBBX','XAAX','XAAX'),
                       Species=("TOR"), Discard=c(0.28,0.96,2.92,0)) 

Data_sif <- data.frame(Start_haul=c(("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  07:05:00"),("31-10-2015  23:05:00"),("31-10-2015  23:05:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  06:28:00"),("01-11-2015  16:11:00")),
                   VesselID=c('XBBX','XBBX','XBBX','XBBX','XBBX','XAAX','XAAX','XAAX','XAAX'),Species=("TOR"), Size_class=c("1","2","3","4","5","1","2","4","5"),
                   Landing_kg=c(10.5,20.5,5.6,400,2,120,250,10.3,2.1))



# Pair every landing row with every camera haul of the same vessel,
# then keep only the pairs whose start times lie within 30 minutes.
Data_sif %>%
  left_join(., Data_cam, by = "VesselID", suffix = c('_sif', '_cam')) %>%
  mutate(buff1 = dmy_hms(Start_haul_cam) - minutes(30)) %>%
  mutate(buff2 = dmy_hms(Start_haul_cam) + minutes(30)) %>%
  filter(dmy_hms(Start_haul_sif) >= buff1 & dmy_hms(Start_haul_sif) <= buff2) %>%
  select(-contains('_cam')) %>% select(-contains('buff'))


# Start_haul_sif VesselID Species_sif Size_class Landing_kg Discard
# 1 31-10-2015  07:05:00     XBBX         TOR          1       10.5    0.28
# 2 31-10-2015  07:05:00     XBBX         TOR          2       20.5    0.28
# 3 31-10-2015  07:05:00     XBBX         TOR          3        5.6    0.28
# 4 31-10-2015  23:05:00     XBBX         TOR          4      400.0    0.96
# 5 31-10-2015  23:05:00     XBBX         TOR          5        2.0    0.96
# 6 01-11-2015  06:28:00     XAAX         TOR          1      120.0    2.92
# 7 01-11-2015  06:28:00     XAAX         TOR          2      250.0    2.92
# 8 01-11-2015  06:28:00     XAAX         TOR          4       10.3    2.92
# 9 01-11-2015  16:11:00     XAAX         TOR          5        2.1    0.00
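The two-sided filter is equivalent to asking whether the absolute time difference is at most 30 minutes. A minimal base R sketch of just that check, in case it helps to test the buffer logic on its own. The helper names `parse_haul` and `within_buffer` are made up for illustration, and the timestamps are assumed to be UTC:

```r
# Hypothetical helper: parse "dd-mm-YYYY HH:MM:SS" strings, collapsing the
# double space in the sample data first. The time zone is an assumption.
parse_haul <- function(x) {
  as.POSIXct(gsub("[[:space:]]+", " ", as.character(x)),
             format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
}

# Hypothetical helper: TRUE where the two times are within buffer_min minutes.
within_buffer <- function(t_sif, t_cam, buffer_min = 30) {
  abs(as.numeric(difftime(t_sif, t_cam, units = "mins"))) <= buffer_min
}

t_sif <- parse_haul("31-10-2015  07:05:00")
t_cam <- parse_haul(c("31-10-2015  07:13:00", "31-10-2015  22:40:00"))

# One landing time compared against two camera times of the same vessel.
hits <- within_buffer(t_sif, t_cam)
```

Here only the first camera haul (8 minutes away) falls inside the buffer; the second is more than 15 hours away.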

Edit:

Or, slightly leaner:

Data_sif %>%
  left_join(., Data_cam, by = "VesselID",suffix=c('_sif','_cam')) %>%
  filter(dmy_hms(Start_haul_sif) >= dmy_hms(Start_haul_cam) - minutes(30) & 
         dmy_hms(Start_haul_sif) <= dmy_hms(Start_haul_cam) + minutes(30)) %>% 
  select(-contains('_cam'))
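One caveat with the join-then-filter approach: a Data_sif row with no camera haul inside the buffer is dropped by `filter()`, and a row with two camera hauls inside the buffer would appear twice. If that matters, a possible variant is to look up the nearest camera haul per row and leave `NA` where nothing is close enough. This is only a sketch in base R, not the method above; `parse_haul` and `match_discard` are hypothetical names, and UTC is assumed:

```r
Data_cam <- data.frame(
  Start_haul = c("31-10-2015  07:13:00", "31-10-2015  22:40:00",
                 "01-11-2015  06:48:00", "01-11-2015  16:13:00"),
  VesselID = c("XBBX", "XBBX", "XAAX", "XAAX"),
  Species = "TOR", Discard = c(0.28, 0.96, 2.92, 0))

Data_sif <- data.frame(
  Start_haul = c(rep("31-10-2015  07:05:00", 3), rep("31-10-2015  23:05:00", 2),
                 rep("01-11-2015  06:28:00", 3), "01-11-2015  16:11:00"),
  VesselID = c(rep("XBBX", 5), rep("XAAX", 4)),
  Species = "TOR", Size_class = c("1", "2", "3", "4", "5", "1", "2", "4", "5"),
  Landing_kg = c(10.5, 20.5, 5.6, 400, 2, 120, 250, 10.3, 2.1))

# Hypothetical helper: parse the timestamps (time zone is an assumption).
parse_haul <- function(x) {
  as.POSIXct(gsub("[[:space:]]+", " ", as.character(x)),
             format = "%d-%m-%Y %H:%M:%S", tz = "UTC")
}

# Hypothetical helper: for each landing row, return the Discard value of the
# nearest camera haul of the same vessel, or NA if none is within the buffer.
match_discard <- function(sif, cam, buffer_min = 30) {
  t_sif <- parse_haul(sif$Start_haul)
  t_cam <- parse_haul(cam$Start_haul)
  vapply(seq_len(nrow(sif)), function(i) {
    gap <- abs(as.numeric(difftime(t_cam, t_sif[i], units = "mins")))
    # Hauls from other vessels can never match.
    gap[as.character(cam$VesselID) != as.character(sif$VesselID[i])] <- Inf
    j <- which.min(gap)
    if (length(j) == 1 && gap[j] <= buffer_min) cam$Discard[j] else NA_real_
  }, numeric(1))
}

Data_combined <- Data_sif
Data_combined$Discard <- match_discard(Data_sif, Data_cam)
```

On the sample data this keeps all nine rows of Data_sif. The loop is quadratic in the number of rows, so for large tables a dedicated interval or rolling join would likely be a better fit.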