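I have two data frames in R. Data1 has two columns, id and dates, and Data2 has three columns, id, dates and level. I want to set level in Data1 based on the level and dates columns of Data2.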
Data1 = data.frame(id = c(1,1,1), dates = c("2014-06","2016-02","2016-05"))
id dates
1 2014-06
1 2016-02
1 2016-05
Data2 = data.frame(id = c(1,1,1), dates = c("2015-07","2016-04","2016-07"), level=c(3,4,5))
id dates level
1 2015-07 3
1 2016-04 4
1 2016-07 5
So the resulting data frame should be:
id dates level
1 2014-06 NULL
1 2016-02 3
1 2016-05 4
You can do this with a rolling join from the data.table package, with the dates columns converted to Date class (see the note at the end of this post):
library(data.table)
setDT(Data1, key = c('id','dates'))
setDT(Data2, key = c('id','dates'))
Data1[Data2, lev := level, roll = -Inf, rollends = c(TRUE,FALSE)][]
This gives:
> Data1
id dates lev
1: 1 2014-06-01 NA
2: 1 2016-02-01 3
3: 1 2016-05-01 4
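Explanation:
Convert the data frames to data tables with setDT and set the keys to the columns needed for the join. Join Data1 with Data2 and create lev in Data1 by reference with lev := level. With roll = -Inf you roll backwards, so each Data2 row is matched to the next Data1 date on or after it. With rollends = c(TRUE,FALSE), Data2 dates that fall after the last Data1 date are left unmatched.
Setting the keys beforehand is not necessary. You can also do: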
setDT(Data1)
setDT(Data2)
Data1[Data2, on = c('id','dates'), lev := level, roll = -Inf, rollends = c(TRUE,FALSE)][]
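As a sketch of the same lookup written from the other direction: joining Data2 to Data1 with roll = Inf carries the last Data2 observation forward to each Data1 date and returns a new table instead of updating Data1 by reference. The names D1, D2 and res are mine, chosen so the example does not touch the objects above. It reproduces the result for this data, but it is not a drop-in equivalent in general: if several Data1 dates fall between two Data2 dates, this version fills all of them, whereas the update join above only fills the first one.

```r
library(data.table)

# Copies of the example data, with dates already converted to Date class
D1 <- data.table(id = c(1, 1, 1),
                 dates = as.Date(c("2014-06-01", "2016-02-01", "2016-05-01")))
D2 <- data.table(id = c(1, 1, 1),
                 dates = as.Date(c("2015-07-01", "2016-04-01", "2016-07-01")),
                 level = c(3, 4, 5))

# For each row of D1, take the most recent D2 row on or before that date
# (last observation carried forward); dates before the first D2 date get NA
res <- D2[D1, on = c("id", "dates"), roll = Inf]
res
```

Here res has one row per row of D1, with level taken from D2.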
Data used:
Data1 = data.frame(id = c(1,1,1), dates = c("2014-06","2016-02","2016-05"))
Data2 = data.frame(id = c(1,1,1), dates = c("2015-07","2016-04","2016-07"), level=c(3,4,5))
Data1$dates <- as.Date(paste0(Data1$dates,'-01'))
Data2$dates <- as.Date(paste0(Data2$dates,'-01'))
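Note: I converted the dates columns to Date class by appending the first day of the month to each year-month value. This is necessary for the rolling join to work properly.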
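As a small illustration of that conversion (the names ym and d are mine): pasting "-01" onto a year-month string gives a full ISO date that as.Date can parse, and the resulting Date values compare chronologically, which is what the rolling join relies on.

```r
# Year-month strings as in the question
ym <- c("2014-06", "2016-02", "2016-05")

# Append the first day of the month so the strings become full ISO dates
d <- as.Date(paste0(ym, "-01"))

class(d)      # "Date"
format(d)     # "2014-06-01" "2016-02-01" "2016-05-01"
d[1] < d[2]   # TRUE: Date values compare chronologically
```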