Art*_*Sbr 0 merge r data.table
I have the following two tables:
df <- data.table(id = c("01","02","03"), tariff = c("1A","1B","1A"), summer = c(0,0,1), expenditure = c(150,200,90))
id tariff summer expenditure
1: 01 1A 0 150
2: 02 1B 0 200
3: 03 1A 1 90
catalogue <- data.table(tariff = c("1A","1A","1A","1A","1B","1B","1B","1B"), summer = c(0,0,1,1,0,0,1,1),
lb_quant = c(0,50,0,80,0,80,0,100), ub_quant = c(50,Inf,80,Inf,80,Inf,100,Inf), case = letters[1:8])
tariff summer lb_quant ub_quant case
1: 1A 0 0 50 a
2: 1A 0 50 Inf b
3: 1A 1 0 80 c
4: 1A 1 80 Inf d
5: 1B 0 0 80 e
6: 1B 0 80 Inf f
7: 1B 1 0 100 g
8: 1B 1 100 Inf h
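I want to merge df and catalogue on tariff, summer, and expenditure. However, because expenditure is numeric and has to fall inside a range rather than equal a key, a plain merge will not work.
I am looking for a vectorized way to join the two tables when both of the following hold:
tariff and summer match
catalogue$lb_quant < df$expenditure <= catalogue$ub_quant
For example, df[id == "01"] should match the second row of catalogue, because tariff == "1A", summer == 0, and expenditure falls within (50, Inf]. So case = b is assigned to df[id == "01"].
The real df is large, and I want to avoid loops. Is there a vectorized way to do this in R or Python?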
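To make the matching rule concrete, here is a minimal sketch that applies it by hand to a single row (id "01"). The helper list `one` is a hypothetical name introduced only for this illustration; the catalogue table is the one defined in the question.

```r
library(data.table)

catalogue <- data.table(
  tariff   = c("1A","1A","1A","1A","1B","1B","1B","1B"),
  summer   = c(0,0,1,1,0,0,1,1),
  lb_quant = c(0,50,0,80,0,80,0,100),
  ub_quant = c(50,Inf,80,Inf,80,Inf,100,Inf),
  case     = letters[1:8])

# Hypothetical stand-in for the df row with id "01"
one <- list(tariff = "1A", summer = 0, expenditure = 150)

# Equality on tariff and summer, plus lb_quant < expenditure <= ub_quant
hit <- catalogue[tariff == one$tariff &
                 summer == one$summer &
                 lb_quant < one$expenditure &
                 one$expenditure <= ub_quant]

hit$case
# [1] "b"
```

Doing this row by row is exactly the loop the question wants to avoid; it only spells out the condition a vectorized join has to reproduce.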
In this case you can also use a non-equi update join.
See the following one-liner (line breaks added for readability):
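df[ catalogue,
    `:=`( lb_quant = i.lb_quant,          # i.* refers to columns of catalogue
          ub_quant = i.ub_quant,
          case     = i.case ),
    on = .( tariff,
            summer,
            expenditure >  lb_quant,      # strictly above the lower bound
            expenditure <= ub_quant ) ][] # up to and including the upper bound, as the question asks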
Output:
id tariff summer expenditure lb_quant ub_quant case
1: 01 1A 0 150 50 Inf b
2: 02 1B 0 200 80 Inf f
3: 03 1A 1 90 80 Inf d
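If you would rather not modify df by reference, the same non-equi conditions can be used in an ordinary join that returns a new table. The following is a sketch under that assumption, not part of the original answer; the result name `res` is hypothetical, and the interval rule is the one stated in the question (lb_quant < expenditure <= ub_quant).

```r
library(data.table)

df <- data.table(id = c("01","02","03"), tariff = c("1A","1B","1A"),
                 summer = c(0,0,1), expenditure = c(150,200,90))
catalogue <- data.table(
  tariff   = c("1A","1A","1A","1A","1B","1B","1B","1B"),
  summer   = c(0,0,1,1,0,0,1,1),
  lb_quant = c(0,50,0,80,0,80,0,100),
  ub_quant = c(50,Inf,80,Inf,80,Inf,100,Inf),
  case     = letters[1:8])

# For every row of df (i), look up the matching catalogue row (x).
# In `on`, the left-hand name is a catalogue column, the right-hand one a df column.
res <- catalogue[df,
  .(id,
    tariff      = i.tariff,
    summer      = i.summer,
    expenditure = i.expenditure,
    case        = x.case),
  on = .(tariff, summer,
         lb_quant <  expenditure,
         ub_quant >= expenditure)]

res$case
# [1] "b" "f" "d"
```

Rows of df without a matching interval are kept with case set to NA by default; passing nomatch = NULL drops them instead. df itself is left unchanged.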