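I can't find a duplicate of this question right now.
My problem is as follows:
I have two data.tables. One has two columns (featurea, count), the other has three columns (featureb, featurec, count). I'd like to multiply (?) them so that I get a new data.table covering every possible combination. The catch is that the feature columns don't match, so a merge-based solution may not work.
An MRE follows: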
library(data.table)
# two columns
DT1 <- data.table(featurea =c("type1","type2"), count = c(2,3))
# featurea count
#1: type1 2
#2: type2 3
# three columns
DT2 <- data.table(origin =c("house","park","park"), color =c("red","blue","red"),count =c(2,1,2))
# origin color count
#1: house red 2
#2: park blue 1
#3: park red 2
In this case, my expected result data.table is as follows:
> DT3
origin color featurea total
1: house red type1 4
2: house red type2 6
3: park blue type1 2
4: park blue type2 3
5: park red type1 4
6: park red type2 6
Please test this on larger data; I'm not sure how well optimized it is:
DT2[, .(featurea = DT1[["featurea"]],
count = count * DT1[["count"]]), by = .(origin, color)]
# origin color featurea count
#1: house red type1 4
#2: house red type2 6
#3: park blue type1 2
#4: park blue type2 3
#5: park red type1 4
#6: park red type2 6
If DT1 has the smaller number of groups, it may be more efficient to swap the roles of the two tables:
DT1[, c(DT2[, .(origin, color)],
.(count = count * DT2[["count"]])), by = featurea]
# featurea origin color count
#1: type1 house red 4
#2: type1 park blue 2
#3: type1 park red 4
#4: type2 house red 6
#5: type2 park blue 3
#6: type2 park red 6
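To act on the suggestion to test on larger data, the following is a minimal sketch that runs both variants on synthetic tables and checks that they return the same rows. The table names `big1` and `big2`, their sizes, and the value ranges are assumptions made up for illustration, not taken from the question. Note that the grouped approach relies on each (origin, color) pair appearing only once, so the synthetic data is built with `CJ()` to guarantee that.

```r
library(data.table)

set.seed(42)
# Hypothetical sizes; adjust them to resemble the real data.
big1 <- data.table(featurea = paste0("type", 1:50),
                   count = sample(1:5, 50, replace = TRUE))
# CJ() yields unique (origin, color) pairs, one row per group.
big2 <- CJ(origin = paste0("place", 1:100), color = paste0("color", 1:20))
big2[, count := sample(1:5, .N, replace = TRUE)]

# Variant 1: group by the rows of the larger table.
time_a <- system.time(
  res_a <- big2[, .(featurea = big1[["featurea"]],
                    count = count * big1[["count"]]), by = .(origin, color)]
)
# Variant 2: group by the rows of the smaller table.
time_b <- system.time(
  res_b <- big1[, c(big2[, .(origin, color)],
                    .(count = count * big2[["count"]])), by = featurea]
)

# Align column order, then compare the two results as sets of rows.
setcolorder(res_b, names(res_a))
same_rows <- fsetequal(res_a, res_b)

print(same_rows)
print(c(group_by_big2 = time_a[["elapsed"]], group_by_big1 = time_b[["elapsed"]]))
```

Timings depend on the machine and on the shape of the data, so the comparison is only meaningful on inputs that resemble the real ones.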
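Here is one way. First, I expanded the rows of DT2 with expandRows() from the splitstackshape package. Because I specified count = nrow(DT1) (which is 2 here) and count.is.col = FALSE, each row is repeated twice. Then I did the multiplication and created a new column called total. At the same time, I created a new column for featurea. Finally, I dropped count.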
library(data.table)
library(splitstackshape)
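# rep() spells out the recycling of featurea; recent data.table versions
# refuse to recycle a shorter right-hand side implicitly in `:=`
expandRows(DT2, count = nrow(DT1), count.is.col = FALSE)[,
`:=` (total = count * DT1[, count], featurea = rep(DT1[, featurea], length.out = .N))][, count := NULL]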
Edit
If you don't want to add another package, you can try David's idea from the comments.
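# each = keeps each row's copies adjacent; rep() recycles featurea explicitly
DT2[rep(1:.N, each = nrow(DT1))][,
`:=`(total = count * DT1$count, count = NULL,
featurea = rep(DT1$featurea, length.out = .N))][]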
# origin color total featurea
#1: house red 4 type1
#2: house red 6 type2
#3: park blue 2 type1
#4: park blue 3 type2
#5: park red 4 type1
#6: park red 6 type2
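For comparison, a cross join through a constant dummy key is another package-free route. This is a different technique from the ones above, shown only as a sketch: the helper column `k` and the suffixes `.2` and `.1` are names invented for this example. It suggests that a merge can work even though the feature columns don't match, as long as both tables share an artificial key and `allow.cartesian = TRUE` is set.

```r
library(data.table)

DT1 <- data.table(featurea = c("type1", "type2"), count = c(2, 3))
DT2 <- data.table(origin = c("house", "park", "park"),
                  color = c("red", "blue", "red"), count = c(2, 1, 2))

# Work on copies so the original tables are not modified by reference.
a <- copy(DT1)[, k := 1L]   # k is a hypothetical dummy key
b <- copy(DT2)[, k := 1L]

# Every row of b matches every row of a on k, which gives the full cross product.
DT3 <- merge(b, a, by = "k", allow.cartesian = TRUE, suffixes = c(".2", ".1"))

# Multiply the two counts, then drop the helper columns.
DT3[, total := count.2 * count.1]
DT3[, c("k", "count.2", "count.1") := NULL]

print(DT3)
```

The grouped versions above avoid materializing the extra key and the two intermediate count columns, so they may use less memory on large inputs.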