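I have two data frames that contain string names, each with a specific condition and a numeric index value. What I want is to count how many names fall under each condition, using the index value as the reference.
The data frames are large, so I am only giving an example. I want to summarize all values in NAME from "a", taking into account CONDITION and the range between INDEX-MIN and INDEX-MAX from "b". It is important to note that not all names in "a" will be captured or summarized in the final result. The result should look like "c":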
a <- data.frame(c(1,1,2,3,3,3),c("A","B","C","D","E","F"),c(100,500,233,74,2750,10043))
colnames(a) <- c("CONDITION","NAME","INDEX")
b <- data.frame(c(1,2,3,3),c(1,75,2700,9872),c(600,245,3500,10500))
colnames(b) <- c("CONDITION","INDEX-MIN","INDEX-MAX")
c <- data.frame(c(1,2,3,3),c(1,75,2700,9872),c(600,245,3500,10500),c(2,1,1,1),c("A, B","C", "E", "F"))
colnames(c) <- c("CONDITION","INDEX-MIN","INDEX-MAX","NAME-COUNT","NAME")
We can do this with a non-equi join in data.table:
library(data.table)
setDT(a)[b, .(NAME_COUNT = .N, NAME = toString(NAME)),
on = .(CONDITION, INDEX >=`INDEX-MIN`, INDEX < `INDEX-MAX`), by = .EACHI]
Output:
CONDITION INDEX INDEX NAME_COUNT NAME
1: 1 1 600 2 A, B
2: 2 75 245 1 C
3: 3 2700 3500 1 E
4: 3 9872 10500 1 F
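To make the matching rule of the non-equi join explicit, here is a minimal base R sketch of the same logic, without data.table. It is only an illustration, not part of the original answer. As an assumption for convenience, the columns of b are named INDEX_MIN and INDEX_MAX (underscores instead of hyphens) so they can be used without backticks, and the result is stored in a hypothetical object called res. The interval is treated as half-open, [INDEX_MIN, INDEX_MAX), to mirror the >= and < conditions used in the join above.

```r
# Example data from the question (b's columns renamed with underscores: an assumption)
a <- data.frame(CONDITION = c(1, 1, 2, 3, 3, 3),
                NAME = c("A", "B", "C", "D", "E", "F"),
                INDEX = c(100, 500, 233, 74, 2750, 10043))
b <- data.frame(CONDITION = c(1, 2, 3, 3),
                INDEX_MIN = c(1, 75, 2700, 9872),
                INDEX_MAX = c(600, 245, 3500, 10500))

# For each row of b, collect the names in a that share its CONDITION
# and whose INDEX lies in the half-open range [INDEX_MIN, INDEX_MAX)
matched <- lapply(seq_len(nrow(b)), function(i) {
  a$NAME[a$CONDITION == b$CONDITION[i] &
           a$INDEX >= b$INDEX_MIN[i] &
           a$INDEX < b$INDEX_MAX[i]]
})

# Attach the count and the comma-separated names to a copy of b
res <- b
res$NAME_COUNT <- lengths(matched)
res$NAME <- vapply(matched, toString, character(1))
res
```

This loops over the rows of b, so it is likely to be much slower than the data.table join on large data frames; it is mainly useful for checking the result on a small sample.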