Is there a way in R to sum all items in a column using another column's values as the condition?

Pau*_*rez 2 r dataframe dplyr

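I have two data frames that contain string names, each with a specific condition and a numeric index value. What I want is to count how many names each condition has, using the index values as the reference.

The data frames are large, so I am only giving an example. I want to summarise all the values in NAME from a, matching on CONDITION, whose INDEX falls between INDEX-MIN and INDEX-MAX from b. Note that not all names in a will be captured or summarised in the final result. The result should look like c.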

a <- data.frame(c(1,1,2,3,3,3),c("A","B","C","D","E","F"),c(100,500,233,74,2750,10043))
colnames(a) <- c("CONDITION","NAME","INDEX")
b <- data.frame(c(1,2,3,3),c(1,75,2700,9872),c(600,245,3500,10500))
colnames(b) <- c("CONDITION","INDEX-MIN","INDEX-MAX")
c <- data.frame(c(1,2,3,3),c(1,75,2700,9872),c(600,245,3500,10500),c(2,1,1,1),c("A, B","C", "E", "F"))
colnames(c) <- c("CONDITION","INDEX-MIN","INDEX-MAX","NAME-COUNT","NAME")

akr*_*run 5

We can do this with a non-equi join in data.table:

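library(data.table)
# Non-equi join: for each row of b, match the rows of a with the same CONDITION
# and INDEX in [INDEX-MIN, INDEX-MAX); by = .EACHI aggregates once per row of b.
setDT(a)[b, .(NAME_COUNT = .N, NAME = toString(NAME)),
   on = .(CONDITION, INDEX >= `INDEX-MIN`, INDEX < `INDEX-MAX`), by = .EACHI]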

Output (the two INDEX columns hold the INDEX-MIN and INDEX-MAX bounds from b):

    CONDITION INDEX INDEX NAME_COUNT NAME
1:         1     1   600          2 A, B
2:         2    75   245          1    C
3:         3  2700  3500          1    E
4:         3  9872 10500          1    F
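Since the question is also tagged dplyr, the same non-equi join can be sketched there as well. This is not part of the answer above; it is a minimal sketch that assumes dplyr 1.1.0 or later (for `join_by()`), and the result name `res` is only illustrative. It uses the same half-open range as the data.table code (INDEX-MIN <= INDEX < INDEX-MAX).

```r
library(dplyr)

# Example data from the question, rebuilt with explicit column names
a <- data.frame(CONDITION = c(1, 1, 2, 3, 3, 3),
                NAME = c("A", "B", "C", "D", "E", "F"),
                INDEX = c(100, 500, 233, 74, 2750, 10043))
b <- data.frame(CONDITION = c(1, 2, 3, 3),
                `INDEX-MIN` = c(1, 75, 2700, 9872),
                `INDEX-MAX` = c(600, 245, 3500, 10500),
                check.names = FALSE)  # keep the hyphenated column names

# Join each range in b to the rows of a with the same CONDITION whose INDEX
# lies in [INDEX-MIN, INDEX-MAX), then count and collapse the names per range.
res <- b %>%
  inner_join(a, by = join_by(CONDITION,
                             `INDEX-MIN` <= INDEX,
                             `INDEX-MAX` > INDEX)) %>%
  group_by(CONDITION, `INDEX-MIN`, `INDEX-MAX`) %>%
  summarise(`NAME-COUNT` = n(), NAME = toString(NAME), .groups = "drop")

print(res)
```

Because this sketch uses `inner_join()`, a range in b with no matching name would be dropped from the result rather than reported with a count of 0.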