Aggregate regardless of the order of columns

vir*_*.dz 4 r dataframe

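I want to aggregate a data frame over two columns so that each combination of their values appears only once, regardless of the order in which the two values occur. The value column should be combined by an aggregate function such as max() or sum().

Data: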

itemID1  |itemID2  |value
---------|---------|-------
B0001    |B0001    |1
B0002    |B0001    |1
B0001    |B0002    |2
B0002    |B0002    |0

The result could be:

itemID1   |itemID2   |value
----------|----------|---------
B0001     |B0001     |1
B0001     |B0002     |3          #itemIDs could also be ordered in the other way
B0002     |B0002     |0

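So far I have implemented this in SQL in order to use it through the sqldf library, but sqldf does not support the WITH clause.

Is it possible to aggregate a data frame like this directly in R?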

Rui*_*das 8

Base R, but it duplicates the data, because I work on a copy to keep the original intact.

dat2 <- dat
# apply() over rows returns one column per input row, so transpose the result
dat2[1:2] <- t(apply(dat2[1:2], 1, sort))
aggregate(value ~ itemID1 + itemID2, dat2, sum)
#  itemID1 itemID2 value
#1   B0001   B0001     1
#2   B0001   B0002     3
#3   B0002   B0002     0

Now you can tidy up with rm(dat2).

Data.
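As a possible alternative, the row-wise sort can be replaced by the vectorised pmin() and pmax(), which put the smaller ID of each pair first without looping over rows. This is only a sketch and not part of the answer above. The names id1, id2, key, res_sum and res_max are made up for illustration, and the sample data is rebuilt with character columns, which is an assumption (the dput output in this answer stores them as factors).

```r
# Sample data matching the question. Character columns are an assumption;
# the dput() output in the answer stores the IDs as factors.
dat <- data.frame(
  itemID1 = c("B0001", "B0002", "B0001", "B0002"),
  itemID2 = c("B0001", "B0001", "B0002", "B0002"),
  value   = c(1L, 1L, 2L, 0L),
  stringsAsFactors = FALSE
)

# as.character() guards against factor columns, for which pmin()/pmax()
# are not meaningful.
id1 <- as.character(dat$itemID1)
id2 <- as.character(dat$itemID2)

# Element-wise ordering of each pair: smaller ID first, larger ID second.
key <- data.frame(
  itemID1 = pmin(id1, id2),
  itemID2 = pmax(id1, id2),
  value   = dat$value,
  stringsAsFactors = FALSE
)

# Aggregate over the order-independent pairs with either function
# mentioned in the question.
res_sum <- aggregate(value ~ itemID1 + itemID2, key, sum)
res_max <- aggregate(value ~ itemID1 + itemID2, key, max)
```

This still builds a second data frame, but it never modifies dat, so there is nothing to restore afterwards.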

dat <-
structure(list(itemID1 = structure(c(1L, 2L, 1L, 2L), .Label = c("B0001", 
"B0002"), class = "factor"), itemID2 = structure(c(1L, 1L, 2L, 
2L), .Label = c("B0001", "B0002"), class = "factor"), value = c(1L, 
1L, 2L, 0L)), .Names = c("itemID1", "itemID2", "value"), class = "data.frame", row.names = c(NA, 
-4L))