I have a data frame and want to perform some specific operations on it.
dat <- data.frame(Name = LETTERS[1:3],
Val1 = rnorm(3),
Val2 = rnorm(3))
# > dat
# Name Val1 Val2
# 1 A -1.055050 0.4499766
# 2 B 0.414994 -0.5999369
# 3 C -1.311374 -0.3967634
I want to do the following:
AB1 <- dat[dat$Name == "A", "Val1"] / dat[dat$Name == "B", "Val1"]
AC1 <- dat[dat$Name == "A", "Val1"] / dat[dat$Name == "C", "Val1"]
BC1 <- dat[dat$Name == "B", "Val1"] / dat[dat$Name == "C", "Val1"]
AB2 <- dat[dat$Name == "A", "Val2"] / dat[dat$Name == "B", "Val2"]
AC2 <- dat[dat$Name == "A", "Val2"] / dat[dat$Name == "C", "Val2"]
BC2 <- dat[dat$Name == "B", "Val2"] / dat[dat$Name == "C", "Val2"]
AB3 <- AB1 - AB2
AC3 <- AC1 - AC2
BC3 <- BC1 - BC2
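The above works fine, but I would like to do this in a smarter, more scalable way (e.g. with more Names and Vals), and to store the output in a data.frame so the values are easier to extract programmatically.
Finally, an even better solution would also handle the following data: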
dat2 <- data.frame(Region = rep(LETTERS[24:26], each=3),
Name = rep(LETTERS[1:3], 3),
Val1 = rep(rnorm(3), 3),
Val2 = rep(rnorm(3), 3))
# > dat2
# Region Name Val1 Val2
# 1 X A 2.1098629 0.5779044
# 2 X B 0.5937334 0.1410554
# 3 X C 0.2819461 -1.1769578
# 4 Y A 2.1098629 0.5779044
# 5 Y B 0.5937334 0.1410554
# 6 Y C 0.2819461 -1.1769578
# 7 Z A 2.1098629 0.5779044
# 8 Z B 0.5937334 0.1410554
# 9 Z C 0.2819461 -1.1769578
If the operations are the same as above but grouped by Region, the output would look something like:
# > output
# Region AB3 AC3 BC3
# 1 X ? ? ?
# 2 Y ? ? ?
# 3 Z ? ? ?
where ? is the actual result.
combn is the workhorse here; it can be used to generate the unique pairwise combinations:
combn(as.character(dat$Name), 2, simplify=FALSE)
#[[1]]
#[1] "A" "B"
#
#[[2]]
#[1] "A" "C"
#
#[[3]]
#[1] "B" "C"
You can also pass a function that is applied to each of these pairwise combinations:
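Since the pairs come back in a fixed order, the same list can be turned into labels such as AB, AC and BC for whatever is computed from it. A minimal sketch, assuming the names are unique; `nms`, `pairs` and `labels` are illustrative names, not part of the original answer:

```r
# Toy names matching the Name column in the question
nms <- c("A", "B", "C")

# One list element per unordered pair, in the order combn() emits them
pairs <- combn(nms, 2, simplify = FALSE)

# Collapse each pair into a single label
labels <- sapply(pairs, paste, collapse = "")
labels
#[1] "AB" "AC" "BC"
```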
# set.seed(1)
##for reproducibility
combn(
as.character(dat$Name),
2,
FUN=function(x) do.call(`-`, dat[dat$Name == x[1], -1] / dat[dat$Name == x[2], -1])
)
#[1] -8.2526585 2.6940335 0.1818427
AB3
#[1] -8.252659
AC3
#[1] 2.694033
BC3
#[1] 0.1818427
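The question also asks for the results in a data.frame, grouped by Region. One possible way to extend the combn idea is sketched below. It assumes each Name appears exactly once per Region and that the value columns are called Val1 and Val2; the helper `pair_diffs` is hypothetical, and the toy values are fixed numbers chosen so the result is easy to check by hand, not the random data from the question.

```r
# Hypothetical helper: for every pair of names, compute
# (Val1 ratio) - (Val2 ratio) and label it, e.g. "AB3".
# Assumes each Name occurs exactly once in d.
pair_diffs <- function(d) {
  nms <- as.character(d$Name)
  v1 <- setNames(d$Val1, nms)
  v2 <- setNames(d$Val2, nms)
  res <- combn(nms, 2, FUN = function(x) {
    unname(v1[x[1]] / v1[x[2]] - v2[x[1]] / v2[x[2]])
  })
  names(res) <- paste0(combn(nms, 2, FUN = paste, collapse = ""), "3")
  res
}

# Toy data with the same shape as dat2 (fixed values, not rnorm)
dat2 <- data.frame(Region = rep(c("X", "Y", "Z"), each = 3),
                   Name   = rep(c("A", "B", "C"), 3),
                   Val1   = rep(c(2, 4, 8), 3),
                   Val2   = rep(c(1, 4, 2), 3),
                   stringsAsFactors = FALSE)

# Apply the helper to each Region and bind the named vectors into rows
by_region <- lapply(split(dat2, dat2$Region), pair_diffs)
mat <- do.call(rbind, by_region)

output <- data.frame(Region = rownames(mat), mat, stringsAsFactors = FALSE)
rownames(output) <- NULL
output
```

With these toy values every Region row holds AB3 = 0.25, AC3 = -0.25 and BC3 = -1.5. Calling `pair_diffs(dat)` on the ungrouped data frame from the question should return the same three quantities as the combn call above, but as a named vector.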