Given the following matrix:
         A     B    C
[1,]  TRUE FALSE TRUE
[2,] FALSE  TRUE TRUE
[3,] FALSE FALSE TRUE
[4,] FALSE  TRUE TRUE
[5,] FALSE  TRUE TRUE
[6,]  TRUE  TRUE TRUE
m <- structure(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE,
FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE), .Dim = c(6L,
3L), .Dimnames = list(NULL, c("A", "B", "C")))
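How can we efficiently extract, for each row, the first column that holds a TRUE value? Of course, we could use apply on each row and then take min(which(...)).
This is the desired output: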
[1] A B C B B A
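For reference, the row-wise baseline mentioned in the question could look roughly like the sketch below. The function name `first_true_apply` is hypothetical, and the sketch uses `which(r)[1]` instead of `min(which(r))` so that a row without any TRUE gives NA rather than a warning and `Inf`.

```r
# Same matrix as in the question, rebuilt with matrix() for readability.
m <- matrix(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE,
              FALSE, TRUE, FALSE, TRUE, TRUE, TRUE,
              TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
            nrow = 6, dimnames = list(NULL, c("A", "B", "C")))

# Hypothetical helper: row-wise baseline using apply().
first_true_apply <- function(x) {
  # For each row, take the position of the first TRUE (NA if there is none).
  idx <- apply(x, 1, function(r) unname(which(r)[1]))
  colnames(x)[idx]
}

first_true_apply(m)
#[1] "A" "B" "C" "B" "B" "A"
```

This gives the desired result but calls an R function once per row, which is the overhead the question is trying to avoid.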
This post may look like a duplicate of my question, but it is not:
We can use max.col
colnames(m)[max.col(m, "first")]
#[1] "A" "B" "C" "B" "B" "A"
If a row contains no TRUE at all, we can change its result to NA (if needed)
colnames(m)[max.col(m, "first")*NA^!rowSums(m)]
Or with ifelse
colnames(m)[ifelse(rowSums(m)==0, NA, max.col(m, "first"))]
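To see why the NA guard matters, here is a small sketch. It assumes a hypothetical matrix `m2`, which is `m` with one all-FALSE row appended. On a logical matrix, `max.col` treats TRUE as 1 and FALSE as 0, so an all-FALSE row is a tie across all columns, and `"first"` resolves the tie to column 1.

```r
m <- matrix(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE,
              FALSE, TRUE, FALSE, TRUE, TRUE, TRUE,
              TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
            nrow = 6, dimnames = list(NULL, c("A", "B", "C")))

# Hypothetical test case: append a 7th row that has no TRUE at all.
m2 <- rbind(m, c(FALSE, FALSE, FALSE))

# Without the guard, the all-FALSE row is reported as the first column.
naive <- colnames(m2)[max.col(m2, "first")]
naive
#[1] "A" "B" "C" "B" "B" "A" "A"

# With the guard, rows whose sum is 0 are mapped to NA instead.
safe <- colnames(m2)[ifelse(rowSums(m2) == 0, NA, max.col(m2, "first"))]
safe
#[1] "A" "B" "C" "B" "B" "A" NA
```

The `NA^!rowSums(m)` variant relies on `NA^TRUE` being NA and `NA^FALSE` being 1; the `ifelse` form states the same condition more explicitly.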
Another approach, using which on the logical matrix:
colnames(m)[aggregate(col~row, data=which(m, arr.ind = TRUE), FUN=min)$col]
#[1] "A" "B" "C" "B" "B" "A"
We get the (row, column) indices of the TRUE values, then find, for each row, the smallest column index at which one occurs.
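The two steps can be spelled out as in the sketch below. The wrapper `first_true_which` is a hypothetical name, and the explicit `as.data.frame()` call and the NA-filled result vector are additions of this sketch. The result vector is there because `aggregate` only returns rows that contain at least one TRUE, so indexing by `agg$row` keeps the output aligned with the rows of the input.

```r
m <- matrix(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE,
              FALSE, TRUE, FALSE, TRUE, TRUE, TRUE,
              TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
            nrow = 6, dimnames = list(NULL, c("A", "B", "C")))

# Hypothetical helper built on the which()/aggregate() idea.
first_true_which <- function(x) {
  res <- rep(NA_character_, nrow(x))
  # Step 1: one (row, col) pair per TRUE cell.
  idx <- as.data.frame(which(x, arr.ind = TRUE))
  if (nrow(idx) == 0) return(res)  # no TRUE anywhere
  # Step 2: smallest column index within each row.
  agg <- aggregate(col ~ row, data = idx, FUN = min)
  # Rows absent from agg (no TRUE) stay NA.
  res[agg$row] <- colnames(x)[agg$col]
  res
}

first_true_which(m)
#[1] "A" "B" "C" "B" "B" "A"
```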
Benchmark
library(microbenchmark)
n <- matrix(FALSE, nrow=1000, ncol=500) # couldn't afford a bigger one...
n <- t(apply(n, 1, function(rg) {rg[sample(1:500, 1, replace=TRUE)] <- TRUE ; rg}))
colnames(n) <- paste0("name", 1:500)
akrun <- function(n){colnames(n)[max.col(n, "first")]}
cath <- function(n){colnames(n)[aggregate(col~row, data=which(n, arr.ind = TRUE), FUN=min)$col]}
all(akrun(n)==cath(n))
#[1] TRUE
microbenchmark(akrun(n), cath(n))
#     expr       min        lq      mean    median        uq      max neval cld
# akrun(n)  6.985716  7.233116  8.231404  7.525513  8.842927 31.23469   100   a
#  cath(n) 18.416079 18.811473 19.586298 19.272398 20.262169 22.42786   100   b
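For anyone who wants to rerun the comparison without the microbenchmark package, a rough base-R sketch follows. The seed, the matrix-indexing construction of `n`, the explicit `as.data.frame()` inside `cath`, and the repetition count are all assumptions of this sketch rather than part of the benchmark above, and the timings will differ by machine.

```r
set.seed(1)  # arbitrary seed, only so the test matrix is reproducible
nr <- 1000
nc <- 500

# One TRUE per row at a random column, set through matrix indexing
# instead of apply() + t().
n <- matrix(FALSE, nrow = nr, ncol = nc,
            dimnames = list(NULL, paste0("name", seq_len(nc))))
n[cbind(seq_len(nr), sample(nc, nr, replace = TRUE))] <- TRUE

akrun <- function(n) colnames(n)[max.col(n, "first")]
cath <- function(n) {
  idx <- as.data.frame(which(n, arr.ind = TRUE))
  colnames(n)[aggregate(col ~ row, data = idx, FUN = min)$col]
}

# Both approaches should agree before any timing is compared.
stopifnot(identical(akrun(n), cath(n)))

# Coarse timings over 10 repetitions each.
system.time(for (i in 1:10) akrun(n))
system.time(for (i in 1:10) cath(n))
```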