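I have a large character matrix that I want to convert into a matrix of strings without looping over each row individually, so I am wondering whether there is a clever way to do this quickly. I tried paste(data[, 4:((i*2)+3)], collapse=""), but that combines all rows into one very long string. What I need is a result with the same number of rows as the original matrix, where each row has a single column holding the string built from the characters in that row. In other words, I want to convert the matrix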
a=
{
D E R P G K I
S K P A S L N
S K P A S L N
S K P A S L N
S K P A S L N
}
into
a=
{
DERPGKI
SKPASLN
SKPASLN
SKPASLN
SKPASLN
}
apply is a loop, but it should still be pretty efficient in this case. Its usage would be:
apply(x, 1, paste, collapse = "")
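For context on why the paste(..., collapse = "") call in the question produced one long string: paste() treats a matrix as a plain vector in column-major order, so collapse joins every cell. The sketch below uses a small made-up matrix `m` (an assumption for illustration, not data from the question) to contrast that with the row-wise apply call.

```r
# Hypothetical 2 x 3 character matrix, for illustration only.
# Filled column by column, so row 1 is a b c and row 2 is d e f.
m <- matrix(c("a", "d", "b", "e", "c", "f"), nrow = 2)

# paste() sees the matrix as one vector in column-major order,
# so collapse = "" glues every cell into a single string.
one_string <- paste(m, collapse = "")

# apply() with MARGIN = 1 calls paste(row, collapse = "") once per row,
# which gives one string per row.
per_row <- apply(m, 1, paste, collapse = "")

print(one_string)
print(per_row)
```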
Alternatively, you can try:
do.call(paste0, data.frame(x))
This may actually be faster...
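The reason the do.call form works at all is that a data frame is a list of columns, so do.call hands each column to paste0 as a separate argument, and paste0 concatenates its arguments element by element. A minimal sketch, again assuming a small made-up matrix `m` rather than the question's data:

```r
# Hypothetical 2 x 3 character matrix, for illustration only.
m <- matrix(c("a", "d", "b", "e", "c", "f"), nrow = 2)

# Passing the columns by hand: paste0 works element-wise across its
# arguments, so the i-th result is built from the i-th row.
by_hand <- paste0(m[, 1], m[, 2], m[, 3])

# do.call does the same thing without naming the columns one by one:
# each column of the data frame becomes one argument to paste0.
via_do_call <- do.call(paste0, data.frame(m))

print(by_hand)
print(identical(by_hand, via_do_call))
```

Because the work is done by a single vectorized paste0 call rather than one paste call per row, this form tends to pay off when there are many rows and few columns.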
A reproducible example (not sure why I'm wasting my time here)...
x <- structure(c("D", "S", "S", "S", "S", "E", "K", "K", "K", "K",
"R", "P", "P", "P", "P", "P", "A", "A", "A", "A",
"G", "S", "S", "S", "S", "K", "L", "L", "L", "L",
"I", "N", "N", "N", "N"), .Dim = c(5L, 7L))
x
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] "D" "E" "R" "P" "G" "K" "I"
# [2,] "S" "K" "P" "A" "S" "L" "N"
# [3,] "S" "K" "P" "A" "S" "L" "N"
# [4,] "S" "K" "P" "A" "S" "L" "N"
# [5,] "S" "K" "P" "A" "S" "L" "N"
Let's compare the options:
library(microbenchmark)
fun1 <- function(inmat) apply(inmat, 1, paste, collapse = "")
fun2 <- function(inmat) do.call(paste0, data.frame(inmat))
fun1(x)
# [1] "DERPGKI" "SKPASLN" "SKPASLN" "SKPASLN" "SKPASLN"
fun2(x)
# [1] "DERPGKI" "SKPASLN" "SKPASLN" "SKPASLN" "SKPASLN"
microbenchmark(fun1(x), fun2(x))
# Unit: microseconds
# expr min lq median uq max neval
# fun1(x) 97.634 104.4805 112.0725 117.7735 268.503 100
# fun2(x) 1258.000 1282.6275 1301.5555 1316.5015 1576.506 100
And on longer data:
X <- do.call(rbind, replicate(100000, x, simplify=FALSE))
dim(X)
# [1] 500000 7
microbenchmark(fun1(X), fun2(X), times = 10)
# Unit: milliseconds
# expr min lq median uq max neval
# fun1(X) 4189.8940 4226.9354 4382.0403 4570.032 4596.983 10
# fun2(X) 825.9816 835.4351 888.5102 1031.509 1056.832 10
I suspect that on wider data (few rows, many columns), apply would still be more efficient.
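That suspicion about wide data is not measured above. One way to check it would be a sketch like the following, which assumes a made-up 50 x 5000 matrix of random letters and uses base system.time instead of microbenchmark so that it runs without extra packages. No timings are shown because they depend on the machine and the R version.

```r
# Hypothetical wide matrix: 50 rows, 5000 columns of random capital letters.
set.seed(1)
wide <- matrix(sample(LETTERS, 50 * 5000, replace = TRUE), nrow = 50)

# Same two candidates as in the comparison above.
fun1 <- function(inmat) apply(inmat, 1, paste, collapse = "")
fun2 <- function(inmat) do.call(paste0, data.frame(inmat))

# Both approaches should agree before their timings are compared.
stopifnot(identical(fun1(wide), fun2(wide)))

# Elapsed times for each approach on the wide matrix.
print(system.time(fun1(wide)))
print(system.time(fun2(wide)))
```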