Eli*_*eth 5 r matrix dataframe
I have the matrix
m <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,dimnames = list(c("s1", "s2", "s3"),c("tom", "dick","bob")))
   tom dick bob
s1   1    2   3
s2   4    5   6
s3   7    8   9
#and the data frame
current<-c("tom", "dick","harry","bob")
replacement<-c("x","y","z","b")
df<-data.frame(current,replacement)
  current replacement
1     tom           x
2    dick           y
3   harry           z
4     bob           b
#I need to replace the existing names i.e. df$current with df$replacement if
#colnames(m) are equal to df$current thereby producing the following matrix
m <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE,dimnames = list(c("s1", "s2", "s3"),c("x", "y","b")))
   x y b
s1 1 2 3
s2 4 5 6
s3 7 8 9
Any suggestions? Should I use an 'if' loop? Thanks.
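You can use which to match the colnames of m against the values in df$current. Then, once you have the indices, you can subset the replacement column, df$replacement. Note that this relies on the columns of m appearing in the same order as the matching entries of df$current, as they do here.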
colnames(m) = df$replacement[which(df$current %in% colnames(m))]
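In the above:
1. %in% tests (TRUE or FALSE) for any matches between the objects being compared.
2. which(df$current %in% colnames(m)) identifies the indices (in this case, the row numbers) of the matching names.
3. df$replacement[...] is a basic way to subset the column df$replacement, returning only the rows matched in step 2.

A more direct way to find the indices is to use match: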
> id <- match(colnames(m), df$current)
> id
[1] 1 2 4
> colnames(m) <- df$replacement[id]
> m
   x y b
s1 1 2 3
s2 4 5 6
s3 7 8 9
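One difference between the two approaches may be worth illustrating: which(df$current %in% colnames(m)) returns positions in the order of df$current, whereas match(colnames(m), df$current) returns them in the order of the matrix columns. They agree in the example above only because both happen to be in the same order. The following is a minimal sketch, not part of the original answers. The matrices m2 and m3, the name "sue", and the helper rename_cols are hypothetical and introduced only for illustration.

```r
# Lookup table from the question (stringsAsFactors = FALSE keeps character columns)
df <- data.frame(current = c("tom", "dick", "harry", "bob"),
                 replacement = c("x", "y", "z", "b"),
                 stringsAsFactors = FALSE)

# Hypothetical matrix whose columns are in a different order than df$current
m2 <- matrix(1:9, nrow = 3, byrow = TRUE,
             dimnames = list(c("s1", "s2", "s3"), c("bob", "tom", "dick")))

# Positions in the order of df$current: names would be assigned in the wrong order
idx_in <- which(df$current %in% colnames(m2))
by_in  <- df$replacement[idx_in]

# Positions in the order of the matrix columns: one replacement per column
idx_match <- match(colnames(m2), df$current)
by_match  <- df$replacement[idx_match]

# match() yields NA for names absent from the lookup table;
# this hypothetical helper keeps the old name in that case
rename_cols <- function(mat, lookup) {
  id  <- match(colnames(mat), lookup$current)
  new <- as.character(lookup$replacement[id])
  colnames(mat) <- ifelse(is.na(id), colnames(mat), new)
  mat
}

# Hypothetical matrix with a column name that has no entry in df$current
m3 <- m2
colnames(m3)[3] <- "sue"

renamed_m2 <- rename_cols(m2, df)
renamed_m3 <- rename_cols(m3, df)
print(colnames(renamed_m2))
print(colnames(renamed_m3))
```

With the reordered columns, by_in and by_match differ, which is why match is the safer choice when the column order is not guaranteed to follow the lookup table.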
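As noted in the comments below, %in% is often more intuitive to use, and the difference in efficiency is marginal unless the sets are relatively large, e.g.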
> n <- 50000 # size of full vector
> m <- 10000 # size of subset (note: this reuses the name m, overwriting the matrix above)
> query <- paste("A", sort(sample(1:n, m)))
> names <- paste("A", 1:n)
> all.equal(which(names %in% query), match(query, names))
[1] TRUE
> library(rbenchmark)
> benchmark(which(names %in% query))
                     test replications elapsed relative user.self sys.self user.child sys.child
1 which(names %in% query)          100   0.267        1     0.268        0          0         0
> benchmark(match(query, names))
                 test replications elapsed relative user.self sys.self user.child sys.child
1 match(query, names)          100   0.172        1     0.172        0          0         0
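If rbenchmark is not installed, a rough version of the same comparison can be sketched with base R's system.time. This is only an approximation of the benchmark above, under stated assumptions: the seed is arbitrary, and the variable names n_full, n_sub and all_names are hypothetical, chosen here to avoid overwriting the matrix m and masking base::names. Timings depend on the machine and R version, so no expected numbers are shown.

```r
set.seed(1)        # arbitrary seed, assumed here only for reproducibility
n_full <- 50000    # size of full vector
n_sub  <- 10000    # size of subset

query     <- paste("A", sort(sample(1:n_full, n_sub)))
all_names <- paste("A", 1:n_full)

# Both approaches return the same positions here because query is sorted
# in the same order as all_names
same <- isTRUE(all.equal(which(all_names %in% query), match(query, all_names)))

# Time 100 replications of each approach
t_in    <- system.time(for (i in 1:100) which(all_names %in% query))
t_match <- system.time(for (i in 1:100) match(query, all_names))

# Elapsed seconds for each approach
print(c(which_in = t_in[["elapsed"]], match = t_match[["elapsed"]]))
```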