Cha*_*lie 6 merge r list matrix
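I have a list of 5 matrices, each of a different size, and I want to merge all of them by row names.
Here is a reproducible example of my list (I am using igraph_0.6.5-2 on R version 3.0.1):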
x <- list(
  as.matrix(c(1,4)),
  as.matrix(c(3,19,11)),
  as.matrix(c(3,9,8,5)),
  as.matrix(c(3,10,8,87,38,92)),
  as.matrix(c(87,8,8,87,38,92))
)
colnames(x[[1]]) <- c("P1")
colnames(x[[2]]) <- c("P2")
colnames(x[[3]]) <- c("P3")
colnames(x[[4]]) <- c("P4")
colnames(x[[5]]) <- c("P5")
rownames(x[[1]]) <- c("A","B")
rownames(x[[2]]) <- c("B","C","D")
rownames(x[[3]]) <- c("A","B", "E", "F")
rownames(x[[4]]) <- c("A","F","G","H","I","J" )
rownames(x[[5]]) <- c("B", "H","I","J", "K","L")
This gives me the following list:
> x
[[1]]
  P1
A  1
B  4

[[2]]
  P2
B  3
C 19
D 11

[[3]]
  P3
A  3
B  9
E  8
F  5

[[4]]
  P4
A  3
F 10
G  8
H 87
I 38
J 92

[[5]]
  P5
B 87
H  8
I  8
J 87
K 38
L 92
I would like to get something like this:
  P1 P2 P3 P4 P5
A  1 na  3  3 na
B  4  3  9 na 87
C na 19 na na na
D na 11 na na na
E na na  8 na na
F na na  5 10 na
G na na na  8 na
H na na na 87  8
I na na na 38  8
J na na na 92 87
K na na na na 38
L na na na na 92
Using the do.call function to merge them:
y <- do.call(merge,c(x, by="row.names",all=TRUE))
gives me the following error:
Error in fix.by(by.x, x) : 'by' must match numbers of columns
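Any help is greatly appreciated. Thanks!
I would create a helper function that moves the row.names() into a column of a data.frame, and then use Reduce() to merge() all the data.frames in your list: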
rownames2col <- function(inDF, RowName = ".rownames") {
  temp <- data.frame(rownames(inDF), inDF, row.names = NULL)
  names(temp)[1] <- RowName
  temp
}
Reduce(function(x, y) merge(x, y, by = ".rownames", all = TRUE),
       lapply(x, rownames2col))
#    .rownames P1 P2 P3 P4 P5
# 1          A  1 NA  3  3 NA
# 2          B  4  3  9 NA 87
# 3          C NA 19 NA NA NA
# 4          D NA 11 NA NA NA
# 5          E NA NA  8 NA NA
# 6          F NA NA  5 10 NA
# 7          G NA NA NA  8 NA
# 8          H NA NA NA 87  8
# 9          I NA NA NA 38  8
# 10         J NA NA NA 92 87
# 11         K NA NA NA NA 38
# 12         L NA NA NA NA 92
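To see what the helper does on its own, here is a minimal sketch applied to a single matrix. The matrix m is a hypothetical stand-in for the first list element from the question, and the helper is repeated so the snippet runs by itself.

```r
# Helper from the answer: copy the row names into an ordinary column
rownames2col <- function(inDF, RowName = ".rownames") {
  temp <- data.frame(rownames(inDF), inDF, row.names = NULL)
  names(temp)[1] <- RowName
  temp
}

# Hypothetical stand-in for x[[1]] in the question
m <- matrix(c(1, 4), dimnames = list(c("A", "B"), "P1"))

df1 <- rownames2col(m)
df1
#   .rownames P1
# 1         A  1
# 2         B  4
```

Because every converted element carries the same ".rownames" column, each merge() inside Reduce() has a stable key to join on.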
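The reason for the added step of turning the rownames() into a column is that merging by row.names creates a column called Row.names in the first merge() of the Reduce(), which means the subsequent list items can no longer be merged conveniently.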
> Reduce(function(x, y) merge(x, y, by = "row.names", all = TRUE), x[1:2])
  Row.names P1 P2
1         A  1 NA
2         B  4  3
3         C NA 19
4         D NA 11
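As a side note on the error in the question, here is a minimal sketch of what probably goes wrong. It assumes that merge() joins only two tables (x and y), so when do.call() passes further matrices in the same call they end up matched positionally to other arguments of merge.data.frame(), such as by.x, which would explain the message raised from fix.by(). The list x3 is a shortened, hypothetical stand-in for the list in the question.

```r
# Hypothetical three-element stand-in for the list in the question
x3 <- list(
  matrix(c(1, 4),       dimnames = list(c("A", "B"), "P1")),
  matrix(c(3, 19, 11),  dimnames = list(c("B", "C", "D"), "P2")),
  matrix(c(3, 9, 8, 5), dimnames = list(c("A", "B", "E", "F"), "P3"))
)

# Two tables in one call: merge() works
two <- merge(x3[[1]], x3[[2]], by = "row.names", all = TRUE)
nrow(two)  # 4 rows: A, B, C, D

# Three tables in one call: capture the error message instead of stopping
msg <- tryCatch(
  {
    do.call(merge, c(x3, by = "row.names", all = TRUE))
    NA_character_  # only reached if no error is raised
  },
  error = function(e) conditionMessage(e)
)
msg
```

This is why the approaches here apply merge() pairwise through Reduce() rather than handing the whole list to a single call.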
The data.table approach
A very similar concept can be used with data.table by setting the keep.rownames argument to TRUE and setting the key to the resulting "rn" column.
library(data.table)
Reduce(function(x, y) merge(x, y, all = TRUE),
       lapply(x, function(y) data.table(y, keep.rownames=TRUE, key = "rn")))
#     rn P1 P2 P3 P4 P5
#  1:  A  1 NA  3  3 NA
#  2:  B  4  3  9 NA 87
#  3:  C NA 19 NA NA NA
#  4:  D NA 11 NA NA NA
#  5:  E NA NA  8 NA NA
#  6:  F NA NA  5 10 NA
#  7:  G NA NA NA  8 NA
#  8:  H NA NA NA 87  8
#  9:  I NA NA NA 38  8
# 10:  J NA NA NA 92 87
# 11:  K NA NA NA NA 38
# 12:  L NA NA NA NA 92
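The manual approach
Of course, there is also the manual approach, aided by a for loop. This might actually be faster than the approaches above, since merge is quite slow compared to basic subsetting. Another advantage in terms of speed is that the resulting object is a matrix, and many matrix operations are faster than the corresponding data.frame operations.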
## Identify the unique "rownames" for all list items
Rows <- unique(unlist(lapply(x, rownames)))
## Create a matrix of NA values
## with appropriate dimensions and dimnames
myMat <- matrix(NA, nrow = length(Rows), ncol = length(x),
                dimnames = list(Rows, sapply(x, colnames)))
## Use your `for` loop to fill it in
## with the appropriate values from your list
for (i in seq_along(x)) {
  myMat[rownames(x[[i]]), i] <- x[[i]]
}
myMat
#   P1 P2 P3 P4 P5
# A  1 NA  3  3 NA
# B  4  3  9 NA 87
# C NA 19 NA NA NA
# D NA 11 NA NA NA
# E NA NA  8 NA NA
# F NA NA  5 10 NA
# G NA NA NA  8 NA
# H NA NA NA 87  8
# I NA NA NA 38  8
# J NA NA NA 92 87
# K NA NA NA NA 38
# L NA NA NA NA 92
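As a sanity check, the following sketch rebuilds the example list and compares the Reduce()/merge() result with the matrix filled by the for loop. The names merged, mergedMat and same are introduced here only for illustration, and the list is built with matrix() and dimnames rather than the separate colnames()/rownames() calls of the question.

```r
# Rebuild the list from the question
x <- list(
  matrix(c(1, 4),                 dimnames = list(c("A", "B"), "P1")),
  matrix(c(3, 19, 11),            dimnames = list(c("B", "C", "D"), "P2")),
  matrix(c(3, 9, 8, 5),           dimnames = list(c("A", "B", "E", "F"), "P3")),
  matrix(c(3, 10, 8, 87, 38, 92), dimnames = list(c("A", "F", "G", "H", "I", "J"), "P4")),
  matrix(c(87, 8, 8, 87, 38, 92), dimnames = list(c("B", "H", "I", "J", "K", "L"), "P5"))
)

# Approach 1: row names as a column, then pairwise merge()
rownames2col <- function(inDF, RowName = ".rownames") {
  temp <- data.frame(rownames(inDF), inDF, row.names = NULL)
  names(temp)[1] <- RowName
  temp
}
merged <- Reduce(function(a, b) merge(a, b, by = ".rownames", all = TRUE),
                 lapply(x, rownames2col))

# Approach 2: pre-allocated NA matrix filled by a for loop
Rows <- unique(unlist(lapply(x, rownames)))
myMat <- matrix(NA_real_, nrow = length(Rows), ncol = length(x),
                dimnames = list(Rows, sapply(x, colnames)))
for (i in seq_along(x)) {
  myMat[rownames(x[[i]]), i] <- x[[i]]
}

# Turn the merged data.frame into a matrix keyed by the same row names
mergedMat <- as.matrix(merged[, -1])
rownames(mergedMat) <- as.character(merged[[".rownames"]])

# Compare the two results, matching rows by name
same <- isTRUE(all.equal(mergedMat[Rows, ], myMat))
same
```

If same is TRUE, both approaches hold identical values, so the choice between them comes down to speed and whether a matrix or a data.frame is more convenient downstream.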