Bir*_*ird 6 merge r dataframe dplyr
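I am trying to compare several vectors to see where they have matching values. I would like to combine the vectors into a table in which each row shows the same value in the columns where it matches and NA where it does not.
For example: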
list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")
should become:
a    a    a
b    NA   b
c    c    c
d    d    NA
NA   NA   e
NA   NA   f
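I have tried turning the vectors into data frames and using merge, join from dplyr, cbind, and cbind.fill, but all of them either return a single column or fail to line up all the matching values in rows.
What is the best way to get this result in R?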
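You can use unlist and unique to get all the possible values, and then find their matches in each vector. match returns NA for a value that has no match, and indexing with NA yields NA, which is the result wanted here: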
list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")
list_of_lists <- list(
  list1 = list1,
  list2 = list2,
  list3 = list3
)
all_values <- unique(unlist(list_of_lists))
fleshed_out <- vapply(
  list_of_lists,
  FUN.VALUE = all_values,
  FUN = function(x) {
    x[match(all_values, x)]
  }
)
fleshed_out
#      list1 list2 list3
# [1,] "a"   "a"   "a"
# [2,] "b"   NA    "b"
# [3,] "c"   "c"   "c"
# [4,] "d"   "d"   NA
# [5,] NA    NA    "e"
# [6,] NA    NA    "f"
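The unique/match idea can also be wrapped in a small helper that returns a data frame instead of a character matrix. This is only a sketch under some assumptions: align_vectors is a hypothetical name (not part of base R or any package), it swaps vapply for lapply plus as.data.frame, and it expects character vectors. Because match reports only the first position of each value, a value duplicated within one vector appears once in the result.

```r
# Hypothetical helper (not from the original answer): align any number of
# character vectors against the union of their values.
align_vectors <- function(...) {
  vectors <- list(...)
  # Assumption: fall back to V1, V2, ... when no names are supplied.
  if (is.null(names(vectors))) {
    names(vectors) <- paste0("V", seq_along(vectors))
  }
  # Union of all values, in order of first appearance.
  all_values <- unique(unlist(vectors))
  # For each vector, keep a value where it occurs and NA where it does not.
  aligned <- lapply(vectors, function(x) x[match(all_values, x)])
  as.data.frame(aligned, stringsAsFactors = FALSE)
}

result <- align_vectors(
  list1 = c("a", "b", "c", "d"),
  list2 = c("a", "c", "d"),
  list3 = c("a", "b", "c", "e", "f")
)
result
```

Rows come out in order of first appearance across the inputs, since no sorting is applied.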
A base R solution:
df1 = data.frame(col = list1, list1)
df2 = data.frame(col = list2, list2)
df3 = data.frame(col = list3, list3)
Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))
#   col list1 list2 list3
# 1   a     a     a     a
# 2   b     b  <NA>     b
# 3   c     c     c     c
# 4   d     d     d  <NA>
# 5   e  <NA>  <NA>     e
# 6   f  <NA>  <NA>     f
Result:
> Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))[,-1]
  list1 list2 list3
1     a     a     a
2     b  <NA>     b
3     c     c     c
4     d     d  <NA>
5  <NA>  <NA>     e
6  <NA>  <NA>     f
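If there are more than a handful of vectors, building df1, df2, df3 by hand gets tedious. The following is a rough sketch of the same Reduce/merge approach applied to a named list; the names vectors, frames and merged are illustrative only, and stringsAsFactors = FALSE is set explicitly so the columns stay character in older R versions as well.

```r
vectors <- list(
  list1 = c("a", "b", "c", "d"),
  list2 = c("a", "c", "d"),
  list3 = c("a", "b", "c", "e", "f")
)

# Build one two-column data frame per vector: a shared key column "col"
# plus a copy of the values named after the vector.
frames <- Map(function(x, name) {
  df <- data.frame(col = x, value = x, stringsAsFactors = FALSE)
  names(df)[2] <- name
  df
}, vectors, names(vectors))

# Full outer join on the key, folded over all data frames.
merged <- Reduce(function(x, y) merge(x, y, by = "col", all = TRUE), frames)

# Drop the key column to leave only the aligned vectors.
result_merge <- merged[, -1]
result_merge
```

Note that merge sorts the rows by the key column by default, so the row order follows the sorted values rather than their order of first appearance.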
Or with dplyr + purrr:
library(dplyr)
library(purrr)
list(list1, list2, list3) %>%
  map(~ data.frame(col = ., ., stringsAsFactors = FALSE)) %>%
  reduce(full_join, by = "col") %>%
  select(-col) %>%
  setNames(paste0("list", 1:3))
Data:
list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")