Joining vectors into a data frame by matching values

Bir*_*ird 6 merge r dataframe dplyr

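I am trying to compare several vectors to see where they share matching values. I would like to combine the vectors into a table in which each row holds the same value in every column where it matches, and NA where it does not.

For example: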

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")  

should become:

a  a  a
b  NA b
c  c  c
d  d  NA
NA NA e
NA NA f

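I tried converting the vectors to data frames and using merge, the dplyr join functions, cbind, and cbind.fill, but each of these either returns a single column or fails to line up the matching values in the same row.

What is the best way to get this result in R?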

Nat*_*rth 6

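You can use unlist and unique to get all possible values, then look up each of those values in every vector. Where a value is absent, match returns NA, so indexing the vector with that result also gives NA: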

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f")
list_of_lists <- list(
  list1 = list1,
  list2 = list2,
  list3 = list3
)

all_values <- unique(unlist(list_of_lists))

fleshed_out <- vapply(
  list_of_lists,
  FUN.VALUE = all_values,
  FUN       = function(x) {
    x[match(all_values, x)]
  }
)

fleshed_out
#    list1 list2 list3
# [1,] "a"   "a"   "a"
# [2,] "b"   NA    "b"
# [3,] "c"   "c"   "c"
# [4,] "d"   "d"   NA
# [5,] NA    NA    "e"
# [6,] NA    NA    "f"
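Note that vapply returns a character matrix here, while the question asks for a data frame. A minimal sketch of the same match-based idea that builds a data frame instead, assuming the three vectors from the question; the names aligned and result are only illustrative:

```r
list_of_lists <- list(
  list1 = c("a", "b", "c", "d"),
  list2 = c("a", "c", "d"),
  list3 = c("a", "b", "c", "e", "f")
)

# Every distinct value, in order of first appearance
all_values <- unique(unlist(list_of_lists))

# For each vector, keep a value where it is present and NA where it is not;
# lapply returns a named list of equal-length character vectors
aligned <- lapply(list_of_lists, function(x) x[match(all_values, x)])

# Turn the named list into a data frame, one column per input vector
result <- as.data.frame(aligned, stringsAsFactors = FALSE)
result
```

Using lapply plus as.data.frame rather than vapply keeps the columns as ordinary character vectors, which may be more convenient if the table is passed on to dplyr or similar tools.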


avi*_*seR 6

A base R solution:

df1 = data.frame(col = list1, list1)
df2 = data.frame(col = list2, list2)
df3 = data.frame(col = list3, list3)

Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))

#   col list1 list2 list3
# 1   a     a     a     a
# 2   b     b  <NA>     b
# 3   c     c     c     c
# 4   d     d     d  <NA>
# 5   e  <NA>  <NA>     e
# 6   f  <NA>  <NA>     f

Result:

> Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))[,-1]
  list1 list2 list3
1     a     a     a
2     b  <NA>     b
3     c     c     c
4     d     d  <NA>
5  <NA>  <NA>     e
6  <NA>  <NA>     f
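To see why this works, it may help to look at a single merge step in isolation. Reduce simply repeats this pairwise full outer join on the shared key column, folding in one data frame at a time. A small sketch, assuming the first two vectors from the question; the name step1 is only illustrative:

```r
list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")

# Each data frame carries the values twice: once as the join key `col`,
# once under the vector's own name so it survives the merge as a column
df1 <- data.frame(col = list1, list1 = list1, stringsAsFactors = FALSE)
df2 <- data.frame(col = list2, list2 = list2, stringsAsFactors = FALSE)

# all = TRUE requests a full outer join: keys missing from one side
# are kept, and the other side's column is filled with NA
step1 <- merge(df1, df2, by = "col", all = TRUE)
step1
```

Here "b" appears only in list1, so its row has NA in the list2 column. Merging step1 with the third data frame in the same way adds the list3 column and the extra rows for "e" and "f".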

Or with dplyr + purrr:

library(dplyr)
library(purrr)

list(list1, list2, list3) %>%
  map(~ data.frame(col = ., ., stringsAsFactors = FALSE)) %>%
  reduce(full_join, by = "col") %>%
  select(-col) %>%
  setNames(paste0("list", 1:3))

Data:

list1 <- c("a", "b", "c", "d")
list2 <- c("a", "c", "d")
list3 <- c("a", "b", "c", "e", "f") 