How do I join on everything except specified columns in dplyr?

Alb*_*uez 4 r dplyr

I have two datasets that share all of their columns, and I want to do an anti-join based on every column except two of them.

For example, I would like to do something like the following:

```r
library(dplyr)
df1 <- tibble(x = c("A", "B", "C"), y = c("X", "Y", "Z"), z = c(1, 2, 3),
              a = c(4, 5, 6))

df2 <- tibble(x = c("A", "D", "E"), y = c("X", "W", "R"), z = c(1, 5, 6),
              a = c(4, 7, 8))

df2 %>% anti_join(df1, join_by(-c(z, a)))
#> Error in `join_by()`:
#> ! Expressions must use one of: `==`, `>=`, `>`, `<=`, `<`, `closest()`,
#>   `between()`, `overlaps()`, or `within()`.
#> ℹ Expression 1 is `-c(z, a)`.

#> Backtrace:
#>      ▆
#>   1. ├─df2 %>% anti_join(df1, join_by(-c(z, a)))
#>   2. ├─dplyr::anti_join(., df1, join_by(-c(z, a)))
#>   3. ├─dplyr:::anti_join.data.frame(., df1, join_by(-c(z, a)))
#>   4. │ └─dplyr:::join_filter(...)
#>   5. │   └─dplyr:::is_cross_by(by)
#>   6. │     └─rlang::is_character(x, n = 0L)
#>   7. └─dplyr::join_by(-c(z, a))
#>   8.   └─dplyr:::parse_join_by_expr(exprs[[i]], i, error_call = error_call)
#>   9.     └─dplyr:::stop_invalid_top_expression(expr, i, error_call)
#>  10.       └─rlang::abort(message, call = call)
```

Created on 2023-03-27 with reprex v2.0.2

So, is there any option to use tidy-select on the variables in a join? Or, more specifically, to call a join by everything except certain variables?

zep*_*ryl 6

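`select()` the unwanted columns out of `df1` (the right-hand table) rather than trying to specify them in `join_by()`. The join then defaults to the columns the two tables still have in common: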

```r
library(dplyr)

df2 %>%
  anti_join(select(df1, -c(z, a)))

# Joining with `by = join_by(x, y)`
# # A tibble: 2 × 4
#   x     y         z     a
#   <chr> <chr> <dbl> <dbl>
# 1 D     W         5     7
# 2 E     R         6     8
```
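
If you would rather leave both tables untouched and state the keys explicitly, one option is to work out the key names first and pass them to `by` as a character vector. This is only a minimal sketch, assuming the `df1` and `df2` from the question; `key_cols` and `res` are illustrative names, not part of dplyr.

```r
library(dplyr)

df1 <- tibble(x = c("A", "B", "C"), y = c("X", "Y", "Z"), z = c(1, 2, 3),
              a = c(4, 5, 6))
df2 <- tibble(x = c("A", "D", "E"), y = c("X", "W", "R"), z = c(1, 5, 6),
              a = c(4, 7, 8))

# Use tidy-select on df1 only to compute the names of the key columns
key_cols <- df1 %>% select(-c(z, a)) %>% names()
key_cols
#> [1] "x" "y"

# `by` accepts a character vector of column names present in both tables
res <- anti_join(df2, df1, by = key_cols)
```

Because `by` is given explicitly, the "Joining with ..." message is not printed, and `res` should hold the same two rows (`D` and `E`) as the `select()` version above.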

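For standard (mutating) joins, do the same if you want to discard `df1$z` and `df1$a`. Otherwise, append a suffix to them with `rename_with()` so that they are kept as separate columns: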

```r
df2 %>%
  full_join(
    rename_with(df1, \(x) paste0(x, ".df1"), c(z, a))
  )

# Joining with `by = join_by(x, y)`
# # A tibble: 5 × 6
#   x     y         z     a z.df1 a.df1
#   <chr> <chr> <dbl> <dbl> <dbl> <dbl>
# 1 A     X         1     4     1     4
# 2 D     W         5     7    NA    NA
# 3 E     R         6     8    NA    NA
# 4 B     Y        NA    NA     2     5
# 5 C     Z        NA    NA     3     6
```
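
For comparison, the "discard" variant for a standard join might look like the sketch below. It assumes that discarding means dropping `df1`'s copies of `z` and `a` before the join, and it spells out `by` only to silence the message; `joined` is an illustrative name.

```r
library(dplyr)

df1 <- tibble(x = c("A", "B", "C"), y = c("X", "Y", "Z"), z = c(1, 2, 3),
              a = c(4, 5, 6))
df2 <- tibble(x = c("A", "D", "E"), y = c("X", "W", "R"), z = c(1, 5, 6),
              a = c(4, 7, 8))

# Drop z and a from df1 first, so only x and y are left to join on
# and the result keeps just df2's z and a
joined <- df2 %>%
  full_join(select(df1, -c(z, a)), by = c("x", "y"))
```

Here `joined` has five rows and the four original columns. The rows that exist only in `df1` (`B` and `C`) get `NA` for `z` and `a`, because those values were dropped before joining.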