Extract rows where a value appears in any of multiple columns

Par*_*gue 5 r dplyr data.table

Suppose I have two data.frames:

name_df = read.table(text = "player_name
a
b
c
d
e
f
g", header = T)

game_df = read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c
6 c d
7 d e
8 e f
9 f a
10 g f
11 g a
12 f e
13 a d", header = T)

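`name_df` contains the unique list of all `winner_name` and `loser_name` values in `game_df`. I want to create a new data.frame that has, for each person in `name_df`, one row for every game in which that name (e.g. `a`) appears in either the `winner_name` or the `loser_name` column.

So essentially I want to merge `game_df` with `name_df`, except that the key (the player's name) can appear in either `winner_name` or `loser_name`.

For just `a` and `b`, the final output would look like: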

final_df = read.table(text = "player_name game_id winner_name loser_name
a 1 a b
a 2 b a
a 3 a c
a 4 a d
a 9 f a
a 11 g a
a 13 a d
b 1 a b
b 2 b a
b 5 b c", header = T)

akr*_*run 5

We can loop over the elements of 'player_name' in 'name_df' and, for each one, `filter` the rows of 'game_df' where it appears in either 'winner_name' or 'loser_name':

library(dplyr)
library(purrr)
map_dfr(setNames(name_df$player_name, name_df$player_name), 
   ~ game_df %>%
        filter(winner_name %in% .x|loser_name %in% .x), .id = 'player_name')

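Or, if there are many columns, use `if_any`. The outer `.x` is copied to `nm1` first, because inside the inner `~` lambda `.` and `.x` refer to the column being tested: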

map_dfr(setNames(name_df$player_name, name_df$player_name), 
  ~ {
     nm1 <- .x
     game_df %>%
       filter(if_any(c(winner_name, loser_name), ~ . %in%  nm1))
      }, .id = 'player_name')

  • @Parseltongue `.x` is each value from `player_name`. It is the argument of the anonymous/lambda function (`~`). You could also write it as `function(x) game_df %>% filter(winner_name %in% x|loser_name %in% x), .id = 'player_name')` (2 upvotes)
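
For reference, the `function(x)` form from the comment can be spelled out as a complete call. This is only a sketch: it rebuilds `name_df` and `game_df` with `data.frame()` instead of `read.table()` so that it runs on its own, and the names `players` and `result` are introduced here for illustration and are not part of the original answer.

```r
library(dplyr)
library(purrr)

# Same data as in the question, rebuilt with data.frame() so the block is self-contained
name_df <- data.frame(player_name = c("a", "b", "c", "d", "e", "f", "g"))
game_df <- data.frame(
  game_id     = 1:13,
  winner_name = c("a", "b", "a", "a", "b", "c", "d", "e", "f", "g", "g", "f", "a"),
  loser_name  = c("b", "a", "c", "d", "c", "d", "e", "f", "a", "f", "a", "e", "d")
)

# Naming the vector by its own values lets .id turn those names into the player_name column
players <- setNames(name_df$player_name, name_df$player_name)

# Explicit anonymous function in place of the `~` lambda; x plays the role of .x
result <- map_dfr(
  players,
  function(x) game_df %>% filter(winner_name %in% x | loser_name %in% x),
  .id = "player_name"
)

# Rows for players a and b, to compare against final_df in the question
result %>% filter(player_name %in% c("a", "b"))
```

Each game contributes one row for its winner and one for its loser, so `result` should have twice as many rows as `game_df`.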