Check whether a string appears in another string in R

Mal*_*hot 4 regex string compare r

I have a tibble containing sentences like this:

df <- tibble(sentences = c("Bob is looking for something", "Adriana has an umbrella", "Michael is looking at..."))

And another tibble containing a long list of names:

names <- tibble(names = c("Bob", "Mary", "Michael", "John", "Etc."))

I want to check whether each sentence contains a name from the list and add a column indicating whether it does, to obtain the following tibble:

wanted_df <- tibble(sentences = c("Bob is looking for something", "Adriana has an umbrella", "Michael is looking at..."), check = c(TRUE, FALSE, TRUE))

So far I have tried this, without success:

df <- df %>%
  mutate(check = grepl(pattern = names$names, x = df$sentences, fixed = TRUE))

And:

check <- str_detect(names$names %in% df$sentences)

Thanks a lot for your help ;)

Maë*_*aël 6

You should build a single regular expression to pass to grepl:

df %>%
  mutate(check = grepl(paste(names$names, collapse = "|"), sentences))

# A tibble: 3 × 2
  sentences                    check
  <chr>                        <lgl>
1 Bob is looking for something TRUE
2 Adriana has an umbrella      FALSE
3 Michael is looking at...     TRUE
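One caveat with an alternation pattern like this: each name is interpreted as a regular expression (the "." in "Etc." matches any character), and a name can also match inside a longer word, such as "Bob" inside "Bobby". Below is a minimal sketch of a stricter variant, assuming the names should match literally and only as whole words. The fourth sentence and the variable name_list are made up for illustration, and plain base R objects stand in for the tibbles.

```r
# Illustrative data; the last sentence is a hypothetical addition
df <- data.frame(
  sentences = c("Bob is looking for something",
                "Adriana has an umbrella",
                "Michael is looking at...",
                "Bobby met Maryanne"),
  stringsAsFactors = FALSE
)
name_list <- c("Bob", "Mary", "Michael", "John", "Etc.")

# \Q...\E (PCRE syntax, hence perl = TRUE below) quotes each name,
# so characters such as "." are matched literally
quoted <- paste0("\\Q", name_list, "\\E")

# The lookarounds require that no word character touches the match
pattern <- paste0("(?<!\\w)(?:", paste(quoted, collapse = "|"), ")(?!\\w)")

df$check <- grepl(pattern, df$sentences, perl = TRUE)
df
```

Here "Bobby met Maryanne" is not flagged, because "Bob" and "Mary" only occur as parts of longer words. Whether that is the behaviour you want depends on your data; the quoting trick would also break for a name that itself contains "\E".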


Rui*_*das 5

Here is a base R solution.

inx <- sapply(names$names, \(pat) grepl(pat, df$sentences))
inx
#>        Bob  Mary Michael  John  Etc.
#> [1,]  TRUE FALSE   FALSE FALSE FALSE
#> [2,] FALSE FALSE   FALSE FALSE FALSE
#> [3,] FALSE FALSE    TRUE FALSE FALSE

inx <- rowSums(inx) > 0L
df$check <- inx
df
#> # A tibble: 3 × 2
#>   sentences                    check
#>   <chr>                        <lgl>
#> 1 Bob is looking for something TRUE
#> 2 Adriana has an umbrella      FALSE
#> 3 Michael is looking at...     TRUE

Created on 2023-01-11 with reprex v2.0.2
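A possible extension of the same idea, sketched under the assumption that you also want to know which names were found: the logical matrix (one column per name) can be reused to collect the matching names per sentence. The object names sentences, name_list, hits and matched are illustrative, and fixed = TRUE is added so that each name is treated as a literal string rather than a regular expression.

```r
sentences <- c("Bob is looking for something",
               "Adriana has an umbrella",
               "Michael is looking at...")
name_list <- c("Bob", "Mary", "Michael", "John", "Etc.")

# One row per sentence, one column per name; fixed = TRUE disables regex
hits <- sapply(name_list, function(nm) grepl(nm, sentences, fixed = TRUE))

# TRUE when at least one name occurs in the sentence
check <- rowSums(hits) > 0

# Names found in each sentence, comma-separated ("" when none matched)
matched <- apply(hits, 1, function(row) paste(name_list[row], collapse = ", "))
```

Note that sapply() only returns a matrix here because there is more than one sentence; with a single sentence it would simplify to a vector and rowSums() would fail.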
