Is there a way to group_by and run str_detect on a variable within each group, storing the result in a new column?

Question

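I want to take a data frame, group_by one variable, and then evaluate each group to see whether a separate variable contains a given string in any row of that group.

Using this information, I want to create a new column that holds the result.

That is, if at least one row in the group contains the string, the new column should be TRUE for every row in that group. If no row in the group contains the string, the new column should be FALSE for every row in that group.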

library(dplyr)
library(stringr)

df <- tibble(
    A=c('red','red','red','blue','blue','blue'),
    B=c('yes','no','no','no','no','no')
)

For example, trying to detect the string "yes" in column B, separately for the red and blue groups of column A:

df %>%
    group_by(A) %>%
    mutate(yes_in_group = ifelse(str_detect(B, 'yes'), TRUE, FALSE))

I expected every value of yes_in_group to be TRUE for the red group and FALSE for the blue group, but mutate does not seem to respect the grouping.

expected <- tibble(A=c('red','red','red','blue','blue','blue'),
                   B=c('yes','no','no','no','no','no'),
                   yes_in_group=c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

actual <- tibble(A=c('red','red','red','blue','blue','blue'),
                 B=c('yes','no','no','no','no','no'),
                 yes_in_group=c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE))

Answer 1

Mar*_*ius 6

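Your current use of ifelse does nothing: you take the output of str_detect() (which is already TRUE/FALSE) and convert it to TRUE/FALSE. mutate does respect the grouping, but str_detect() returns one value per row, so the grouping has no visible effect. To extend the result to the whole group, you can use any: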

library(dplyr)
library(stringr)

df %>%
    group_by(A) %>%
    mutate(yes_in_group = any(str_detect(B, 'yes')))
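
As a minimal sketch of why this works, the snippet below rebuilds the question's df and checks the two claims separately: that the ifelse wrapper leaves the str_detect() output unchanged, and that any() reduces each group to a single logical that mutate() recycles across the group's rows. The names per_row and result, and the trailing ungroup(), are illustrative additions and not part of the original answer.

```r
library(dplyr)
library(stringr)

# Same data as in the question
df <- tibble(
    A = c('red', 'red', 'red', 'blue', 'blue', 'blue'),
    B = c('yes', 'no', 'no', 'no', 'no', 'no')
)

# str_detect() is element-wise: one logical per row, whatever the grouping
per_row <- str_detect(df$B, 'yes')

# Wrapping it in ifelse(..., TRUE, FALSE) gives back the same vector
identical(ifelse(per_row, TRUE, FALSE), per_row)

# any() collapses each group's logicals to a single value,
# which mutate() then recycles to every row of that group
result <- df %>%
    group_by(A) %>%
    mutate(yes_in_group = any(str_detect(B, 'yes'))) %>%
    ungroup()  # illustrative: drop the grouping once the column is built

result$yes_in_group
```

Note that str_detect() treats 'yes' as a regular expression; if the search string could contain regex metacharacters, wrapping it as fixed('yes') matches it literally. If B may contain NA, any(..., na.rm = TRUE) avoids an NA result for groups with no match.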
