case_when with partial string matching and contains()

J.S*_*ree 2 string r contains stringr dplyr

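I'm working with a dataset that has many columns named status1, status2, and so on. Each of these columns records whether a person is exempt, complete, registered, etc.

Unfortunately, the exempt entries aren't entered consistently; here's a sample: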

library(dplyr)

problem <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
                  status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
                  status2 = c("exempt", "Completed", "Completed", "Pending"),
                  status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"))

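I'm trying to use case_when() to create a new column with a final status. If a row says Completed, then the person is completed. If it says exempt without saying complete, then the person is exempt.

The important part is that I want my code to use contains("status"), or some equivalent that targets only the status columns without my having to type them all out, and I want it to need only a partial string match for exempt.

As for using contains with case_when, I saw this example but couldn't apply it to my case: mutate with case_when and contains

Here's what I've tried so far, but as you can guess, it didn't work: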

library(purrr)
library(dplyr)
library(stringr)
solution <- problem %>%
  mutate(final= case_when(pmap_chr(select(., contains("status")), ~
    any(c(...) == str_detect(., "Exempt") ~ "Exclude",
               TRUE ~ "Complete"
  ))))

Here is what I'd like the final product to look like:

solution <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
                   status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
                   status2 = c("exempt", "Completed", "Completed", "Pending"),
                   status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"),
                   final = c("Exclude", "Completed", "Completed", "Exclude")) 

Thank you!

avi*_*seR 5

I think you have it backwards. Put case_when inside pmap_chr rather than the other way around:

library(dplyr)
library(purrr)
library(stringr)

problem %>%
  mutate(final = pmap_chr(select(., contains("status")), 
                          ~ case_when(any(str_detect(c(...), "(?i)Exempt")) ~ "Exclude",
                                      TRUE ~ "Completed")))

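For each pmap iteration (that is, each row of the problem dataset), we use case_when to check whether the string Exempt is present in any of the status values. The (?i) in the str_detect pattern makes the match case-insensitive. This is the same as writing str_detect(c(...), regex("Exempt", ignore_case = TRUE)).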
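To see why the two spellings are interchangeable, here is a small sketch that runs both against status values copied from the question. The variable names (statuses, inline_flag, explicit_flag, case_sensitive) are only illustrative.

```r
library(stringr)

# Sample values taken from the status columns in the question
statuses <- c("7EXEMPT", "exempt", "EXEMPTED", "ExempT - 14",
              "Completed", "Pending")

# Inline flag: (?i) switches on case-insensitive matching inside the pattern
inline_flag <- str_detect(statuses, "(?i)Exempt")

# Explicit modifier: regex() with ignore_case = TRUE
explicit_flag <- str_detect(statuses, regex("Exempt", ignore_case = TRUE))

# With neither, only an exact-case "Exempt" would match,
# and none of the sample values contain one
case_sensitive <- str_detect(statuses, "Exempt")

identical(inline_flag, explicit_flag)  # both forms agree element by element
any(case_sensitive)                    # the case-sensitive pattern finds nothing
```

The first four values match under either case-insensitive form, while the plain "Exempt" pattern misses all of them, which is why the original attempt could not pick up the inconsistent entries.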

Output:

# A tibble: 4 x 5
  person status1   status2   status3     final    
  <chr>  <chr>     <chr>     <chr>       <chr>    
1 Corey  7EXEMPT   exempt    EXEMPTED    Exclude  
2 Sibley Completed Completed Completed   Completed
3 Justin Completed Completed Completed   Completed
4 Ruth   Pending   Pending   ExempT - 14 Exclude
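As a possible alternative (not the approach above, and assuming dplyr 1.0.4 or later, where if_any() is available), the row-wise pmap_chr call could be swapped for a vectorised check across the status columns. This is only a sketch; the name result is illustrative, and the fallback label mirrors the one used above.

```r
library(dplyr)
library(stringr)

# Same sample data as in the question
problem <- tibble(person = c("Corey", "Sibley", "Justin", "Ruth"),
                  status1 = c("7EXEMPT", "Completed", "Completed", "Pending"),
                  status2 = c("exempt", "Completed", "Completed", "Pending"),
                  status3 = c("EXEMPTED", "Completed", "Completed", "ExempT - 14"))

result <- problem %>%
  mutate(final = case_when(
    # TRUE for a row when any status column contains "exempt", ignoring case
    if_any(contains("status"),
           ~ str_detect(.x, regex("exempt", ignore_case = TRUE))) ~ "Exclude",
    # Everything else falls through to the same default as the pmap_chr version
    TRUE ~ "Completed"
  ))

result
```

As with the pmap_chr version, any row without an exempt entry is labelled Completed, so a row that is Pending in every column would need its own case_when branch if that distinction matters.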