Conditionally apply a pipeline step depending on an external value

Kon*_*rad 6 workflow conditional pipeline r dplyr

Given the dplyr workflow:

require(dplyr)                                      
mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(grepl(x = model, pattern = "Merc")) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

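I am interested in applying filter conditionally, depending on the value of applyFilter.

With applyFilter <- 1, rows are filtered using the "Merc" string; without the filter, all rows are returned.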

applyFilter <- 1


mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  filter(model %in%
           if (applyFilter) {
             rownames(mtcars)[grepl(x = rownames(mtcars), pattern = "Merc")]
           } else
           {
             rownames(mtcars)
           }) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))

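Problem

The solution above is inefficient because the if/else expression is always evaluated inside filter, so the filter step always runs; a preferable approach would evaluate the filter step only when applyFilter <- 1.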

Attempt

The workflow I am aiming for would look roughly like this:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    # Only apply filter step if condition is met
    if (applyFilter) { 
        filter(grepl(x = model, pattern = "Merc"))
        }
    %>% 
    # Continue 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

Of course, the syntax above is incorrect. It only illustrates how the ideal workflow should look.


Desired answer

  • I am not interested in creating a temporary object; the workflow should resemble:

    startingObject
        %>%
        ...
        conditional filter
        ...
        final object
    
  • Ideally, I would like to arrive at a solution where I can control whether the filter call is evaluated at all.

tal*_*lat 11

How about this approach:

mtcars %>% 
    tibble::rownames_to_column(var = "model") %>% 
    filter(if(applyFilter == 1) grepl(x = model, pattern = "Merc") else TRUE) %>% 
    group_by(am) %>% 
    summarise(meanMPG = mean(mpg))

With this approach, grepl is only evaluated when applyFilter is 1; otherwise filter simply recycles a single TRUE, which keeps every row.
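To check that claim rather than take it on faith, here is a minimal sketch that counts how often the pattern match actually runs. The names counted_grepl, run_pipeline and calls are made up for this illustration and are not part of dplyr; the sketch assumes dplyr and tibble are installed.

```r
library(dplyr)

calls <- 0  # how many times the pattern match has run so far

# Hypothetical wrapper around grepl() that records each invocation
counted_grepl <- function(...) {
  calls <<- calls + 1
  grepl(...)
}

# Hypothetical helper that runs the pipeline for a given applyFilter value
run_pipeline <- function(applyFilter) {
  mtcars %>%
    tibble::rownames_to_column(var = "model") %>%
    filter(if (applyFilter == 1) counted_grepl(x = model, pattern = "Merc") else TRUE) %>%
    group_by(am) %>%
    summarise(meanMPG = mean(mpg))
}

unfiltered <- run_pipeline(0)   # else branch: filter() only sees TRUE
calls_after_unfiltered <- calls # still 0, the match was never evaluated

filtered <- run_pipeline(1)     # if branch: the pattern match runs
calls_after_filtered <- calls
```

Since if/else in R only evaluates the branch that is taken, the counter stays at zero for applyFilter = 0, and the unfiltered result keeps both am groups.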


Another option is to use {}:

mtcars %>% 
  tibble::rownames_to_column(var = "model") %>% 
  {if(applyFilter == 1) filter(., grepl(x = model, pattern = "Merc")) else .} %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))
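If the same pattern is needed in several places, the brace expression can be wrapped in a small function. This is only a sketch: apply_if is a hypothetical helper written for this example, not something dplyr or magrittr provides.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): runs `step` on `data` only when
# `condition` is TRUE, otherwise passes `data` through unchanged.
apply_if <- function(data, condition, step) {
  if (isTRUE(condition)) step(data) else data
}

applyFilter <- 1

result <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  apply_if(applyFilter == 1,
           function(d) filter(d, grepl(x = model, pattern = "Merc"))) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))

# With the condition FALSE the step function is never called
result_all <- mtcars %>%
  tibble::rownames_to_column(var = "model") %>%
  apply_if(FALSE, function(d) filter(d, grepl(x = model, pattern = "Merc"))) %>%
  group_by(am) %>%
  summarise(meanMPG = mean(mpg))
```

Passing the step as a function keeps the pipe unbroken and makes sure the filter call is only evaluated when the condition holds.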

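Obviously there is another possible approach: simply break the pipe, apply the filter conditionally, and then continue the pipe (I know the OP did not ask for this; I just want to give other readers another example):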

library(magrittr)  # the %<>% assignment pipe comes from magrittr, not dplyr

mtcars %<>% 
  tibble::rownames_to_column(var = "model")

if(applyFilter == 1) mtcars %<>% filter(grepl(x = model, pattern = "Merc"))

mtcars %>% 
  group_by(am) %>% 
  summarise(meanMPG = mean(mpg))