Passing multiple columns to group_by from a function argument

Jun*_*tar 1 r tidyverse

Consider the following example:

library(tidyverse)

df <- tibble(
  cat = rep(1:2, times = 4, each = 2),
  loc = rep(c("a", "b"), each = 8),
  value = rnorm(16)
)

df %>%
  group_by(cat, loc) %>%
  summarise(mean = mean(value), .groups = "drop")

# # A tibble: 4 x 3
#     cat loc     mean
# * <int> <chr>  <dbl>
# 1     1 a     -0.563
# 2     1 b     -0.394
# 3     2 a      0.159
# 4     2 b      0.212

I would like to wrap the last two lines in a function that takes a group argument and passes multiple columns to group_by.


As an example, here is a dummy function that computes the mean of a value column, grouped by a set of columns:

group_mean <- function(data, col_value, group) {
  data %>%
    group_by(across(all_of(group))) %>%
    summarise(mean = mean({{col_value}}), .groups = "drop")
}

group_mean(df, value, c("cat", "loc"))

# # A tibble: 4 x 3
#     cat loc     mean
# * <int> <chr>  <dbl>
# 1     1 a     -0.563
# 2     1 b     -0.394
# 3     2 a      0.159
# 4     2 b      0.212

The function works, but I would prefer a tidyselect/rlang approach that avoids quoting the column names, like this:

group_mean(df, value, c(cat, loc))

# Error: Problem adding computed columns in `group_by()`.
# x Problem with `mutate()` input `..1`.
# x object 'loc' not found
# ℹ Input `..1` is `across(all_of(c(cat, loc)))`.

Wrapping group in {{ }} works for a single column, but not for multiple columns. How can I do this?


akr*_*run 5

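Consider using ... (dots) for the grouping columns. Converting the arguments to symbols with ensyms then lets us pass the column names either quoted or unquoted: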

group_mean <- function(data, col_value, ...) {
   data %>% 
     group_by(!!! ensyms(...)) %>% 
     summarise(mean = mean({{col_value}}), .groups = "drop")
 }

-testing

> group_mean(df, value, cat, loc)
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
> group_mean(df, value, 'cat', 'loc')
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
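To make it clearer why both call styles give the same result, here is a small sketch of what ensyms produces on its own. The helper capture_groups is a hypothetical name used only for illustration; it is not part of the answer's function.

```r
library(rlang)

# Hypothetical helper, for illustration only: capture `...` as a list of symbols
# without evaluating the arguments.
capture_groups <- function(...) {
  ensyms(...)
}

# One bare name and one string, mixed in the same call.
groups <- capture_groups(cat, "loc")

# Both elements end up as symbols, whichever way they were supplied.
all_symbols <- all(vapply(groups, is_symbol, logical(1)))
group_names <- unname(vapply(groups, as_string, character(1)))
```

Since every element is a symbol by the time it reaches `!!!`, group_by sees the same thing as if cat and loc had been typed in directly.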

If ... is already used for other arguments, then one option is

group_mean <- function(data, col_value, group) {
  grp_lst <- as.list(substitute(group))
  if(length(grp_lst)> 1) grp_lst <- grp_lst[-1]
  grps <- purrr::map_chr(grp_lst, rlang::as_string)
  data %>% 
     group_by(across(all_of(grps))) %>% 
     summarise(mean = mean({{col_value}}), .groups = "drop")
}
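The first two lines of that function rely on how base R represents an unevaluated call. A minimal sketch, using a hypothetical helper show_group_expr that exists only for this illustration:

```r
# Hypothetical helper, for illustration only: return the captured `group`
# expression as a list, without evaluating it.
show_group_expr <- function(group) {
  as.list(substitute(group))
}

# A call such as c(cat, loc) becomes a list whose first element is the
# function name `c`, followed by the arguments.
multi <- show_group_expr(c(cat, loc))

# A single bare name becomes a one-element list holding that symbol.
single <- show_group_expr(cat)

length(multi)   # 3
length(single)  # 1
```

This is why the function drops the first element only when the list has more than one entry: for c(cat, loc) that element is c itself, while a single column name has nothing to drop.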

-testing

> group_mean(df, value, c(cat, loc))
# A tibble: 4 x 3
    cat loc     mean
  <int> <chr>  <dbl>
1     1 a      0.327
2     1 b     -0.291
3     2 a     -0.382
4     2 b     -0.320
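As a possible alternative that is not part of the answer above: across() takes a tidyselect specification, so embracing group directly inside across() (without all_of()) should also accept c(cat, loc). The sketch below assumes dplyr 1.0.0 or later and uses made-up fixed values in place of rnorm() so the result is reproducible.

```r
library(dplyr)

# Same shape as the question's df, but with fixed illustrative values
# (an assumption made here so the means are predictable).
df <- tibble(
  cat = rep(1:2, times = 4, each = 2),
  loc = rep(c("a", "b"), each = 8),
  value = as.numeric(1:16)
)

group_mean <- function(data, col_value, group) {
  data %>%
    # {{ group }} forwards the user's tidyselect expression to across()
    group_by(across({{ group }})) %>%
    summarise(mean = mean({{ col_value }}), .groups = "drop")
}

res <- group_mean(df, value, c(cat, loc))
res_single <- group_mean(df, value, cat)
```

With this version a single bare column (cat) and a selection such as c(cat, loc) go through the same code path. dplyr 1.1.0 added pick(), which can be used in the same position as across() here if that version is available.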