How to inject weights into a list of dplyr summarise name-value pairs?

Tem*_*Rex 5 parsing r dplyr tidyeval

I would like to write a generic weighted_summarise() function that automatically parses and transforms a user's call of the form:

data %>% weighted_summarise(weights, a = sum(b), c = mean(d))

into the actual call that is delegated to dplyr::summarise:

data %>% dplyr::summarise(a = sum(weights * b), c = mean(weights * d))

Here, a and c are the new columns to be created in the reduced data, and b, d and weights are existing columns of data.

Ideally, I want to call my function just like the "native" dplyr::summarise, but with an extra weights argument that is spread into each of the aggregating functions.

weighted_summarise <- function(data, weights, ...) {
   data %>% dplyr::summarise(
       # how to manipulate the ... and inject the weights in each name-value pair?
   )
}

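Question: how can I manipulate the ellipsis so that weights is injected at the appropriate place in each name-value pair? I suppose I need to somehow capture the AST, walk it systematically, and manipulate it.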

akr*_*run 1

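Here is one option that inserts the weights into the expressions passed in ... by converting each expression to a string, modifying it, and then parsing the combined string for evaluation.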

weighted_summarise <- function(data, weights, ...) {
  # name of the weights column, as a string
  weights <- rlang::as_string(rlang::ensym(weights))

  # turn each captured expression into text and put "weights*" after the first "("
  v1 <- purrr::map_chr(rlang::enexprs(...),
    ~ stringr::str_replace(rlang::as_label(.x), "\\(",
      function(x) stringr::str_c("(", weights, "*")))

  # rebuild the summarise() call as one string, then parse and evaluate it
  # (dplyr must be attached so that %>% and summarise are found)
  eval(rlang::parse_expr(stringr::str_c("data %>%
      summarise(", stringr::str_c(names(v1), v1, sep = "=",
          collapse = ", "), ")")))
}
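To make the string step concrete, here is a minimal sketch of what happens to a single expression. It uses a fixed replacement string instead of the function replacement above, and the variable names `lbl`, `out` and `nested` are only illustrative.

```r
library(rlang)
library(stringr)

# Deparse one captured call to text
lbl <- as_label(quote(sum(b)))

# str_replace() only touches the first match, so only the
# outermost opening parenthesis is rewritten
out <- str_replace(lbl, "\\(", "(weights*")
print(out)

# A nested call is weighted only at the outer level
nested <- str_replace("sum(log(b))", "\\(", "(weights*")
print(nested)

# The text is turned back into a call before evaluation
print(parse_expr(out))
```

Because only the first parenthesis is rewritten, this assumes each value has the shape f(column, ...); an expression such as sum(b) / sum(d) would get the weights in the first call only.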

Testing

> data %>%
     weighted_summarise(weights, a = sum(b), c = mean(d))
# A tibble: 1 × 2
      a     c
  <dbl> <dbl>
1 -2.95  1.13

# testing with the original summarise code outside the function
> data %>%
    dplyr::summarise(a = sum(weights * b), c = mean(weights * d))
# A tibble: 1 × 2
      a     c
  <dbl> <dbl>
1 -2.95  1.13
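Since the question asks about walking the AST, the same injection can also be sketched without going through strings. This is only a sketch under assumptions: `inject_weight()` and `weighted_summarise_ast()` are hypothetical names, the weight is multiplied into the first argument of each top-level call only, and the small `toy` tibble is made up for illustration.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: rewrite f(x, ...) into f(w * x, ...).
# Calls without arguments (e.g. n()) are returned unchanged.
inject_weight <- function(expr, w) {
  if (is_call(expr) && length(expr) >= 2) {
    expr[[2]] <- call2("*", w, expr[[2]])
  }
  expr
}

weighted_summarise_ast <- function(data, weights, ...) {
  w <- ensym(weights)             # weights column as a symbol
  exprs_in <- enexprs(...)        # named list of unevaluated calls
  exprs_out <- lapply(exprs_in, inject_weight, w = w)
  summarise(data, !!!exprs_out)   # splice the rewritten name-value pairs
}

# Made-up data, just to compare against the hand-written call
toy <- tibble(b = c(1, 2, 3), d = c(4, 5, 6), weights = 1:3)

res <- toy %>% weighted_summarise_ast(weights, a = sum(b), c = mean(d))
ref <- toy %>% summarise(a = sum(weights * b), c = mean(weights * d))
print(res)
```

Capturing with enexprs() drops the caller's environment, so this sketch assumes the expressions refer only to columns of data and to globally visible functions.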

Data

data <- structure(list(b = c(-0.545880758366027, 0.536585304107612, 0.419623148618683,
-0.583627199210279, 0.847460017311944, 0.266021979364892, 0.444585270360416,
-0.466495123565759, -0.848370043948898, 0.00231194241576697),
    d = c(-1.31690812429962, 0.598269112694685, -0.7622143703459,
    -1.42909030324076, 0.332244449013422, -0.469060687608488,
    -0.334986793584065, 1.53625215550584, 0.609994533253692,
    0.51633569843567), weights = 1:10), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -10L))