How to use purrr::map2() with vectors of inconsistent lengths

Question

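I want to generate summary statistics based on 4 different columns. The summary statistics are computed for each value of the label column (which has two values) within each of the group columns (group1, group2, and group3). So you would get a separate tbl for Label1*group1, Label1*group2, and so on.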

library(dplyr)  # re-exports tibble() and %>%

set.seed(123)
tbl <- tibble(
  label  = rep(c("Label1", "Label2"), 6),
  group1 = rep(c("a", "b", "c", "d"), 3),
  group2 = rep(c("x", "y", "z"), 4),
  group3 = rep(c("1", "1", "2", "2", "3", "3"), 2),
  value1 = rnorm(12, 100, 10),
  value2 = rnorm(12, 50, 5)
)

tbl


I wrote an example function that I would like to call with two vectors as the .x and .y arguments.

tmp_label <- c("Label1", "Label2") # .x
group <- c("group1", "group2", "group3") # .y

# .f
tmp_function <- function(Label, group) {

  tbl %>% 
    filter(label %in% tmp_label) %>% 
    group_by(group) %>% 
    summarise(mean = mean(value1),
              mean2  = mean(value2)) %>% 
    mutate(Label = tmp_label)

}


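So I thought purrr::map2() seemed like the right function for getting the different summary statistics. However, it throws an error telling me that mapped vectors must have consistent lengths. My questions are therefore: 1) is it possible to use purrr functions with vectors of inconsistent lengths, and 2) if not, is there another (preferably tidy) way to get the different summary statistics? The error produced: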

map2(.x = tmp_label, .y = group, .f = tmp_function)
Error: Mapped vectors must have consistent lengths:
* `.x` has length 2
* `.y` has length 3

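For context, map2() walks .x and .y in parallel, pairing the first element of each, then the second, and so on; only a length-1 input is recycled. The short sketch below illustrates that pairing rule with toy vectors that are not part of the original question. The exact wording of the error message may differ between purrr versions.

```r
library(purrr)

# Equal lengths: elements are paired position by position.
paired <- map2_chr(c("a", "b"), c("x", "y"), paste0)

# A length-1 input is recycled against the longer one.
recycled <- map2_chr(c("a", "b"), "x", paste0)

# Lengths 2 and 3 cannot be paired, so purrr signals an error.
# tryCatch() is used here only so the script keeps running.
mismatch <- tryCatch(
  map2(c("a", "b"), c("x", "y", "z"), paste0),
  error = function(e) "error"
)
```

In other words, map2() never builds the 2 x 3 grid of combinations on its own; the combinations have to be constructed before mapping.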

Any help would be much appreciated!

Answer 1

akr*_*run 5

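We can change group_by to group_by_at, which takes strings as input. Also, based on the description, the OP is interested in the combinations of the 'tmp_label' and 'group' vectors. We can use crossing to create all the combinations and pass them to map2.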

library(dplyr)
library(purrr)
library(tidyr)
tmp_function <- function(Label, group) {
  tbl %>%
    filter(label %in% Label) %>%  # changed tmp_label to Label
    group_by_at(group) %>%
    summarise(mean  = mean(value1),
              mean2 = mean(value2)) %>%
    mutate(Label = Label)
}

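# all 2 x 3 = 6 combinations of label and grouping column
d1 <- crossing(tmp_label, group)
# both columns of d1 have length 6, so map2 can pair them
map2(d1$tmp_label, d1$group, tmp_function)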

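As a variation on the approach above, here is a minimal self-contained sketch that uses group_by(across(all_of(...))), available in dplyr 1.0.0 and later, in place of group_by_at. The helper name summarise_by, the argument names lab and group_col, and the naming scheme for the result list are assumptions made for illustration; they do not come from the original answer.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(123)
tbl <- tibble(
  label  = rep(c("Label1", "Label2"), 6),
  group1 = rep(c("a", "b", "c", "d"), 3),
  group2 = rep(c("x", "y", "z"), 4),
  group3 = rep(c("1", "1", "2", "2", "3", "3"), 2),
  value1 = rnorm(12, 100, 10),
  value2 = rnorm(12, 50, 5)
)

# Hypothetical helper: summarise one label within one grouping column.
# group_col is a string, so it is selected with all_of() inside across().
summarise_by <- function(lab, group_col) {
  tbl %>%
    filter(label == lab) %>%
    group_by(across(all_of(group_col))) %>%
    summarise(mean  = mean(value1),
              mean2 = mean(value2),
              .groups = "drop") %>%
    mutate(Label = lab)
}

# Build every label/group combination first (2 x 3 = 6 rows),
# so that both inputs to map2() have the same length.
combos <- crossing(lab       = c("Label1", "Label2"),
                   group_col = c("group1", "group2", "group3"))

results <- map2(combos$lab, combos$group_col, summarise_by)

# Name each result after its combination, e.g. "Label1_group1".
names(results) <- paste(combos$lab, combos$group_col, sep = "_")
```

Each element of results is one summary tibble, so results[["Label1_group2"]] retrieves the table for Label1 grouped by group2.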
