Using pmap to iterate over the rows of a tibble

Ano*_*n R 5 r dplyr purrr rowwise

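I have a very simple tibble, and I would like to iterate over its rows and apply a function to each one using pmap. I think I may have misunderstood some points about pmap, but mostly I struggle with choosing its arguments. So I would like to know whether I should combine rowwise with pmap in this case; I have not seen an example of that. The other problem is selecting the variables to iterate over, either with a list or with the select function: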

library(dplyr)
library(purrr)

# Here is my tibble.
# Suppose I want to apply `n_distinct` to each of its rows with pmap.

df <-  tibble(id = c("01", "02", "03","04","05","06"),
                  A = c("Jan", "Mar", "Jan","Jan","Jan","Mar"),
                  B = c("Feb", "Mar", "Jan","Jan","Mar","Mar"),
                  C = c("Feb", "Mar", "Feb","Jan","Feb","Feb")
)

# It is perfectly achievable with `rowwise` and `mutate` and results in my desired output

df %>%
  rowwise() %>%
  mutate(overal = n_distinct(c_across(A:C)))

# A tibble: 6 x 5
# Rowwise: 
  id    A     B     C     overal
  <chr> <chr> <chr> <chr>  <int>
1 01    Jan   Feb   Feb        2
2 02    Mar   Mar   Mar        1
3 03    Jan   Jan   Feb        2
4 04    Jan   Jan   Jan        1
5 05    Jan   Mar   Feb        3
6 06    Mar   Mar   Feb        2

# But with `pmap` I do not get the same result.


df %>%
  select(-id) %>%
  mutate(overal = pmap_dbl(list(A, B, C), n_distinct))


# A tibble: 6 x 4
  A     B     C     overal
  <chr> <chr> <chr>  <dbl>
1 Jan   Feb   Feb        1
2 Mar   Mar   Mar        1
3 Jan   Jan   Feb        1
4 Jan   Jan   Jan        1
5 Jan   Mar   Feb        1
6 Mar   Mar   Feb        1


I just need some explanation of how to use pmap for rowwise iteration over tibbles, so I would greatly appreciate any help. Thank you in advance.

mni*_*ist 5

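I was able to track down the problem, but I cannot say whether it is a bug or a feature. The point is that n_distinct() inside pmap() receives each row's values as three separate arguments and treats them like a data frame with 3 columns. When n_distinct() is applied to a data frame, it counts the number of distinct rows, so the result is 1 for every row.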

n_distinct(tibble(a = c(1, 2, 2),
                  b = 3))
#> [1] 2
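To see why every row comes out as 1, it may help to compare the call pmap() effectively makes for one row with a call on a single combined vector. This is only a minimal sketch using the values from the first row of df; the variable names separate_args and single_vector are made up for illustration.

```r
library(dplyr)

# What pmap() effectively calls for row 1: three separate length-1 arguments.
# n_distinct() treats them like three columns of a one-row data frame.
separate_args <- n_distinct("Jan", "Feb", "Feb")
separate_args
#> [1] 1

# The same values combined into one vector: now the values themselves are counted.
single_vector <- n_distinct(c("Jan", "Feb", "Feb"))
single_vector
#> [1] 2
```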

The trick is to first combine the inputs into a single vector and then pass that vector to n_distinct():

df %>%
  select(-id) %>%
  mutate(overal = pmap_dbl(list(A, B, C), ~ n_distinct(c(...))))
#> # A tibble: 6 x 4
#>   A     B     C     overal
#>   <chr> <chr> <chr>  <dbl>
#> 1 Jan   Feb   Feb        2
#> 2 Mar   Mar   Mar        1
#> 3 Jan   Jan   Feb        2
#> 4 Jan   Jan   Jan        1
#> 5 Jan   Mar   Feb        3
#> 6 Mar   Mar   Feb        2

  • The [manual](https://purrr.tidyverse.org/reference/map2.html) answers this question. *Note that a data frame is a very important special case, in which case pmap() and pwalk() apply the function .f to each row.* (2 upvotes)
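
Building on that note about data frames: because pmap() iterates over the rows of a data frame, the columns can probably also be chosen with select() rather than listed by hand, which touches on the other part of the question. A minimal sketch, assuming the df from the question (rebuilt here so the snippet runs on its own); the name result is made up for illustration.

```r
library(dplyr)
library(purrr)

df <- tibble(id = c("01", "02", "03", "04", "05", "06"),
             A = c("Jan", "Mar", "Jan", "Jan", "Jan", "Mar"),
             B = c("Feb", "Mar", "Jan", "Jan", "Mar", "Mar"),
             C = c("Feb", "Mar", "Feb", "Jan", "Feb", "Feb"))

# select() picks the columns to iterate over. pmap_dbl() then calls the
# lambda once per row, passing that row's values as named arguments,
# and c(...) combines them into one vector before counting.
result <- df %>%
  mutate(overal = pmap_dbl(select(df, A:C), ~ n_distinct(c(...))))

result$overal
#> [1] 2 1 2 1 3 2
```

This keeps the id column in the output, since only the input to pmap_dbl() is narrowed to A:C.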