How to use `values_fn = {summary_fun}` to summarise duplicates in dplyr

Par*_*gue 8 r dplyr

When I use `pivot_wider` with dplyr and the rows are not uniquely identified, I get the following warning:

Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates

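I'm interested in summarising the duplicates, but when I set `values_fn = summary_fun` or `values_fun = {summary_fun}` it throws an error. How do I actually summarise duplicates the way the `values_fn` argument expects?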

GGA*_*son 6

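`{summary_fun}` in the message is a placeholder for the name of a summary function, not something to type literally. Common base R summary functions you can pass are `mean`, `max`, `min`, `median`, `sum`, and `prod`. Some dplyr functions also work: `first`, `last`.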
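As a minimal sketch of what that looks like in practice, here is `mean` passed as a bare function name, using the built-in `warpbreaks` data (the same data as in the examples in this answer). The object name `wide_mean` is just an illustrative choice.

```r
library(dplyr)  # provides %>%
library(tidyr)  # provides pivot_wider()

# warpbreaks has 9 rows for every tension/wool combination,
# so without values_fn each output cell would be a list of 9 values.
wide_mean <- warpbreaks %>%
  pivot_wider(names_from = wool,
              values_from = breaks,
              values_fn = mean)  # bare function name: no braces, no quotes

wide_mean
```

The result has one row per `tension` and one numeric column per `wool`, each cell holding the mean of the duplicated `breaks` values.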

You can also use an anonymous function in `values_fn` for more flexibility. For example:

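    library(dplyr)  # provides %>%
    library(tidyr)  # provides pivot_wider()

    # Collapse the duplicated values in each cell into one comma-separated string
    warpbreaks %>%
      pivot_wider(names_from = wool,
                  values_from = breaks,
                  values_fn = function(x) paste(x, collapse = ","))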

Or summarise the result of a test:

    warpbreaks %>%
      pivot_wider(names_from = wool,
                  values_from = breaks,
                  values_fn = function(x) paste("Any over 40:", any(x > 40)))

# A tibble: 3 x 3
  tension A                  B                 
  <fct>   <chr>              <chr>             
1 L       Any over 40: TRUE  Any over 40: TRUE 
2 M       Any over 40: FALSE Any over 40: TRUE 
3 H       Any over 40: TRUE  Any over 40: FALSE
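The warning's other suggestion, `values_fn = length`, follows the same pattern, and an anonymous function is also a convenient way to pass extra arguments to a named summary function. The sketch below shows both on `warpbreaks`; the object names `counts` and `wide_mean_narm` are illustrative, and `na.rm = TRUE` is included only to show the pattern, since `warpbreaks` has no missing values.

```r
library(dplyr)  # provides %>%
library(tidyr)  # provides pivot_wider()

# Count how many values fall into each cell to see where duplicates arise
counts <- warpbreaks %>%
  pivot_wider(names_from = wool,
              values_from = breaks,
              values_fn = length)

counts

# Wrap a summary function to pass it extra arguments (here na.rm = TRUE)
wide_mean_narm <- warpbreaks %>%
  pivot_wider(names_from = wool,
              values_from = breaks,
              values_fn = function(x) mean(x, na.rm = TRUE))

wide_mean_narm
```

Any cell in `counts` with a value greater than 1 marks a combination of `tension` and `wool` that is not uniquely identified.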