How to use `values_fn = {summary_fun}` to summarise duplicates in dplyr

Question


When I use `pivot_wider` with dplyr and the rows are not uniquely identified, I get the following warning:

Values are not uniquely identified; output will contain list-cols.
* Use `values_fn = list` to suppress this warning.
* Use `values_fn = length` to identify where the duplicates arise
* Use `values_fn = {summary_fun}` to summarise duplicates


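I am interested in summarising the duplicates, but when I set `values_fn = summary_fun` or `values_fun = {summary_fun}` it throws an error. How do I actually summarise duplicates in the way the `values_fn` argument expects?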

Answer 1

GGA*_*son 6

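`{summary_fun}` is a placeholder for a summary function. Common base R choices are `mean`, `max`, `min`, `median`, `sum`, and `prod`. Some dplyr functions also work: `first`, `last`.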
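A minimal sketch of passing one of these functions by name, using the built-in `warpbreaks` data. It assumes a tidyr version that accepts a bare function for `values_fn`, as the warning text suggests. The function is given without quotes or braces; `wide_mean` is just an illustrative name.

```r
library(dplyr)
library(tidyr)

# warpbreaks has several rows for each wool/tension pair, so the cells
# are not uniquely identified. Passing the function itself collapses
# each group of duplicates to a single number.
wide_mean <- warpbreaks %>%
  pivot_wider(names_from = wool,
              values_from = breaks,
              values_fn = mean)

wide_mean
```

Any of the other functions listed above can be substituted for `mean` in the same position.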

You can also pass an anonymous function to `values_fn` for more flexibility. See the example:

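  library(dplyr)
  library(tidyr)

  # Collapse the duplicate values in each cell into one comma-separated string
  warpbreaks %>%
    pivot_wider(names_from = wool,
                values_from = breaks,
                values_fn = function(x) paste(x, collapse = ","))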


Or summarise the result of a test:

  warpbreaks %>%
    pivot_wider(names_from = wool,
                values_from = breaks,
                values_fn = function(x) paste("Any over 40:", any(x > 40)))

# A tibble: 3 x 3
  tension A                  B                 
  <fct>   <chr>              <chr>             
1 L       Any over 40: TRUE  Any over 40: TRUE 
2 M       Any over 40: FALSE Any over 40: TRUE 
3 H       Any over 40: TRUE  Any over 40: FALSE
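
Before choosing a summary, the warning's other suggestion, `values_fn = length`, can show how many values collapse into each cell. A small sketch on the same `warpbreaks` data (`wide_n` is an illustrative name, not from the original answer):

```r
library(dplyr)
library(tidyr)

# Count how many original rows land in each tension/wool cell.
# Any count above 1 marks a cell where duplicates arise.
wide_n <- warpbreaks %>%
  pivot_wider(names_from = wool,
              values_from = breaks,
              values_fn = length)

wide_n
```

Here every cell holds more than one value, which is why a summary function is needed to get a single value per cell.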

