Group all rows that meet a certain condition

Question


I have the following data frame, df1:

  company_location count
  <chr>            <int>
1 DE                  28
2 JP                   6
3 GB                  47
4 HN                   1
5 US                 355
6 HU                   1


I want to get to df2:

  company_location count
  <chr>            <int>
1 DE                  28
2 GB                  47
3 US                 355
4 OTHER                8


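df2 is the same as df1, except that all rows with count < 10 are summed together and aggregated into a single row called OTHER.

Is there something like a group_by() function that puts all rows matching a certain condition into one group and leaves every other row in a group containing only itself?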

Answer 1

All*_*ron 5

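This is what fct_lump_min is for. It is a function from forcats, which is part of the tidyverse. It lumps every level whose (optionally weighted) count falls below a minimum into a single level, here using count as the weight and 10 as the minimum.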

library(tidyverse)

df1 %>%
  group_by(company_location = fct_lump_min(company_location, 10, count)) %>%
  summarise(count = sum(count))

#> # A tibble: 4 x 2
#>   company_location count
#>   <fct>            <int>
#> 1 DE                  28
#> 2 GB                  47
#> 3 US                 355
#> 4 Other                8
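
The output above labels the lumped row "Other", while the question asks for "OTHER". A minimal sketch of two ways to get that label follows, assuming df1 is rebuilt as shown (the names df2 and df2_alt are only illustrative). The first variant passes other_level to fct_lump_min. The second uses plain dplyr with if_else as the grouping expression, which is closer to the "conditional group_by" the question describes. Note that the second variant tests each row's count on its own, so the two would differ if a location appeared in more than one row, and its rows come back sorted by label rather than with OTHER last.

```r
library(dplyr)
library(forcats)

# Rebuild the question's df1 so the example is self-contained
df1 <- tibble(
  company_location = c("DE", "JP", "GB", "HN", "US", "HU"),
  count = c(28L, 6L, 47L, 1L, 355L, 1L)
)

# Variant 1: fct_lump_min() with a custom label for the lumped level
df2 <- df1 %>%
  group_by(company_location = fct_lump_min(company_location, min = 10,
                                           w = count,
                                           other_level = "OTHER")) %>%
  summarise(count = sum(count))

# Variant 2: plain dplyr, relabel rows below the threshold before grouping
df2_alt <- df1 %>%
  group_by(company_location = if_else(count < 10, "OTHER",
                                      company_location)) %>%
  summarise(count = sum(count))
```

In variant 1 the grouping column is a factor, so OTHER stays at the end; in variant 2 it remains a character column.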

