dplyr: number of rows across groups after filtering

Fri*_*der 3 r dplyr tidyverse

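I want the count and the proportion (out of all elements) for each group in a data frame, computed after filtering. This code produces the desired output: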

library(dplyr)
# tibble() replaces the deprecated data_frame(); dplyr re-exports it
df <- tibble(id = sample(letters[1:3], 100, replace = TRUE),
             value = rnorm(100))

summary <- filter(df, value > 0) %>%
    group_by(id) %>%
    summarize(count = n()) %>%
    ungroup() %>%
    mutate(proportion = count / sum(count))

> summary
# A tibble: 3 x 3
     id count proportion
  <chr> <int>      <dbl>
1     a    17  0.3695652
2     b    13  0.2826087
3     c    16  0.3478261

Is there an elegant solution that avoids the ungroup() and the separate mutate() step? Something like:

summary <- filter(df, value > 0) %>%
    group_by(id) %>%
    summarize(count = n(),
              proportion = n() / [?TOTAL_ROWS()?])

I couldn't find such a function in the documentation, but I must be missing something obvious. Thanks!

Psi*_*dom 7

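You can use nrow(.). Inside a %>% pipe, `.` refers to the whole data frame that was piped in (here, the filtered one), so nrow(.) is the total number of rows across all groups: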

df %>% 
    filter(value > 0) %>% 
    group_by(id) %>% 
    summarise(count = n(), proportion = count / nrow(.))

# A tibble: 3 x 3
#     id count proportion
#  <chr> <int>      <dbl>
#1     a    14  0.2592593
#2     b    22  0.4074074
#3     c    18  0.3333333
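
As a quick sanity check, the sketch below runs the two-step version from the question and the nrow(.) version on the same data and compares them. The fixed seed and the names two_step and one_step are assumptions added only for illustration; they do not appear in the original posts.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed, only so the comparison is reproducible
df <- tibble(id = sample(letters[1:3], 100, replace = TRUE),
             value = rnorm(100))

# Two-step version: summarise() drops the single grouping level,
# so sum(count) in mutate() already runs over all groups.
two_step <- df %>%
    filter(value > 0) %>%
    group_by(id) %>%
    summarise(count = n()) %>%
    mutate(proportion = count / sum(count))

# One-step version: `.` is the filtered data frame handed to summarise(),
# so nrow(.) is the total row count across all groups.
one_step <- df %>%
    filter(value > 0) %>%
    group_by(id) %>%
    summarise(count = n(), proportion = count / nrow(.))

# Both approaches should agree, and the proportions should sum to 1.
same_result <- isTRUE(all.equal(as.data.frame(two_step), as.data.frame(one_step)))
total_proportion <- sum(one_step$proportion)
```

Note that `.` is bound by magrittr's %>% operator. With the native |> pipe there is no `.` placeholder, so the nrow(.) form would not work as written there.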