Finding the mean of counts within groups

Question

I have a data frame like this:

library(tidyverse)
x <- tibble(
  batch = rep(c(1, 2), each = 10),
  exp_id = c(rep('a', 3), rep('b', 2), rep('c', 5), rep('d', 6), rep('e', 4))
)

I can run the following code to get the count for each exp_id:

x %>% group_by(batch,exp_id) %>% 
  summarise(count=n())

This produces:

  batch exp_id count
  <dbl> <chr>  <int>
1     1 a          3
2     1 b          2
3     1 c          5
4     2 d          6
5     2 e          4

A really ugly way to get the mean of these counts is:

x %>% group_by(batch,exp_id) %>% 
  summarise(count=n()) %>% 
  ungroup() %>% 
  group_by(batch) %>% 
  summarise(avg_exp = mean(count))

This produces:

  batch avg_exp
  <dbl>   <dbl>
1     1    3.33
2     2    5

Is there a more concise and "tidy" way to produce this?

Answer 1

r2e*_*ans 5

library(dplyr)
group_by(x, batch) %>%
  summarize(avg_exp = mean(table(exp_id)))
# # A tibble: 2 x 2
#   batch avg_exp
#   <dbl>   <dbl>
# 1     1    3.33
# 2     2    5
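
The one-liner above works because, within each batch, table(exp_id) tabulates how many rows each exp_id has, and mean() then averages those counts. As a hedged alternative (not part of the original answer), the same number can be sketched as the rows in the batch divided by the number of distinct exp_id values in it. The name res is only illustrative, and the data is rebuilt from the question so the snippet runs on its own:

```r
library(dplyr)

# Same data as in the question (tibble() is re-exported by dplyr)
x <- tibble(
  batch  = rep(c(1, 2), each = 10),
  exp_id = c(rep("a", 3), rep("b", 2), rep("c", 5), rep("d", 6), rep("e", 4))
)

# Mean count per exp_id within a batch:
# rows in the batch divided by distinct exp_id values in the batch
res <- x %>%
  group_by(batch) %>%
  summarise(avg_exp = n() / n_distinct(exp_id))

res
```

For a character column like exp_id the two approaches agree. If exp_id were a factor, table() would also count unused levels as zeros and pull the mean down, while n_distinct() only sees values that actually occur in the batch.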
