Conditional summary functions in dplyr

mic*_*hdn 4 r dplyr

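I have a situation where I need to use different summary functions depending on a condition. For example, using iris, suppose that for some reason I want the sum of petal width if the species is setosa, and the mean of petal width otherwise.

Naively, I wrote this using case_when, which does not work: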

library(dplyr)

iris <- tibble::as_tibble(iris)

 iris %>% 
  group_by(Species) %>% 
  summarise(pwz = case_when(
    Species == "setosa" ~ sum(Petal.Width, na.rm = TRUE),
    TRUE                ~ mean(Petal.Width, na.rm = TRUE)))

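Error in summarise_impl(.data, dots): Column `pwz` must be length 1 (a summary value), not 50

I eventually found something like this: summarise with each method, then pick the one I actually want in a mutate: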

iris %>% 
  group_by(Species) %>% 
  summarise(pws = sum(Petal.Width, na.rm = TRUE),
            pwm = mean(Petal.Width, na.rm = TRUE)) %>% 
  mutate(pwz = case_when(
    Species == "setosa" ~ pws,
    TRUE                ~ pwm)) %>% 
  select(-pws, -pwm)

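However, creating all these summary values and then selecting just one at the end seems a bit awkward, especially when my real case_when is much more complicated. Can I not use case_when inside summarise? Is my syntax wrong? Any help is appreciated!

Edit: I suppose I should point out that I have multiple conditions/functions (say that, depending on the variable, some need a mean, sum, max, min, or another summary).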

Ice*_*can 6

This is easy with data.table:

library(data.table)
iris2 <- as.data.table(iris)

iris2[, if(Species == 'setosa') sum(Petal.Width) 
        else mean(Petal.Width)
      , by = Species]

More concisely, but maybe not as clear

# ifelse() cannot return a function, so choose the function with if/else
iris2[, (if (Species == 'setosa') sum else mean)(Petal.Width)
      , by = Species]
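For the several-conditions case mentioned in the question's edit, one possible extension of the data.table approach is a `switch()` on the group value. This is only a sketch: the choice of `max` for versicolor and the names `iris_dt` and `result_dt` are assumptions made for illustration, not part of the original answer.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# Inside j, the grouping column Species has length 1, so it can drive switch().
# The species-to-function mapping below is hypothetical.
result_dt <- iris_dt[, .(pwz = switch(as.character(Species),
                                      setosa     = sum(Petal.Width),
                                      versicolor = max(Petal.Width),
                                      mean(Petal.Width))),  # default branch
                     by = Species]
```

Every branch returns a double here, which matters because data.table expects each group to produce the same column type.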

With dplyr you can do

iris %>% 
  group_by(Species) %>% 
  summarise(pwz = if_else(first(Species == "setosa")
                          , sum(Petal.Width)
                          , mean(Petal.Width)))
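The original attempt fails because `case_when()` is vectorised: inside `summarise()` it returns one value per row of the group (50 here) rather than a single summary. When there are more than two conditions, one way to keep everything inside `summarise()` is to look up the function by group. The sketch below assumes a hypothetical mapping `summary_funs` from species to function; the mapping and the name `result` are illustrative only.

```r
library(dplyr)

# Hypothetical lookup: which summary function applies to which species.
summary_funs <- list(
  setosa     = sum,
  versicolor = mean,
  virginica  = max
)

result <- iris %>%
  group_by(Species) %>%
  summarise(
    # Species is constant within a group, so its first value identifies the group.
    pwz = summary_funs[[as.character(first(Species))]](Petal.Width, na.rm = TRUE)
  )
```

This computes only the summary each group needs, so no intermediate columns have to be created and dropped afterwards.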

Note:

I'm thinking it probably makes more sense to "spread" your data with tidyr::spread so that each day has a column for temperature, rainfall, etc. Then you can use summarise in the usual way.


Dav*_*otz 1

If you want to keep everything inside summarise, you can always do this. But it is no simpler than your original workaround:

iris %>% 
  group_by(Species) %>% 
  summarise(pwz = 
    sum(Petal.Width, na.rm = TRUE)*
    (1/n()*mean(Species != "setosa") + 
     mean(Species == "setosa")))