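I fully expect to get dinged for asking a duplicate question, but I just couldn't find a similar one. Apologies in advance.
I'm trying to clean some data that sometimes contains summary rows and sometimes doesn't. Here is a small reproducible example: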
library(tidyverse)
yr <- c(2010, 2010, 2010,
        2011, 2011, 2011, 2011,
        2012, 2012, 2012)
a <- c("HAY", "APPLES", "PUMPKINS",
       "HAY", "HAY & HAYLAGE", "APPLES", "PUMPKINS",
       "HAY & HAYLAGE", "APPLES", "PUMPKINS")
b <- c(1:10)
dat <- as_tibble(list(yr = yr, a = a, b = b))
dat %>%
  group_by(yr) %>%
  filter(a != "HAY" if group contains a == "HAY & HAYLAGE")
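Obviously, the last line of code is pseudocode. In the group where yr = 2011, I want to filter out the rows where a equals "HAY". My resulting tibble should have 9 rows.
Here's one way: you can use an if statement inside the filter condition: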
library(dplyr)
# (data from OP)
dat <- dplyr::tibble(
  yr = c(2010, 2010, 2010, 2011, 2011,
         2011, 2011, 2012, 2012, 2012),
  a = c("HAY", "APPLES", "PUMPKINS", "HAY", "HAY & HAYLAGE",
        "APPLES", "PUMPKINS", "HAY & HAYLAGE", "APPLES", "PUMPKINS"),
  b = 1:10
)
dat %>%
  group_by(yr) %>%
  # the condition is evaluated once per group: drop 'HAY' rows only when
  # the group also contains 'HAY & HAYLAGE'; otherwise keep every row
  filter(if ('HAY & HAYLAGE' %in% a) a != 'HAY' else TRUE) %>%
  ungroup()
## result will be:
##
## # A tibble: 9 x 3
##      yr a                 b
##   <dbl> <chr>         <int>
## 1  2010 HAY               1
## 2  2010 APPLES            2
## 3  2010 PUMPKINS          3
## 4  2011 HAY & HAYLAGE     5
## 5  2011 APPLES            6
## 6  2011 PUMPKINS          7
## 7  2012 HAY & HAYLAGE     8
## 8  2012 APPLES            9
## 9  2012 PUMPKINS         10
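If you'd rather avoid the if/else inside filter(), the same rule can probably be written as a single logical expression. This is only a sketch of an alternative, not something from the question: it relies on any() returning one TRUE/FALSE per group, which then gets recycled against the row-wise comparison a == "HAY". The name res is just a placeholder I picked for the result.

```r
library(dplyr)

# same data as in the question
dat <- tibble(
  yr = c(2010, 2010, 2010, 2011, 2011,
         2011, 2011, 2012, 2012, 2012),
  a = c("HAY", "APPLES", "PUMPKINS", "HAY", "HAY & HAYLAGE",
        "APPLES", "PUMPKINS", "HAY & HAYLAGE", "APPLES", "PUMPKINS"),
  b = 1:10
)

res <- dat %>%
  group_by(yr) %>%
  # a row is dropped only if it is "HAY" AND its group
  # also has at least one "HAY & HAYLAGE" row
  filter(!(a == "HAY" & any(a == "HAY & HAYLAGE"))) %>%
  ungroup()

res
```

On the example data this should keep the same 9 rows as the if version. One thing to watch for: filter() drops rows where the condition is NA, so if a can contain missing values you may want to handle those explicitly.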