Count the rows meeting a condition since the last observation

TIm*_*aus 6 r dplyr

My data frame looks like the first three columns of this example:

id    obs   value   newCol
a     1     uncool  NA
a     2     cool    1
a     3     uncool  NA
a     4     uncool  NA
a     5     cool    2
a     6     uncool  NA
a     7     cool    1
a     8     uncool  NA
b     1     cool    0

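What I need is a column (newCol above) that, on each row where value is "cool", counts the "uncool" rows since the previous "cool" row, or since the first row of the group (grouped by id) if there is no earlier "cool" row.

How can I do that (ideally using dplyr)?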

mar*_*kus 1

In addition to id, you need a second grouping variable, such as grp = cumsum(dat$value == "cool") - (dat$value == "cool").
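
To see why this expression gives a usable group index, it can help to look at the intermediate vectors. The following is only a sketch on the value column from the question; the names is_cool and running are illustrative and do not appear in the answer.

```r
# Values from the question's example, in row order
value <- c("uncool", "cool", "uncool", "uncool", "cool",
           "uncool", "cool", "uncool", "cool")

is_cool <- value == "cool"      # TRUE on every "cool" row
running <- cumsum(is_cool)      # number of "cool" rows seen so far, this row included
grp     <- running - is_cool    # subtract the current row so a "cool" row stays
                                # in the same run as the "uncool" rows before it

print(rbind(running, grp))
# grp is 0 0 1 1 1 2 2 3 3
```

Note that grp is computed over all rows rather than per id. Rows 8 and 9 both get grp = 3, but they still land in different groups because id is a grouping variable as well.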

Then, within each group, use mutate to assign sum(value == "uncool") to the rows where value == "cool" and NA everywhere else.

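library(dplyr)

# grp increases after each "cool" row, so every "cool" row closes its own run
dat %>%
  group_by(id, grp = cumsum(dat$value == "cool") - (dat$value == "cool")) %>%
  # count the "uncool" rows in the run and report the count only on the "cool" row
  mutate(newCool = if_else(value == "cool", sum(value == "uncool"), NA_integer_))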
# A tibble: 9 x 6
# Groups:   id, grp [5]
  id      obs value  newCol   grp newCool
  <chr> <int> <chr>   <int> <int>   <int>
1 a         1 uncool     NA     0      NA
2 a         2 cool        1     0       1
3 a         3 uncool     NA     1      NA
4 a         4 uncool     NA     1      NA
5 a         5 cool        2     1       2
6 a         6 uncool     NA     2      NA
7 a         7 cool        1     2       1
8 a         8 uncool     NA     3      NA
9 b         1 cool        0     3       0
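
As a cross-check that does not depend on dplyr, the same grouping idea can be sketched in base R with ave(). This is not part of the original answer: the names uncool_count and base_result are made up for illustration, and dat is rebuilt inline with data.frame() only so the snippet runs on its own.

```r
# Example data from the question, rebuilt so the snippet is self-contained
dat <- data.frame(
  id     = c(rep("a", 8), "b"),
  obs    = c(1:8, 1L),
  value  = c("uncool", "cool", "uncool", "uncool", "cool",
             "uncool", "cool", "uncool", "cool"),
  newCol = c(NA, 1L, NA, NA, 2L, NA, 1L, NA, 0L),
  stringsAsFactors = FALSE
)

is_cool <- dat$value == "cool"
grp     <- cumsum(is_cool) - is_cool

# For each (id, grp) run, total the "uncool" rows and repeat that total on every row of the run
uncool_count <- ave(as.integer(dat$value == "uncool"), dat$id, grp, FUN = sum)

# Keep the count on "cool" rows only, NA elsewhere
base_result <- ifelse(is_cool, uncool_count, NA_integer_)

# Compare against the expected column supplied in the question
print(all.equal(base_result, dat$newCol))
```

Comparing against newCol, the expected column from the question, is a quick way to confirm that the grouping behaves as intended before applying it to real data.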

Data

dat <- structure(list(id = c("a", "a", "a", "a", "a", "a", "a", "a", 
"b"), obs = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L), value = c("uncool", 
"cool", "uncool", "uncool", "cool", "uncool", "cool", "uncool", 
"cool"), newCol = c(NA, 1L, NA, NA, 2L, NA, 1L, NA, 0L)), .Names = c("id", 
"obs", "value", "newCol"), class = "data.frame", row.names = c(NA, 
-9L))