I am trying to filter each id so that only the rows up to and including the first observed type == "y" are kept.
df<-data.frame(id=c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,3,3),
type=c("x","x","y","x","x","x","x","y","y","x","x","x","x","y","x","y","x","x"))
Desired output:
id type
1 x
1 x
1 y
2 x
2 x
2 x
2 y
3 x
3 x
3 x
3 x
3 y
You can use match to find the position of the first 'y' and pass it to slice:
library(dplyr)
df %>% group_by(id) %>% slice(1:match('y', type)) %>% ungroup
# id type
# <dbl> <chr>
# 1 1 x
# 2 1 x
# 3 1 y
# 4 2 x
# 5 2 x
# 6 2 x
# 7 2 y
# 8 3 x
# 9 3 x
#10 3 x
#11 3 x
#12 3 y
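match returns the position of the first 'y' in each group. However, this fails if some id has no 'y' in the type column, because match then returns NA and 1:NA is an error. If that can happen, you can use filter as shown below, which returns all rows of a group when the group contains no 'y' row.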
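df %>%
  group_by(id) %>%
  # cumsum() counts the 'y' rows seen so far; lag() shifts that count down one row,
  # so the first 'y' row itself still passes the < 1 test
  filter(lag(cumsum(type == 'y'), default = 0) < 1) %>%
  ungroup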
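To illustrate the no-'y' case, here is a minimal sketch on a made-up data frame. The names df2, res_filter and res_slice are hypothetical and not part of the question, and the coalesce guard around match is my own variation rather than something from the answer above. It assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data: id 2 has no 'y' row at all
df2 <- data.frame(id   = c(1, 1, 1, 2, 2),
                  type = c("x", "y", "x", "x", "x"))

# match() returns NA when 'y' is absent, so 1:match('y', type) cannot be used for id 2
no_y_pos <- match("y", c("x", "x"))

# filter approach from above: keeps every row of a group that has no 'y'
res_filter <- df2 %>%
  group_by(id) %>%
  filter(lag(cumsum(type == "y"), default = 0) < 1) %>%
  ungroup()

# Guarded slice variant (an assumption, not from the answer):
# fall back to the group size n() when match() gives NA
res_slice <- df2 %>%
  group_by(id) %>%
  slice(seq_len(coalesce(match("y", type), n()))) %>%
  ungroup()

print(res_filter)
print(res_slice)
```

Both results should keep the first two rows of id 1 and both rows of id 2. Which form to prefer is mostly a matter of taste; the filter version needs no special handling for the missing-'y' case.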