Cha*_*hao 5 group-by r plyr dplyr data.table
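I have a data.frame like the one shown below, and I want to add a variable that gives the longest run of consecutive 1s observed in VALUE within each group (i.e. the largest number of consecutive rows with VALUE equal to 1, per GROUP_ID).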
GROUP_ID VALUE
1 0
1 1
1 1
1 1
1 1
1 0
2 1
2 1
2 0
2 1
2 1
2 1
3 1
3 0
3 1
3 0
So the output would look like this:
GROUP_ID VALUE CONSECUTIVE
1 0 4
1 1 4
1 1 4
1 1 4
1 1 4
1 0 4
2 1 3
2 1 3
2 0 3
2 1 3
2 1 3
2 1 3
3 1 1
3 0 1
3 1 1
3 0 1
Any help would be greatly appreciated!
Using dplyr:
library(dplyr)
dat %>%
  group_by(GROUP_ID) %>%
  mutate(CONSECUTIVE = {rl <- rle(VALUE); max(rl$lengths[rl$values == 1])})
This gives:
# A tibble: 16 x 3
# Groups:   GROUP_ID [3]
   GROUP_ID VALUE CONSECUTIVE
      <int> <int>       <int>
 1        1     0           4
 2        1     1           4
 3        1     1           4
 4        1     1           4
 5        1     1           4
 6        1     0           4
 7        2     1           3
 8        2     1           3
 9        2     0           3
10        2     1           3
11        2     1           3
12        2     1           3
13        3     1           1
14        3     0           1
15        3     1           1
16        3     0           1
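To see why this works, it may help to look at what rle() returns for a single group. Below is a minimal sketch using the VALUE entries of GROUP_ID 2 from the question; the variable name `v` is only for illustration and does not appear in the original answer.

```r
# VALUE entries for GROUP_ID 2 in the question's data
v <- c(1, 1, 0, 1, 1, 1)

# rle() splits the vector into runs of identical values
rl <- rle(v)
rl$lengths  # length of each run: 2 1 3
rl$values   # value repeated in each run: 1 0 1

# keep only the runs of 1s and take the longest one
max(rl$lengths[rl$values == 1])  # 3
```

Inside mutate() the single number returned by max() is recycled to every row of the group, which is why each row in a group carries the same CONSECUTIVE value.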
Or using data.table:
library(data.table)
setDT(dat) # convert to a 'data.table'
dat[, CONSECUTIVE := {rl <- rle(VALUE); max(rl$lengths[rl$values == 1])}
    , by = GROUP_ID][]
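One caveat with both versions: if a group contains no 1s at all, rl$lengths[rl$values == 1] is empty and max() returns -Inf with a warning. The following is a rough base R sketch of one way to guard against that, not part of the original answer. The helper name `longest_run` is hypothetical, and returning 0 for such groups is an assumption about the desired behaviour. The data frame is rebuilt here from the question's table so the block is self-contained.

```r
# Hypothetical helper (not from the original answer): longest run of 1s,
# returning 0 when the vector contains no 1s (assumed desired behaviour)
longest_run <- function(x) {
  rl <- rle(x)
  ones <- rl$lengths[rl$values == 1]
  if (length(ones) == 0) 0L else max(ones)
}

# the question's data, rebuilt as a plain data.frame
dat <- data.frame(
  GROUP_ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  VALUE    = c(0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0)
)

# ave() applies FUN within each group and recycles the single result
# to every row of that group
dat$CONSECUTIVE <- ave(dat$VALUE, dat$GROUP_ID, FUN = longest_run)
dat
```

The same helper could presumably be dropped into the dplyr or data.table calls above in place of the inline rle() expression.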