How to find the longest consecutive run in a long sequence in R

Gre*_*001 4 r data.table

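I have a sequence as a toy example. How can I determine the longest consecutive subsequence? Right now I can find where the break points are, but how do I get the values themselves?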

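library(data.table)
DT <- data.table(X = c(3:7, 16:18, 22:29, 31:36))
# Y = previous value minus current value; -1 means the run continues
DT[, Y := shift(X, type = "lag", fill = -1) - X]
# row indices where a new run starts
with(DT, which(Y != -1))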

What I want to get is the values of that subsequence, which in this case should be c(22, 23, 24, 25, 26, 27, 28, 29)

Ron*_*hah 5

I'm not sure what your expected output is, but here we add the length of each run to the data.table

library(data.table)
DT[, length := .N, by = cumsum(c(1, diff(X) != 1))]

DT
#     X length
# 1:  3      5
# 2:  4      5
# 3:  5      5
# 4:  6      5
# 5:  7      5
# 6: 16      3
# 7: 17      3
# 8: 18      3
# 9: 22      8
#10: 23      8
#11: 24      8
#12: 25      8
#13: 26      8
#14: 27      8
#15: 28      8
#16: 29      8
#17: 31      6
#18: 32      6
#19: 33      6
#20: 34      6
#21: 35      6
#22: 36      6
#     X length
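To see why the `by = cumsum(c(1, diff(X) != 1))` grouping works, here is a minimal base R sketch of the same idea on the plain vector. The names `breaks`, `run_id` and `run_len` are only illustrative and do not appear in the answer above.

```r
X <- c(3:7, 16:18, 22:29, 31:36)

# diff(X) != 1 is TRUE wherever a value does not continue the previous run;
# the leading 1 marks the first element as the start of the first run
breaks <- c(1, diff(X) != 1)

# the cumulative sum increases by one at every break,
# so all members of the same run share one id
run_id <- cumsum(breaks)

# number of elements carrying each run id
run_len <- tabulate(run_id)
run_len
```

Each element of `run_len` corresponds to one run, in order of appearance, which is what `.N` computes per group in the data.table version.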

Then, if you only want to extract the longest run, we can do

DT[length == max(length), ]

#    X length
#1: 22      8
#2: 23      8
#3: 24      8
#4: 25      8
#5: 26      8
#6: 27      8
#7: 28      8
#8: 29      8
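Since the question asks for the values as a plain vector, one possible variant (a sketch, not part of the original answer) is to split the vector by the same run id and pick the longest piece. The names `runs` and `longest` are hypothetical. Note that `which.max()` returns only the first maximum, so if several runs tie for the longest, only the earliest one is returned here, whereas the `length == max(length)` filter above keeps the rows of every tied run.

```r
X <- c(3:7, 16:18, 22:29, 31:36)

# same run id as in the data.table grouping
run_id <- cumsum(c(1, diff(X) != 1))

# list with one element per consecutive run
runs <- split(X, run_id)

# pick the first run of maximal length
longest <- runs[[which.max(lengths(runs))]]
longest
```

With the data.table from the answer, selecting the column in `j`, as in `DT[length == max(length), X]`, should likewise return a vector rather than a table.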