Find consecutive duplicates and list the indices where they occur in Python

D.M*_*ann 6 python numpy python-3.x

For example, I have a Python list:

mylist = [1,1,1,1,1,1,1,1,1,1,1,
        0,0,1,1,1,1,0,0,0,0,0,
        1,1,1,1,1,1,1,1,0,0,0,0,0,0]

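My goal is to find where five or more zeros occur in a row, and then list the start and end indices of each such run. For example, the output would be: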

[17,21][30,35]

Here is what I have tried, based on other questions asked here:

import numpy as np

def zero_runs(a):
    # Create an array that is 1 where a is 0, and pad each end with an extra 0.
    iszero = np.concatenate(([0], np.equal(a, 0).view(np.int8), [0]))
    absdiff = np.abs(np.diff(iszero))
    # Runs start and end where absdiff is 1.
    ranges = np.where(absdiff == 1)[0].reshape(-1, 2)
    return ranges

runs = zero_runs(mylist)

This gives the output:

[0,10]
[11,12]
...

This basically just lists the indices of every run of duplicates. How would I narrow this down to only the data I need?

Dan*_*ejo 5

You can use itertools.groupby, which identifies groups of consecutive equal elements in a list:

from itertools import groupby

lst = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

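# Collapse the list into (value, run_length) pairs, one per consecutive run.
groups = [(k, sum(1 for _ in g)) for k, g in groupby(lst)]

cursor = 0  # index in lst where the current run starts
result = []
for k, l in groups:
    # Keep only runs of zeros (k == 0) that are at least five long.
    if not k and l >= 5:
        result.append([cursor, cursor + l - 1])  # inclusive [start, end]
    cursor += l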

print(result)

Output

[[17, 21], [30, 35]]
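Since the question is tagged numpy, the same filtering can also be done on top of the padded-diff idea from the question. The following is only a sketch, not part of the original answer: the function name long_zero_runs and the min_length parameter are made up for illustration. It uses the signed difference to tell run starts (+1) from run ends (-1), keeps only runs of at least min_length, and converts the exclusive end to an inclusive index.

```python
import numpy as np

def long_zero_runs(values, min_length=5):
    """Return inclusive [start, end] index pairs for runs of zeros
    that are at least min_length long (hypothetical helper)."""
    a = np.asarray(values)
    # 1 where the value is zero; pad with a 0 on each side so that
    # runs touching either end of the input are still detected.
    is_zero = np.concatenate(([0], (a == 0).astype(np.int8), [0]))
    diff = np.diff(is_zero)
    starts = np.flatnonzero(diff == 1)   # first index of each run
    ends = np.flatnonzero(diff == -1)    # one past the last index of each run
    keep = (ends - starts) >= min_length
    # Subtract 1 to make the end index inclusive.
    return np.column_stack((starts[keep], ends[keep] - 1)).tolist()

mylist = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
          0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0,
          1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

print(long_zero_runs(mylist))  # [[17, 21], [30, 35]]
```

For short lists the groupby version is probably simpler to read; the vectorized form mainly pays off when the input is already a large NumPy array.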