Splitting a Python list into a list of overlapping chunks

efi*_*ida 10 python

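This question is similar to "Slicing a list into a list of sub-lists", but in my case I want the last element of each sublist to be included as the first element of the next sublist. It must also be taken into account that the last sublist always has at least two elements.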

For example:

list_ = ['a','b','c','d','e','f','g','h']

Result for sublists of size 3:

resultant_list = [['a','b','c'],['c','d','e'],['e','f','g'],['g','h']]

wim*_*wim 14

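The list comprehension in the answer you linked is easily adapted to support overlapping chunks: simply shorten the "step" argument passed to range, so that each chunk starts n-m items after the previous one.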

>>> list_ = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> n = 3  # group size
>>> m = 1  # overlap size
>>> [list_[i:i+n] for i in range(0, len(list_), n-m)]
[['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
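As a rough sketch, the comprehension above can be wrapped in a small function that also checks its arguments. The name `overlapping_chunks` is hypothetical, not something from the linked answer. The check matters because `range` raises `ValueError` when the step is 0 (m equal to n), and a negative step (m greater than n) silently yields an empty list.

```python
def overlapping_chunks(seq, n, m):
    """Split seq into chunks of length n, each sharing m items with the previous one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < n")
    step = n - m  # distance between consecutive chunk starts
    return [seq[i:i + n] for i in range(0, len(seq), step)]


letters = list("abcdefgh")
starts = list(range(0, len(letters), 3 - 1))  # chunk start indices
print(starts)
print(overlapping_chunks(letters, 3, 1))
```

With n = 3 and m = 1 the chunks start at indices 0, 2, 4 and 6, which is why each chunk begins with the last element of the previous one.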

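Other visitors to this question may not have the luxury of working with an input list (sliceable, known length, finite). Here is a generator-based solution that works with arbitrary iterables: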

from collections import deque

def chunks(iterable, chunk_size=3, overlap=0):
    # we'll use a deque to hold the values because it automatically
    # discards any extraneous elements if it grows too large
    if chunk_size < 1:
        raise Exception("chunk size too small")
    if overlap >= chunk_size:
        raise Exception("overlap too large")
    queue = deque(maxlen=chunk_size)
    it = iter(iterable)
    i = 0
    try:
        # start by filling the queue with the first group
        for i in range(chunk_size):
            queue.append(next(it))
        while True:
            yield tuple(queue)
            # after yielding a chunk, get enough elements for the next chunk
            for i in range(chunk_size - overlap):
                queue.append(next(it))
    except StopIteration:
        # if the iterator is exhausted, yield any remaining elements
        i += overlap
        if i > 0:
            yield tuple(queue)[-i:]

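Note: this implementation is copied from `wimpy.util.chunks`. If you don't mind adding a dependency, you can `pip install wimpy` and use `from wimpy import chunks` rather than copy-pasting the code.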
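For comparison, here is a shorter sketch of the same idea built on `itertools.islice`. It is not the wimpy implementation: the name `iter_overlapping` is made up for illustration, it yields lists rather than tuples, and it emits a chunk only when that chunk contains at least one item not seen in the previous chunk.

```python
from itertools import islice


def iter_overlapping(iterable, n=3, m=1):
    """Yield lists of up to n items; consecutive lists share m items."""
    if n < 1 or not 0 <= m < n:
        raise ValueError("need n >= 1 and 0 <= m < n")
    it = iter(iterable)
    chunk = list(islice(it, n))
    while chunk:
        yield chunk
        fresh = list(islice(it, n - m))  # items not seen in any chunk yet
        if not fresh:
            break  # nothing new left, so no trailing overlap-only chunk
        # chunk[n - m:] is the last m items of a full chunk (empty when m == 0)
        chunk = chunk[n - m:] + fresh


print(list(iter_overlapping(iter("abcdefgh"), 3, 1)))
```

Because it only calls `next` through `islice`, this works on generators and other one-shot iterators, not just lists.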

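  • This approach can leave an unwanted "stub": for example, if you run the first example on `['a', 'b', 'c', 'd', 'e', 'f', 'g']`, it produces `[['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g']]`. To avoid such unwanted chunks, which contain only elements already captured in the previous chunk, subtract the overlap size `m` from the list length when computing the range, i.e. `[list_[i:i+n] for i in range(0, len(list_)-m, n-m)]` (3 upvotes)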
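The adjustment described in this comment can be sketched as a function. The name `chunks_without_stub` and the handling of very short inputs are my own assumptions, not part of the comment. Stopping the range `m` items early means every chunk starts at least `m + 1` items before the end, so with `m = 1` the last chunk always has at least two elements whenever the input has at least two.

```python
def chunks_without_stub(seq, n, m):
    """Overlapping chunks of seq, omitting a final chunk made only of overlap items."""
    if n < 1 or not 0 <= m < n:
        raise ValueError("need n >= 1 and 0 <= m < n")
    # A chunk starting within the last m items would only repeat items the
    # previous chunk already holds, so end the range m items early.
    # max(..., 1) still returns one chunk when seq has m items or fewer.
    stop = max(len(seq) - m, 1) if len(seq) else 0
    return [seq[i:i + n] for i in range(0, stop, n - m)]


print(chunks_without_stub(list("abcdefg"), 3, 1))
print(chunks_without_stub(list("abcdefgh"), 3, 1))
```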

pyl*_*ang 8

more_itertools has a windowing tool for overlapping iteration.

Given

import more_itertools as mit


iterable = list("abcdefgh")
iterable
# ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Code

windows = list(mit.windowed(iterable, n=3, step=2))
windows
# [('a', 'b', 'c'), ('c', 'd', 'e'), ('e', 'f', 'g'), ('g', 'h', None)]

If needed, you can drop the `None` fill values by filtering the windows:

[list(filter(None, w)) for w in windows]
# [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
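One caveat worth checking against your own data: `filter(None, w)` removes every falsy item, not only `None`, so values such as `0` or `''` are dropped as well. The sketch below uses a hand-written list of tuples as a stand-in for `windowed` output (so it runs without more_itertools) and compares the two ways of stripping the padding.

```python
# Hand-written stand-in for windowed(range(4), n=3, step=2) output
windows = [(0, 1, 2), (2, 3, None)]

# filter(None, ...) keeps only truthy items, so the legitimate 0 is lost
truthy_only = [list(filter(None, w)) for w in windows]
print(truthy_only)

# An identity test removes the None padding and nothing else
padding_stripped = [[x for x in w if x is not None] for w in windows]
print(padding_stripped)
```

If `None` can itself be a real value in your data, `windowed` also accepts a `fillvalue` argument, so a dedicated sentinel object could be used as padding and compared with `is not` in the same way.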

For details, see also the more_itertools documentation for `more_itertools.windowed`.