Length of a finite generator

dan*_*ast 7 python generator

I have these two implementations for computing the length of a finite generator while keeping the data for further processing:

import itertools

def count_generator1(generator):
    '''- build a list with the generator data
       - get the length of the data
       - return both the length and the original data (in a list)
       WARNING: the memory use is unbounded, and infinite generators will block this'''
    l = list(generator)
    return len(l), l

def count_generator2(generator):
    '''- get two generators from the original generator
       - get the length of the data from one of them
       - return both the length and the original data, as returned by tee
       WARNING: tee can use up an unbounded amount of memory, and infinite generators will block this'''
    for_length, saved = itertools.tee(generator, 2)
    return sum(1 for _ in for_length), saved

Both have drawbacks, and both do the job. Could someone comment on them, or even offer a better alternative?

Gar*_*tty 13

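If you have to do this, the first approach is much better: because you consume all the values to count them, itertools.tee() has to store every value anyway, which means a list will be more efficient.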

To quote the docs:

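This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored). In general, if one iterator uses most or all of the data before another iterator starts, it is faster to use list() instead of tee().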
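As a rough illustration of the point above, here is a minimal sketch that runs both approaches on a small generator. The names count_with_list and count_with_tee, and the squares example, are made up for this sketch. Both return the same length and the same data; the difference is that the tee() branch has already buffered every item internally by the time the count finishes, and what you get back is a one-shot iterator rather than a reusable list.

```python
import itertools

def count_with_list(iterable):
    """Materialize the data once and return (length, list)."""
    data = list(iterable)
    return len(data), data

def count_with_tee(iterable):
    """Count by exhausting one tee branch; return (length, other branch).

    Exhausting for_length forces tee to buffer every item for saved,
    so nothing is gained in memory compared with a plain list.
    """
    for_length, saved = itertools.tee(iterable, 2)
    return sum(1 for _ in for_length), saved

# Hypothetical sample data: the first five squares.
list_length, list_data = count_with_list(n * n for n in range(5))

tee_length, tee_saved = count_with_tee(n * n for n in range(5))
tee_data = list(tee_saved)       # the saved branch can only be consumed once
tee_leftover = list(tee_saved)   # a second pass finds it exhausted

print(list_length, list_data)
print(tee_length, tee_data, tee_leftover)
```

If the data really has to be kept, the list version also has the advantage that it can be iterated as many times as needed.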