itertools.imap vs. map over the entire iterable

Joe*_*Joe 10 python python-2.x python-itertools

I'm curious about a statement at http://docs.python.org/2/library/itertools.html#itertools.imap, namely that it describes

sum(imap(operator.mul, vector1, vector2))

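as an efficient dot product. My understanding is that imap returns a generator rather than a list. I can see how that would be faster and use less memory if you only looked at the first few elements, but with the surrounding sum() consuming every element, I don't see how it behaves any differently from: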

sum(map(operator.mul, vector1, vector2))

Max*_*oel 20

The difference between map and imap becomes clear when you start increasing the size of what you're iterating over:

import itertools

# xrange object; takes up almost no memory, whatever its length
data = xrange(1000000000)

# Tries to build a list of 1 billion elements!
# Therefore, fails with MemoryError on 32-bit systems.
doubled = map(lambda x: x * 2, data)

# Generator object that lazily doubles each item as it's iterated over.
# Takes up very little (and constant, independent of data's size) memory.
iter_doubled = itertools.imap(lambda x: x * 2, data)

# This is where the iteration and the doubling happen.
# Again, since no list is created, this doesn't run you out of memory.
sum(iter_doubled)

# (The result is 999999999000000000L, if you're interested.
# It takes a minute or two to compute, but consumes minimal memory.)

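Note that in Python 3, the built-in map behaves like Python 2's itertools.imap (which was removed because it is no longer needed). To get the "old map" behaviour, you would use list(map(...)), which is another good way to visualize how Python 2's itertools.imap and map differ from each other.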
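As a rough illustration of that last point, here is a minimal sketch that runs on either major version. The try/except fallback and the names lazy_products and eager_products are assumptions made for this example, not part of the original answer; vector1 and vector2 are small stand-ins for the vectors in the question.

```python
import operator

try:
    from itertools import imap  # Python 2: lazy counterpart of map
except ImportError:
    imap = map  # Python 3: the built-in map is already lazy

vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# Lazy: an iterator; no product is computed until something consumes it.
lazy_products = imap(operator.mul, vector1, vector2)
is_lazy = not isinstance(lazy_products, list)

# Eager: wrapping in list() reproduces the "old map" behaviour,
# materialising every product before anything else happens.
eager_products = list(imap(operator.mul, vector1, vector2))

# sum() pulls the products from the lazy iterator one at a time.
dot = sum(lazy_products)
```

Both forms yield the same dot product; only the point at which the products are computed and stored differs.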


Bar*_*zKP 7

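The first line computes the sum by accumulating the products one by one. The second first computes the whole list of element-wise products and only then, with that entire list stored in memory, goes on to compute the sum. Hence the gain in memory complexity.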
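The difference in evaluation order can be made visible with a small tracing sketch. The helpers traced_mul and traced_sum are hypothetical and exist only for this illustration; traced_sum is a plain loop standing in for the built-in sum() so that each addition can be logged. The fallback import is an assumption that lets the same code run on Python 3.

```python
try:
    from itertools import imap  # Python 2
except ImportError:
    imap = map  # Python 3: the built-in map is already lazy


def traced_mul(log):
    """Return a multiply function that records each call in log."""
    def mul(a, b):
        log.append("mul")
        return a * b
    return mul


def traced_sum(iterable, log):
    """Sum the items like sum() does, recording each addition in log."""
    total = 0
    for item in iterable:
        log.append("add")
        total += item
    return total


vector1 = [1, 2, 3]
vector2 = [4, 5, 6]

# Lazy version: each product is computed only when the sum asks for it,
# so multiplications and additions alternate.
lazy_log = []
lazy_total = traced_sum(imap(traced_mul(lazy_log), vector1, vector2), lazy_log)

# Eager version: list() forces every product first (as Python 2's map does),
# and the additions start only after the whole list exists.
eager_log = []
eager_total = traced_sum(list(imap(traced_mul(eager_log), vector1, vector2)), eager_log)
```

In the lazy log the entries alternate between "mul" and "add", while in the eager log all the "mul" entries come before the first "add". That intermediate list is what costs memory proportional to the vector length.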