Abh*_*jit 7 python list-comprehension python-itertools
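While answering the question "Clunky calculation of differences between an incremental set of numbers, is there a more beautiful way?", I came up with two solutions, one using a list comprehension and the other using itertools.starmap.
To me, the list comprehension syntax looks clearer, more readable, less verbose, and more Pythonic. Still, starmap is available in itertools, so I assume there must be a reason for it.
My question: when should starmap be preferred over a list comprehension (LC)?
Note: if it is purely a matter of style, then it surely contradicts "There should be one-- and preferably only one --obvious way to do it."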
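Head-to-head comparison
Readability counts. --- LC
This is again a matter of perception, but to me the LC is more readable than starmap. To use starmap you need to import operator, or define a lambda or an explicit multi-variable function, and on top of that add an extra import from itertools.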
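To make the comparison concrete, here is a minimal sketch written for Python 3 (where the built-in zip replaces izip). The sample data and the helper name difference are made up for illustration; they are not from the original question.

```python
from itertools import starmap
from operator import sub

nums = [1, 4, 9, 16, 25]           # illustrative data
pairs = list(zip(nums[1:], nums))  # (next, previous) pairs

# List comprehension: no imports, the subtraction is written inline.
deltas_comp = [x - y for x, y in pairs]

# starmap needs a two-argument callable; three ways to supply one.
deltas_operator = list(starmap(sub, pairs))               # import from operator
deltas_lambda = list(starmap(lambda x, y: x - y, pairs))  # inline lambda

def difference(x, y):  # hypothetical explicit helper
    return x - y

deltas_named = list(starmap(difference, pairs))

print(deltas_comp)  # [3, 5, 7, 9]
```

All four variants produce the same list; the difference is only in how much surrounding machinery each one needs.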
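Performance --- LC
>>> # Python 2 session: izip and the print statement are 2.x only
>>> import random
>>> from timeit import Timer
>>> from itertools import starmap, izip
>>> from operator import sub
>>> def using_star_map(nums):
...     delta = starmap(sub, izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)
>>> def using_LC(nums):
...     # despite the name, this is a generator expression
...     delta = (x - y for x, y in izip(nums[1:], nums))
...     return sum(delta) / float(len(nums) - 1)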
>>> nums=[random.randint(1,10) for _ in range(100000)]
>>> t1=Timer(stmt='using_star_map(nums)',setup='from __main__ import nums,using_star_map;from itertools import starmap,izip')
>>> t2=Timer(stmt='using_LC(nums)',setup='from __main__ import nums,using_LC;from itertools import izip')
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
235.03 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
181.87 usec/pass
Gar*_*tty 12
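The distinction I usually see is that map()/starmap() are most appropriate when you are literally just calling a function on every item in a list. In that case they are a little clearer: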
(f(x) for x in y)
map(f, y) # itertools.imap(f, y) in 2.x
(f(*x) for x in y)
starmap(f, y)
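As soon as you need to throw in a lambda or a filter as well, you should switch to a list comprehension or generator expression. But when it is a single function, the generator expression or list comprehension syntax feels very verbose.
The two are interchangeable. When in doubt, stick with the generator expression, since it is generally more readable, but in simple cases (map(int, strings), starmap(Vector, points)) using map()/starmap() can sometimes make things easier to read.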
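The rule of thumb above can be sketched as follows, assuming Python 3 (where map() is lazy, like itertools.imap in 2.x). The data and variable names are illustrative only.

```python
from itertools import starmap

strings = ["10", "20", "0"]
pairs = [(2, 3), (5, 2), (7, 0)]

# Single existing function: map()/starmap() and the comprehension are equivalent.
ints_map = list(map(int, strings))
ints_comp = [int(s) for s in strings]
powers_starmap = list(starmap(pow, pairs))
powers_comp = [pow(*p) for p in pairs]

# Transform plus condition: map() needs lambda and filter(),
# while the comprehension expresses both inline.
numbers = [1, 2, 3, 4]
doubled_evens_map = list(map(lambda n: n * 2, filter(lambda n: n % 2 == 0, numbers)))
doubled_evens_comp = [n * 2 for n in numbers if n % 2 == 0]

print(ints_map, powers_starmap, doubled_evens_comp)
# [10, 20, 0] [8, 25, 1] [4, 8]
```

In the first group the map()/starmap() calls are arguably the shorter read; in the second, the comprehension avoids two lambdas and a nested call.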
An example where I think starmap() is more readable:
from collections import namedtuple
from itertools import starmap
points = [(10, 20), (20, 10), (0, 0), (20, 20)]
Vector = namedtuple("Vector", ["x", "y"])
for vector in (Vector(*point) for point in points):
    ...
for vector in starmap(Vector, points):
    ...
And for map():
values = ["10", "20", "0"]
for number in (int(x) for x in values):
    ...
for number in map(int, values):
    ...
Performance:
python -m timeit -s "from itertools import starmap" -s "from operator import sub" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(sub, numbers))"
1000000 loops, best of 3: 0.258 usec per loop
python -m timeit -s "numbers = zip(range(100000), range(100000))" "sum(x-y for x, y in numbers)"
1000000 loops, best of 3: 0.446 usec per loop
For constructing a namedtuple:
python -m timeit -s "from itertools import starmap" -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "list(starmap(Vector, numbers))"
1000000 loops, best of 3: 0.98 usec per loop
python -m timeit -s "from collections import namedtuple" -s "numbers = zip(range(100000), reversed(range(100000)))" -s "Vector = namedtuple('Vector', ['x', 'y'])" "[Vector(*pos) for pos in numbers]"
1000000 loops, best of 3: 0.375 usec per loop
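In my tests, when we are talking about simple functions (no lambda), starmap() is faster than the equivalent generator expression. Naturally, performance should take a back seat to readability unless it is a proven bottleneck.
An example of how a lambda kills any performance gain: the same example as in the first set, but with a lambda in place of operator.sub():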
python -m timeit -s "from itertools import starmap" -s "numbers = zip(range(100000), range(100000))" "sum(starmap(lambda x, y: x-y, numbers))"
1000000 loops, best of 3: 0.546 usec per loop
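The timings above report well under a microsecond per loop for 100,000 pairs. If they were run under Python 3, one possible explanation is that zip() returns a one-shot iterator there, so the numbers object built in the setup would be exhausted after the first loop and later loops would sum an empty sequence. The sketch below is one way to re-run the comparison under that assumption: it materializes the pairs into a list so every run sees the full data. The function names and the run count are my own choices, and no results are shown because they depend on the machine and interpreter.

```python
from itertools import starmap
from operator import sub
from timeit import timeit

# Materialize the pairs so every timed run iterates over all of them.
pairs = list(zip(range(100000), range(100000)))

def with_starmap():
    return sum(starmap(sub, pairs))

def with_genexp():
    return sum(x - y for x, y in pairs)

def with_lambda():
    return sum(starmap(lambda x, y: x - y, pairs))

# Time each variant; number=10 keeps the whole script short.
for func in (with_starmap, with_genexp, with_lambda):
    seconds = timeit(func, number=10)
    print("%-13s %.4f s for 10 runs" % (func.__name__, seconds))
```

Comparing the three printed lines on your own interpreter is more reliable than reading too much into any single set of published numbers.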
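This is largely a matter of style. Choose whichever you find more readable.
Regarding "There's only one way to do it", Sven Marnach kindly provided this quote from Guido:
"You may think this violates TOOWTDI, but as I've said before, that was a white lie (as well as a cheeky response to Perl's slogan around 2000). Being able to express intent (to human readers) often requires choosing between multiple forms that do essentially the same thing, but look different to the reader."
In a performance hotspot, you might want to choose the solution that runs fastest (which I guess in this case would be the starmap-based one).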
On performance: starmap is slower because of its argument unpacking; however, starmap is not needed here:
# Python 2 code: izip, imap, and the print statement are 2.x only
from timeit import Timer
import random
from itertools import starmap, izip, imap
from operator import sub

def using_imap(nums):
    # imap with two iterables passes one element of each to sub()
    delta = imap(sub, nums[1:], nums[:-1])
    return sum(delta) / float(len(nums) - 1)

def using_LC(nums):
    delta = (x - y for x, y in izip(nums[1:], nums))
    return sum(delta) / float(len(nums) - 1)

nums = [random.randint(1, 10) for _ in range(100000)]
t1 = Timer(stmt='using_imap(nums)', setup='from __main__ import nums,using_imap')
t2 = Timer(stmt='using_LC(nums)', setup='from __main__ import nums,using_LC')
On my computer:
>>> print "%.2f usec/pass" % (1000000 * t1.timeit(number=1000)/100000)
172.86 usec/pass
>>> print "%.2f usec/pass" % (1000000 * t2.timeit(number=1000)/100000)
178.62 usec/pass
imap comes out slightly faster, probably because it avoids the zip/unpacking step.
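For readers on Python 3, the same idea can be sketched with the built-in map(), which accepts several iterables and stops at the shortest one. This is a rough port for illustration, not the answerer's code; the function names and sample data are assumptions.

```python
from operator import sub

def average_delta_map(nums):
    # map() pulls one element from each iterable per call to sub(),
    # so no tuples are built and unpacked along the way.
    deltas = map(sub, nums[1:], nums[:-1])
    return sum(deltas) / (len(nums) - 1)

def average_delta_genexp(nums):
    # zip() builds a tuple per pair, which the generator then unpacks.
    deltas = (x - y for x, y in zip(nums[1:], nums))
    return sum(deltas) / (len(nums) - 1)

nums = [1, 4, 9, 16, 25]  # illustrative data
print(average_delta_map(nums))     # 6.0
print(average_delta_genexp(nums))  # 6.0
```

Both functions return the same value; whether the map() version is measurably faster on a given interpreter is something to check with timeit rather than assume.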