XZS*_*XZS 10 python functional-programming python-itertools
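I'm looking for a middle ground between Python's zip and zip_longest functions (the latter from the itertools module): something that exhausts all the given iterators but does not fill in anything. For example, it should transpose tuples like this: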
(11, 12, 13    ),        (11, 21, 31, 41),
(21, 22, 23, 24),  -->   (12, 22, 32, 42),
(31, 32        ),        (13, 23,     43),
(41, 42, 43, 44),        (    24,     44)
(Spaces added for better visual alignment.)
I managed to come up with a crude solution by cleaning out the fill values after zip_longest.
from functools import partial
from itertools import zip_longest
from operator import is_not

def zip_discard(*iterables, sentinel=object()):
    return map(
        partial(filter, partial(is_not, sentinel)),
        zip_longest(*iterables, fillvalue=sentinel))
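As a quick check of the sentinel approach, the sketch below runs zip_discard on the example data. The second call is an added illustration, not part of the original post: since the sentinel is a fresh object(), a None inside the data should be kept rather than mistaken for padding.

```python
from functools import partial
from itertools import zip_longest
from operator import is_not


def zip_discard(*iterables, sentinel=object()):
    # Pad short inputs with a private sentinel, then filter it out of each row.
    return map(
        partial(filter, partial(is_not, sentinel)),
        zip_longest(*iterables, fillvalue=sentinel))


data = [(11, 12, 13), (21, 22, 23, 24), (31, 32), (41, 42, 43, 44)]

# Each row is a lazy filter object, so materialize it to look at it.
rows = [tuple(row) for row in zip_discard(*data)]
print(rows)

# Illustration only: None is ordinary data here, not a fill value.
with_none = [tuple(row) for row in zip_discard([1, None], [2])]
print(with_none)
```

Note that both the outer map and the inner filter objects are one-shot iterators, so each can be consumed only once.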
Is there a way to do this without introducing a sentinel? Could it be improved using yield? Which approach is the most efficient?
Your approach is good. I think using a sentinel is elegant. I would consider a nested generator expression a bit more Pythonic:
from itertools import zip_longest

def zip_discard_gen(*iterables, sentinel=object()):
    return ((entry for entry in iterable if entry is not sentinel)
            for iterable in zip_longest(*iterables, fillvalue=sentinel))
This needs fewer imports because neither partial() nor is_not() is required.
It is also a bit faster:
data = [(11, 12, 13),
        (21, 22, 23, 24),
        (31, 32),
        (41, 42, 43, 44)]
%timeit [list(x) for x in zip_discard(*data)]
10000 loops, best of 3: 17.5 µs per loop
%timeit [list(x) for x in zip_discard_gen(*data)]
100000 loops, best of 3: 14.2 µs per loop
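The %timeit lines above are IPython magics. Outside IPython, a rough equivalent with the standard timeit module might look like the sketch below. The helper name best_time_us is made up for this illustration, and the numbers it prints will differ from machine to machine, so no output is shown.

```python
import timeit
from functools import partial
from itertools import zip_longest
from operator import is_not


def zip_discard(*iterables, sentinel=object()):
    return map(
        partial(filter, partial(is_not, sentinel)),
        zip_longest(*iterables, fillvalue=sentinel))


def zip_discard_gen(*iterables, sentinel=object()):
    return ((entry for entry in iterable if entry is not sentinel)
            for iterable in zip_longest(*iterables, fillvalue=sentinel))


data = [(11, 12, 13), (21, 22, 23, 24), (31, 32), (41, 42, 43, 44)]


def best_time_us(func, number=10_000, repeat=3):
    # Hypothetical helper: best of `repeat` runs, in microseconds per call.
    totals = timeit.repeat(
        lambda: [list(x) for x in func(*data)],
        number=number, repeat=repeat)
    return min(totals) / number * 1e6


for func in (zip_discard, zip_discard_gen):
    print(f"{func.__name__}: {best_time_us(func):.2f} us per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports.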
Edit
A list comprehension version is faster still:
def zip_discard_compr(*iterables, sentinel=object()):
    return [[entry for entry in iterable if entry is not sentinel]
            for iterable in zip_longest(*iterables, fillvalue=sentinel)]
Timing:
%timeit zip_discard_compr(*data)
100000 loops, best of 3: 6.73 µs per loop
Python 2 version:
from itertools import izip_longest

SENTINEL = object()

def zip_discard_compr(*iterables):
    sentinel = SENTINEL
    return [[entry for entry in iterable if entry is not sentinel]
            for iterable in izip_longest(*iterables, fillvalue=sentinel)]
This version returns the same data structure as zip_varlen by Tadhg McDonald-Jensen:
def zip_discard_gen(*iterables, sentinel=object()):
    return (tuple([entry for entry in iterable if entry is not sentinel])
            for iterable in zip_longest(*iterables, fillvalue=sentinel))
It is about twice as fast:
%timeit list(zip_discard_gen(*data))
100000 loops, best of 3: 9.37 µs per loop
%timeit list(zip_varlen(*data))
10000 loops, best of 3: 18 µs per loop
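Both zip and zip_longest are designed to always produce tuples of equal length. You can define your own generator that does not care about the length, with something like this: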
def _one_pass(iters):
    for it in iters:
        try:
            yield next(it)
        except StopIteration:
            pass  # if some of them are already exhausted, ignore them

def zip_varlen(*iterables):
    iters = [iter(it) for it in iterables]
    while True:  # broken when an empty tuple is given by _one_pass
        val = tuple(_one_pass(iters))
        if val:
            yield val
        else:
            break
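If the data being zipped together is fairly large, skipping the exhausted iterators on every pass can be expensive. It may be more efficient to remove finished iterators from iters inside the _one_pass function, like this: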
def _one_pass(iters):
    i = 0
    while i < len(iters):
        try:
            yield next(iters[i])
        except StopIteration:
            del iters[i]
        else:
            i += 1
Both versions avoid creating intermediate results or using temporary fill values.
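For reference, here is a self-contained sketch that pairs zip_varlen with the second _one_pass (the one that deletes finished iterators) and runs it on the data from the question. The second call is an added illustration: since no fill value is involved, falsy items such as None or 0 should pass through untouched.

```python
def _one_pass(iters):
    # Yield one item from each live iterator; drop exhausted ones in place.
    i = 0
    while i < len(iters):
        try:
            yield next(iters[i])
        except StopIteration:
            del iters[i]  # the next iterator slides into position i
        else:
            i += 1


def zip_varlen(*iterables):
    iters = [iter(it) for it in iterables]
    while True:
        val = tuple(_one_pass(iters))
        if not val:  # an empty tuple means every iterator is exhausted
            break
        yield val


data = [(11, 12, 13), (21, 22, 23, 24), (31, 32), (41, 42, 43, 44)]
result = list(zip_varlen(*data))
print(result)

# Illustration only: None and 0 are data, not padding.
mixed = list(zip_varlen(iter([None, 0]), "a"))
print(mixed)
```

The emptiness test applies to the whole tuple, not to its items, which is why a row like (0,) is still yielded.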