Most Pythonic way to remove tuples from a list if the first element is a duplicate

Dav*_*han 2 python tuples list

My code so far is very ugly:

orig = [(1,2),(1,3),(2,3),(3,3)]
previous_elem = []
unique_tuples = []
for tuple in orig:
    if tuple[0] not in previous_elem:
        unique_tuples += [tuple]
    previous_elem += [tuple[0]]
assert unique_tuples == [(1,2),(2,3),(3,3)]

There must be a more Pythonic solution.

mir*_*ulo 8

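If you don't care which tuple you get back for duplicates, you can always convert your list to a dictionary and back (the last tuple for each key wins):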

>>> orig = [(1,2),(1,3),(2,3),(3,3)]
>>> list(dict(orig).items())
[(1, 3), (2, 3), (3, 3)]

If you want to keep the first tuple encountered for each key, you can reverse your list twice and use an OrderedDict, like this:

>>> from collections import OrderedDict
>>> orig = [(1,2),(1,3),(2,3),(3,3)]
>>> list(OrderedDict(orig[::-1]).items())[::-1]
[(1, 2), (2, 3), (3, 3)]

These aren't the most efficient solutions (if that matters a great deal), but they do make for nice idiomatic one-liners.
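As a minimal sketch of the same "first tuple wins" idea without the double reversal: assuming Python 3.7 or later, where a plain dict preserves insertion order, `setdefault` only stores a value the first time a key is seen. The function name `first_by_key` is hypothetical and not part of the original answer, and like the dict-based one-liners it only applies to 2-tuples with hashable first elements.

```python
def first_by_key(pairs):
    """Keep the first (key, value) pair seen for each key, in input order.

    Assumes Python 3.7+ (dicts preserve insertion order) and 2-tuples.
    """
    first = {}
    for key, value in pairs:
        # setdefault leaves an existing entry untouched, so the first value wins
        first.setdefault(key, value)
    return list(first.items())


orig = [(1, 2), (1, 3), (2, 3), (3, 3)]
print(first_by_key(orig))  # [(1, 2), (2, 3), (3, 3)]
```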


Some benchmarking

Note the difference in speed: if you don't care which tuple you get back, the first option is much more efficient:

>>> import timeit
>>> setup = '''
orig = [(1,2),(1,3),(2,3),(3,3)]
'''
>>> print (min(timeit.Timer('(list(dict(orig).items()))', setup=setup).repeat(7, 1000)))
0.0015771419037069459

Compared to:

>>> setup = '''
orig = [(1,2),(1,3),(2,3),(3,3)]
from collections import OrderedDict
'''
>>> print (min(timeit.Timer('(list(OrderedDict(orig[::-1]).items())[::-1])', 
             setup=setup).repeat(7, 1000)))
0.024554947372323

According to these speed tests, the first option is roughly 15 times faster.

However, as was pointed out, Saksham's answer is also O(n) and handily beats these dictionary approaches:

>>> setup = '''
orig = [(1,2),(1,3),(2,3),(3,3)]
newlist = []
seen = set()
def fun():
    for (a, b) in orig:
        if not a in seen:
            newlist.append((a, b))
            seen.add(a)
    return newlist
'''
>>> print (min(timeit.Timer('fun()', setup=setup).repeat(7, 1000)))
0.0004833390384996095
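One caveat about the last timing: `newlist` and `seen` are created once in the setup string, so only the first call to `fun()` actually filters anything. Every later call finds each key already in `seen`, which probably makes the figure look better than a fresh run would. A hedged sketch of a fairer measurement, with the state moved inside the function, might look like this (the name `dedupe_by_first` is hypothetical, and no timing is quoted because it will vary by machine):

```python
import timeit


def dedupe_by_first(pairs):
    """Keep the first tuple seen for each distinct first element."""
    seen = set()   # first elements already encountered
    result = []    # tuples kept, in original order
    for item in pairs:
        key = item[0]
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


orig = [(1, 2), (1, 3), (2, 3), (3, 3)]

# Best of 7 runs of 1000 calls each, mirroring the benchmarks above
best = min(timeit.repeat(lambda: dedupe_by_first(orig), repeat=7, number=1000))
print(best)
```

Because `seen` and `result` are rebuilt on every call, each timed iteration does the full deduplication work, and the function also handles tuples of any length.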