What is the most efficient way to search nested lists in Python?

fds*_*dsa 8 python

I have a list that contains nested lists, and I need to know the most efficient way to search within those nested lists.

For example, if I have

[['a','b','c'],
['d','e','f']]

If I have to search the entire list above, what is the most efficient way to find 'd'?

Joh*_*ooy 11

>>> lis=[['a','b','c'],['d','e','f']]
>>> any('d' in x for x in lis)
True
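For readers unfamiliar with any, the one-liner above behaves roughly like the explicit loop below. This is only a minimal sketch: the function name contains_nested is made up for illustration and does not appear in the original answer.

```python
def contains_nested(nested, target):
    """Return True if target is in any sublist of nested (hypothetical helper)."""
    for sublist in nested:
        # `in` on a list is a linear scan of that sublist
        if target in sublist:
            return True  # stop at the first match, just as any() does
    return False


lis = [['a', 'b', 'c'], ['d', 'e', 'f']]
print(contains_nested(lis, 'd'))  # same answer as any('d' in x for x in lis)
print(contains_nested(lis, 'z'))
```

Either form stops scanning as soon as a sublist containing the target is found.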

Generator expression using any

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "any('d' in x for x in lis)" 
1000000 loops, best of 3: 1.32 usec per loop

Generator expression

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in (y for x in lis for y in x)"
100000 loops, best of 3: 1.56 usec per loop

List comprehension

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" "'d' in [y for x in lis for y in x]"
100000 loops, best of 3: 3.23 usec per loop

What if the item is near the end, or not present at all? any is still faster than the list comprehension

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" \
    "'NOT THERE' in [y for x in lis for y in x]"
100000 loops, best of 3: 4.4 usec per loop

$ python -m timeit -s "lis=[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" \
    "any('NOT THERE' in x for x in lis)"
100000 loops, best of 3: 3.06 usec per loop

What if the list is 1000 times longer? any is still faster

$ python -m timeit -s "lis=1000*[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" \
    "'NOT THERE' in [y for x in lis for y in x]"
100 loops, best of 3: 3.74 msec per loop
$ python -m timeit -s "lis=1000*[['a','b','c'],['d','e','f'],[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15],[16,17,18]]" \
    "any('NOT THERE' in x for x in lis)"
100 loops, best of 3: 2.48 msec per loop

We know that generators take a while to set up, so the best chance for the list comprehension (LC) to win is a very short list

$ python -m timeit -s "lis=[['a','b','c']]" \
    "any('c' in x for x in lis)"
1000000 loops, best of 3: 1.12 usec per loop
$ python -m timeit -s "lis=[['a','b','c']]" \
    "'c' in [y for x in lis for y in x]"
1000000 loops, best of 3: 0.611 usec per loop
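The command-line timings above can also be reproduced from inside Python with the timeit module. The following is a sketch, not the script used for the numbers in this answer: the labels, the loop count of 10000, and the results dictionary are arbitrary choices made for illustration, and absolute timings will differ by machine and Python version.

```python
import timeit

lis = [['a', 'b', 'c'], ['d', 'e', 'f'], [1, 2, 3], [4, 5, 6],
       [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]]

# The three search strategies compared in this answer (worst case: no match)
statements = {
    "any": "any('NOT THERE' in x for x in lis)",
    "generator": "'NOT THERE' in (y for x in lis for y in x)",
    "list comprehension": "'NOT THERE' in [y for x in lis for y in x]",
}

loops = 10000  # arbitrary loop count chosen for this sketch
results = {}
for label, stmt in statements.items():
    # best of 3 repeats, converted to microseconds per loop
    best = min(timeit.repeat(stmt, globals={"lis": lis}, repeat=3, number=loops))
    results[label] = best / loops * 1e6
    print(f"{label}: {results[label]:.3f} usec per loop")
```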

And any uses less memory
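One way to see why: any pulls results from the generator one sublist at a time and stops at the first hit, so no flattened copy of the data is ever built. The sketch below makes that visible by counting membership tests; CountingList is a hypothetical helper class written only for this illustration.

```python
class CountingList(list):
    """A list that counts how often it is searched with `in` (illustration only)."""
    searches = 0

    def __contains__(self, item):
        CountingList.searches += 1
        return super().__contains__(item)


lis = [CountingList(['a', 'b', 'c']), CountingList(['d', 'e', 'f']),
       CountingList([1, 2, 3]), CountingList([4, 5, 6])]

found = any('d' in x for x in lis)
print(found, CountingList.searches)  # prints "True 2": the last two sublists are never touched
```

The list comprehension version, by contrast, has to copy every element into a new list before the in test can start.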


Lev*_*von 5

Using a list comprehension, given:

mylist = [['a','b','c'],['d','e','f']]
'd' in [j for i in mylist for j in i]

yields:

True

This can also be done with a generator (as shown by @AshwiniChaudhary)
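As a rough sketch of what the generator form might look like (the referenced answer is not reproduced here, so this is an assumption about its shape), the square brackets are replaced with parentheses. The itertools.chain.from_iterable variant is an alternative added for comparison and is not taken from any of the answers; the names found_gen and found_chain are illustrative.

```python
from itertools import chain

mylist = [['a', 'b', 'c'], ['d', 'e', 'f']]

# Generator expression: elements are produced one at a time, no new list is built
found_gen = 'd' in (elem for sublist in mylist for elem in sublist)

# chain.from_iterable also flattens one level lazily
found_chain = 'd' in chain.from_iterable(mylist)

print(found_gen, found_chain)
```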

Update based on the comment below:

Here is the same list comprehension, but with more descriptive variable names:

'd' in [elem for sublist in mylist for elem in sublist]

The looping construct in the list comprehension is equivalent to

result = []
for sublist in mylist:
    for elem in sublist:
        result.append(elem)  # collect every element into one flat list

and produces a list that can then be tested for 'd' with the in operator.