Python - Is a list comprehension efficient in this case?

wol*_*ang 12 python

This is a "dirty" input list in Python:

input_list = ['  \n  ','  data1\n ','   data2\n','  \n','data3\n'.....]

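Each list element contains either whitespace with newline characters, or data followed by a newline character.

I clean it up with the following code: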

cleaned_up_list = [data.strip() for data in input_list if data.strip()]

  cleaned_up_list =   ['data1','data2','data3','data4'..]

Does Python internally call strip() twice in the list comprehension above? Or, if I care about efficiency, do I have to use a for loop so that strip() is called only once?

cleaned_up_list = []
for data in input_list:
    clean_data = data.strip()
    if clean_data:
        cleaned_up_list.append(clean_data)

Pad*_*ham 14

With your list comprehension, strip is called twice. If you want to call strip only once and keep the comprehension, use a generator expression:

input_list[:] = [x for x in (s.strip() for s in input_list) if x]

Input:

input_list = ['  \n  ','  data1\n ','   data2\n','  \n','data3\n']

Output:

 ['data1', 'data2', 'data3']
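One way to check the claim about strip being called twice is to count the calls. The sketch below is only an illustration: counting_strip is a hypothetical wrapper introduced here, not something from the question or the answer.

```python
# Hypothetical helper: wraps str.strip so that calls can be counted.
calls = 0

def counting_strip(s):
    global calls
    calls += 1
    return s.strip()

input_list = ['  \n  ', '  data1\n ', '   data2\n', '  \n', 'data3\n']

# Original comprehension: strip runs in the filter for every element,
# and again in the output expression for every element that passes.
calls = 0
twice = [counting_strip(d) for d in input_list if counting_strip(d)]
calls_original = calls

# Generator-expression version: strip runs exactly once per element.
calls = 0
once = [x for x in (counting_strip(s) for s in input_list) if x]
calls_genexp = calls

print(twice == once)   # True
print(calls_original)  # 8 (5 filter calls + 3 calls for the kept elements)
print(calls_genexp)    # 5
```

So the second call only happens for elements that survive the filter, which means the overhead grows with the share of non-empty lines.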

input_list[:] changes the original list in place, which may or may not be what you want; if you actually want to create a new list, just use cleaned_up_list = ....
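A small sketch of the difference between the two, assuming a second name (alias, made up for this example) refers to the same list:

```python
input_list = ['  \n  ', '  data1\n ', '   data2\n', '  \n', 'data3\n']
alias = input_list  # second name bound to the same list object

# Building a new list leaves the original (and the alias) untouched.
cleaned_up_list = [x for x in (s.strip() for s in input_list) if x]
print(alias is input_list, len(alias))  # True 5

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the cleaned data.
input_list[:] = cleaned_up_list
print(alias is input_list, alias)       # True ['data1', 'data2', 'data3']
```

If other code holds a reference to the list and should see the cleaned data, the slice assignment is the one to use; otherwise a new name is the safer choice.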

I have always found that using itertools.imap in Python 2 (map in Python 3) instead of the generator expression is the most efficient for larger inputs:

from itertools import imap
input_list[:] = [x for x in imap(str.strip, input_list) if x]
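The snippet above is Python 2 only, since itertools.imap does not exist in Python 3. A rough Python 3 sketch of the same idea follows. The last variant uses an assignment expression (Python 3.8+); it is not part of the original answer and is shown only as another way to call strip once inside a comprehension.

```python
input_list = ['  \n  ', '  data1\n ', '   data2\n', '  \n', 'data3\n']

# Python 3: the built-in map is lazy, so it plays the role of itertools.imap.
via_map = [x for x in map(str.strip, input_list) if x]

# The built-in filter is also lazy in Python 3 (like itertools.ifilter),
# so wrap it in list() to materialise the result.
via_filter = list(filter(None, map(str.strip, input_list)))

# Python 3.8+ only: bind the stripped value once with := and reuse it.
via_walrus = [s for data in input_list if (s := data.strip())]

print(via_map)     # ['data1', 'data2', 'data3']
print(via_filter)  # ['data1', 'data2', 'data3']
print(via_walrus)  # ['data1', 'data2', 'data3']
```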

Some timings for the different approaches:

In [17]: input_list = [choice(input_list) for _ in range(1000000)]   

In [19]: timeit filter(None, imap(str.strip, input_list))
10 loops, best of 3: 115 ms per loop

In [20]: timeit list(ifilter(None,imap(str.strip,input_list)))
10 loops, best of 3: 110 ms per loop

In [21]: timeit [x for x in imap(str.strip,input_list) if x]
10 loops, best of 3: 125 ms per loop

In [22]: timeit [x for x in (s.strip() for s in input_list) if x]  
10 loops, best of 3: 145 ms per loop

In [23]: timeit [data.strip() for data in input_list if data.strip()]
10 loops, best of 3: 160 ms per loop

In [24]: %%timeit                                                
   ....:     cleaned_up_list = []
   ....:     for data in input_list:
   ....:          clean_data = data.strip()
   ....:          if clean_data:
   ....:              cleaned_up_list.append(clean_data)
   ....: 

10 loops, best of 3: 150 ms per loop


In [25]: %%timeit                                                    
   ....:     cleaned_up_list = []
   ....:     append = cleaned_up_list.append
   ....:     for data in input_list:
   ....:          clean_data = data.strip()
   ....:          if clean_data:
   ....:              append(clean_data)
   ....: 

10 loops, best of 3: 123 ms per loop

The fastest approach is actually itertools.ifilter combined with itertools.imap, closely followed by filter with imap.

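Binding list.append to a local name once, so the function reference does not have to be re-evaluated on every iteration, is more efficient; if you are stuck with a loop and want the most efficient approach, it is a viable alternative.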
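The timings above come from a Python 2 IPython session. To repeat the comparison on Python 3, something like the following sketch could be used. The function names, the sample data and the list size are assumptions made for this example, and the numbers will depend on the machine and interpreter, so no results are claimed here.

```python
import random
import timeit

# Assumed sample data, modelled on the input in the question.
sample = ['  \n  ', '  data1\n ', '   data2\n', '  \n', 'data3\n']
random.seed(0)
input_list = [random.choice(sample) for _ in range(100_000)]

def double_strip(lines):
    return [d.strip() for d in lines if d.strip()]

def genexp(lines):
    return [x for x in (s.strip() for s in lines) if x]

def map_comp(lines):
    return [x for x in map(str.strip, lines) if x]

def filter_map(lines):
    return list(filter(None, map(str.strip, lines)))

def loop_local_append(lines):
    cleaned = []
    append = cleaned.append  # bound method looked up once, outside the loop
    for d in lines:
        s = d.strip()
        if s:
            append(s)
    return cleaned

candidates = [double_strip, genexp, map_comp, filter_map, loop_local_append]
expected = double_strip(input_list)
for func in candidates:
    assert func(input_list) == expected  # every variant must agree
    # Best of 3 repeats, each repeat running the function 5 times.
    best = min(timeit.repeat(lambda: func(input_list), number=5, repeat=3))
    print(f"{func.__name__:<18} {best / 5 * 1000:.1f} ms per call")
```

Taking the minimum over the repeats mirrors the "best of 3" figure that IPython's timeit reports in the session above.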