Jay*_*shi 7 python multithreading yield
Is it possible to use yield in a function passed to map?
For proof-of-concept purposes, I created a sample snippet.
# Python 3 (Win10)
from concurrent.futures import ThreadPoolExecutor
import os

def read_sample(sample):
    with open(os.path.join('samples', sample)) as fff:
        for _ in range(10):
            yield str(fff.read())

def main():
    with ThreadPoolExecutor(10) as exc:
        files = os.listdir('samples')
        files = list(exc.map(read_sample, files))
        print(str(len(files)), end="\r")

if __name__ == "__main__":
    main()
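I have 100 files in the samples folder. According to the snippet, 100*10=1000 should be printed. However, it prints only 100. When I inspected the result, it contained only generator objects.
What should I change so that 1000 is printed?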
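You can use map() with a generator function, but it will only map each input to a generator object; it will not descend into the generators themselves. Calling a generator function returns a generator object immediately, without running its body, so you get one object per file.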
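A minimal sketch of why the count stays at one result per input. The names `gen` and `calls` are hypothetical and exist only for this illustration; `calls` records when a generator body actually starts running.

```python
from concurrent.futures import ThreadPoolExecutor

calls = []  # hypothetical log: filled only when a generator body starts


def gen(n):
    calls.append(n)  # runs on first iteration, not when gen(n) is called
    for i in range(n):
        yield i


with ThreadPoolExecutor(4) as exc:
    # Each worker only calls gen(n), which returns a generator object.
    results = list(exc.map(gen, [3, 3, 3]))

calls_before_iteration = list(calls)  # snapshot taken before any iteration

# Iterating the generators is what finally executes their bodies.
values = [v for g in results for v in g]
```

After the map call there are three generator objects and `calls_before_iteration` is still empty; the values appear only once the generators are iterated.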
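One possible approach is to let a generator do the looping the way you want and have the mapped function operate on the objects it yields. This has the added advantage of separating the looping from the computation more cleanly. So, something like this should work: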
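# Python 3 (Win10)
from concurrent.futures import ThreadPoolExecutor
import os

def read_samples(samples):
    # Yield each path 10 times. Paths are yielded instead of open file
    # objects because map() consumes this generator up front, so a file
    # opened here could already be closed when a worker reads it.
    for sample in samples:
        path = os.path.join('samples', sample)
        for _ in range(10):
            yield path

def read_file(path):
    # Each worker opens, reads and closes its own file.
    with open(path) as fff:
        return str(fff.read())

def main():
    with ThreadPoolExecutor(10) as exc:
        files = os.listdir('samples')
        files = list(exc.map(read_file, read_samples(files)))
        print(str(len(files)), end="\r")

if __name__ == "__main__":
    main()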
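To try the flat-generator approach without a samples folder, here is a self-contained sketch that writes a few temporary files first. It is a variant that yields paths rather than open file objects, so each worker opens and closes its own file. The helper names `repeat_paths` and `read_text` and the file names are hypothetical, and 5 files stand in for the original 100.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def repeat_paths(folder, names, times):
    # The generator does the looping: each path is yielded `times` times.
    for name in names:
        path = os.path.join(folder, name)
        for _ in range(times):
            yield path


def read_text(path):
    # The mapped function does the work on a single yielded item.
    with open(path) as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as folder:
    names = [f"sample_{i}.txt" for i in range(5)]
    for name in names:
        with open(os.path.join(folder, name), "w") as fh:
            fh.write(name)  # each file contains its own name

    with ThreadPoolExecutor(10) as exc:
        contents = list(exc.map(read_text, repeat_paths(folder, names, 10)))

print(len(contents))  # 50, i.e. 5 files * 10 reads
```

Because map() returns results in input order, the first ten entries come from the first file and the last ten from the last file.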
Another approach is to nest an additional map call to consume the generators:
# Python 3 (Win10)
from concurrent.futures import ThreadPoolExecutor
import os

def read_sample(sample):
    with open(os.path.join('samples', sample)) as fff:
        for _ in range(10):
            yield str(fff.read())

def main():
    with ThreadPoolExecutor(10) as exc:
        files = os.listdir('samples')
        # The inner map creates one generator per file; the outer map
        # runs list() on each generator in a worker thread.
        files = exc.map(list, exc.map(read_sample, files))
        files = [f for fs in files for f in fs]  # flattening the results
        print(str(len(files)), end="\r")

if __name__ == "__main__":
    main()
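A related variation, sketched here under the assumption that it is acceptable to wrap the generator function itself: a small decorator materialises each generator inside the worker, so a single map call suffices, and itertools.chain.from_iterable does the flattening. The names `as_list` and `repeat` are hypothetical, and integers stand in for file names.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps
from itertools import chain


def as_list(gen_func):
    """Wrap a generator function so that calling it returns a list."""
    @wraps(gen_func)
    def wrapper(*args, **kwargs):
        # list() drives the generator to completion in the calling thread,
        # which is a worker thread when used with exc.map.
        return list(gen_func(*args, **kwargs))
    return wrapper


@as_list
def repeat(item):
    for _ in range(10):
        yield item


with ThreadPoolExecutor(10) as exc:
    nested = exc.map(repeat, range(100))      # 100 lists of 10 items each
    flat = list(chain.from_iterable(nested))  # flatten into one list

print(len(flat))  # 1000
```

The trade-off is that each worker holds its full list in memory before returning it, which is the same cost as the nested map(list, ...) version.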
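To have something more reproducible, the essential behavior of the code can be captured with simpler examples that do not depend on files on your system: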
from concurrent.futures import ThreadPoolExecutor

def foo(n):
    for i in range(n):
        yield i

with ThreadPoolExecutor(10) as exc:
    k = 8
    x = list(exc.map(foo, range(k)))
    print(x)
# [<generator object foo at 0x7f1a853d4518>, <generator object foo at 0x7f1a852e9990>, <generator object foo at 0x7f1a852e9db0>, <generator object foo at 0x7f1a852e9a40>, <generator object foo at 0x7f1a852e9830>, <generator object foo at 0x7f1a852e98e0>, <generator object foo at 0x7f1a852e9fc0>, <generator object foo at 0x7f1a852e9e60>]

from concurrent.futures import ThreadPoolExecutor

def foos(ns):
    for n in range(ns):
        for i in range(n):
            yield i

with ThreadPoolExecutor(10) as exc:
    k = 8
    x = list(exc.map(lambda x: x ** 2, foos(k)))
    print(x)
# [0, 0, 1, 0, 1, 4, 0, 1, 4, 9, 0, 1, 4, 9, 16, 0, 1, 4, 9, 16, 25, 0, 1, 4, 9, 16, 25, 36]

from concurrent.futures import ThreadPoolExecutor

def foo(n):
    for i in range(n):
        yield i ** 2

with ThreadPoolExecutor(10) as exc:
    k = 8
    x = exc.map(list, exc.map(foo, range(k)))
    print([z for y in x for z in y])
# [0, 0, 1, 0, 1, 4, 0, 1, 4, 9, 0, 1, 4, 9, 16, 0, 1, 4, 9, 16, 25, 0, 1, 4, 9, 16, 25, 36]