Behavior of Python multiprocessing.Pool.map when the list is longer than the number of processes

208*_*080 5 python multiprocessing python-multiprocessing

When the list of submitted tasks is longer than the number of processes, how are the processes assigned to those tasks?

from multiprocessing import Pool

def f(i):
    print(i)
    return i

with Pool(2) as pool:
    print(pool.map(f, [1, 2, 3, 4, 5]))

I'm running a more complex function, and execution doesn't appear to happen in order (first in, first out).

Ale*_*all 1

Here is some example code:

from multiprocessing import Pool
from time import sleep


def f(x):
    print(x)
    sleep(0.1)
    return x * x


if __name__ == '__main__':
    with Pool(2) as pool:
        print(pool.map(f, range(100)))

This prints:

0
13
1
14
2
15
3
16
4
...

If we look at the relevant source code in multiprocessing:

    def _map_async(self, func, iterable, mapper, chunksize=None, callback=None,
            error_callback=None):
        '''
        Helper function to implement map, starmap and their async counterparts.
        '''
        self._check_running()
        if not hasattr(iterable, '__len__'):
            iterable = list(iterable)

        if chunksize is None:
            chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
            if extra:
                chunksize += 1
        if len(iterable) == 0:
            chunksize = 0

        task_batches = Pool._get_tasks(func, iterable, chunksize)

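Here we have len(iterable) == 100 and len(self._pool) * 4 == 8, so chunksize, extra = divmod(100, 8) gives 12, 4. Because extra is non-zero, chunksize is bumped to 13. That is why the output shows the tasks being split into batches of 13: one worker starts on the batch beginning at 0 while the other starts on the batch beginning at 13. The list returned by pool.map is still in input order; only the order of execution (and therefore of the print calls) is interleaved.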
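
The arithmetic above can be checked without starting any processes. The following is a minimal sketch, not the library's own code: `default_chunksize` and `split_into_chunks` are hypothetical helper names that only mirror the `divmod` logic quoted from `_map_async`, assuming a pool of 2 workers and 100 items.

```python
from itertools import islice


def default_chunksize(n_items, n_workers):
    """Hypothetical helper: mirrors the default chunksize arithmetic
    in the _map_async excerpt quoted above."""
    if n_items == 0:
        return 0
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def split_into_chunks(items, chunksize):
    """Hypothetical helper: cut items into consecutive batches of
    at most chunksize elements."""
    it = iter(items)
    chunks = []
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return chunks
        chunks.append(chunk)


size = default_chunksize(100, 2)
chunks = split_into_chunks(range(100), size)

print(size)                     # 13
print(len(chunks))              # 8
print([c[0] for c in chunks])   # [0, 13, 26, 39, 52, 65, 78, 91]
```

Each worker takes a whole batch at a time, which matches the 0, 13, 1, 14, ... interleaving in the output. If execution order closer to submission order matters, `pool.map` accepts an explicit `chunksize` argument, for example `pool.map(f, range(100), chunksize=1)`; items are then handed out one at a time, although the exact interleaving between workers is still not guaranteed.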