How to retrieve values from a function run in parallel processes?

Question


Pou*_*uJa 7 python parallel-processing multiprocessing python-3.x python-multiprocessing

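The multiprocessing module is quite confusing for Python beginners, especially for those who have just migrated from MATLAB and have been made lazy by its Parallel Computing Toolbox. I have the following function, which takes about 80 seconds to run, and I want to shorten this time using Python's multiprocessing module.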

from time import time

xmax   = 100000000

start = time()
for x in range(xmax):
    y = ((x+5)**2+x-40)
    if y <= 0xf+1:
        print('Condition met at: ', y, x)
end  = time()
tt   = end-start #total time
print('Each iteration took: ', tt/xmax)
print('Total time:          ', tt)


This outputs, as expected:

Condition met at:  -15 0
Condition met at:  -3 1
Condition met at:  11 2
Each iteration took:  8.667453265190124e-07
Total time:           86.67453265190125


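Since no iteration of the loop depends on any other, I tried to adapt the Server Process example from the official documentation to scan chunks of the range in separate processes. Eventually I came across vartec's answer to this question and was able to put together the following code. I have also updated the code based on Darkonaut's answer to the current question.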

from time import time 
import multiprocessing as mp

def chunker(rng, t):  # this function makes t chunks out of rng
    L  = rng[1] - rng[0]
    Lr = L % t
    Lm = L // t
    h  = rng[0]-1
    chunks = []
    for i in range(0, t):
        c  = [h+1, h + Lm]
        h += Lm
        chunks.append(c)
    chunks[t-1][1] += Lr + 1
    return chunks

def worker(lock, xrange, return_dict):
    '''worker function'''
    for x in range(xrange[0], xrange[1]):
        y = ((x+5)**2+x-40)
        if y <= 0xf+1:
            print('Condition met at: ', y, x)
            return_dict['x'].append(x)
            return_dict['y'].append(y)
            with lock:                
                list_x = return_dict['x']
                list_y = return_dict['y']
                list_x.append(x)
                list_y.append(y)
                return_dict['x'] = list_x
                return_dict['y'] = list_y

if __name__ == '__main__':
    start = time()
    manager = mp.Manager()
    return_dict = manager.dict()
    lock = manager.Lock()
    return_dict['x']=manager.list()
    return_dict['y']=manager.list()
    xmax = 100000000
    nw = mp.cpu_count()
    workers = list(range(0, nw))
    chunks = chunker([0, xmax], nw)
    jobs = []
    for i in workers:
        p = mp.Process(target=worker, args=(lock, chunks[i],return_dict))
        jobs.append(p)
        p.start()

    for proc in jobs:
        proc.join()
    end = time()
    tt   = end-start #total time
    print('Each iteration took: ', tt/xmax)
    print('Total time:          ', tt)
    print(return_dict['x'])
    print(return_dict['y'])


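This considerably reduces the run time, to about 17 seconds. However, my shared variable does not retrieve any values. Please help me find out which part of the code is going wrong.

The output I get is: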

Each iteration took:  1.7742713451385497e-07
Total time:           17.742713451385498
[]
[]


whereas I expect:

Each iteration took:  1.7742713451385497e-07
Total time:           17.742713451385498
[0, 1, 2]
[-15, -3, 11]


Answer 1

Dar*_*aut 3

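The problem in your example is that modifications to standard mutable structures held inside a Manager.dict are not propagated. I'll first show you how to fix it with a manager, then show you better options afterwards.

multiprocessing.Manager is a bit heavy, since it uses a separate process just for the Manager, and working on a shared object requires locks to keep the data consistent. If you run this on one machine, there are better options: multiprocessing.Pool, in case you don't need custom Process classes, and if you do, multiprocessing.Process together with multiprocessing.Queue is the common way of doing it.

The quoted parts are from the multiprocessing docs.

Manager

If standard (non-proxy) list or dict objects are contained in a referent, modifications to those mutable values will not be propagated through the manager because the proxy has no way of knowing when the values contained within are modified. However, storing a value in a container proxy (which triggers a __setitem__ on the proxy object) does propagate through the manager and so to effectively modify such an item, one could re-assign the modified value to the container proxy...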
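To make the quoted behaviour concrete, here is a minimal sketch that stores a plain list in a Manager.dict and compares an in-place append with a reassignment. The function name demo and the key 'x' are only illustrative.

```python
import multiprocessing as mp


def demo():
    """Return (lost, kept) to contrast in-place mutation with reassignment."""
    with mp.Manager() as manager:
        shared = manager.dict()
        shared['x'] = []          # a plain list, not a proxy

        # shared['x'] hands back a copy of the stored list, so this append
        # changes only the temporary copy and never reaches the manager.
        shared['x'].append(1)
        lost = list(shared['x'])

        # Reassigning triggers __setitem__ on the dict proxy, which does
        # propagate the new value to the manager process.
        tmp = shared['x']
        tmp.append(1)
        shared['x'] = tmp
        kept = list(shared['x'])
    return lost, kept


if __name__ == '__main__':
    lost, kept = demo()
    print('in-place append:   ', lost)
    print('after reassignment:', kept)
```

The first list stays empty because the append never left the local copy; only the reassigned value is visible through the proxy.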

In your case this would look like:

def worker(xrange, return_dict, lock):
    """worker function"""
    for x in range(xrange[0], xrange[1]):
        y = ((x+5)**2+x-40)
        if y <= 0xf+1:
            print('Condition met at: ', y, x)
            with lock:
                list_x = return_dict['x']
                list_y = return_dict['y']
                list_x.append(x)
                list_y.append(y)
                return_dict['x'] = list_x
                return_dict['y'] = list_y


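The lock here would be a manager.Lock instance that you have to pass along as an argument, since the whole (now locked) read-modify-write operation is not atomic by itself. (Here is a simpler example of a Manager using a Lock.)

This approach is perhaps less convenient than employing nested Proxy Objects for most use cases but also demonstrates a level of control over the synchronization.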

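Since Python 3.6, proxy objects are nestable:

Changed in version 3.6: Shared objects are capable of being nested. For example, a shared container object such as a shared list can contain other shared objects which will all be managed and synchronized by the SyncManager.

Since Python 3.6 you can fill your manager.dict with manager.list values before starting the processes, and then append directly in the worker without having to reassign.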

return_dict['x'] = manager.list()
return_dict['y'] = manager.list()


EDIT:

Here is the full example with Manager:

import time
import multiprocessing as mp
from multiprocessing import Manager, Process
from contextlib import contextmanager
# mp_utils.py: see the snippet linked in the "Pool"
# section below
from mp_utils import calc_batch_sizes, build_batch_ranges

# def context_timer ... see code snippet in "Pool" section below

def worker(batch_range, return_dict, lock):
    """worker function"""
    for x in batch_range:
        y = ((x+5)**2+x-40)
        if y <= 0xf+1:
            print('Condition met at: ', y, x)
            with lock:
                return_dict['x'].append(x)
                return_dict['y'].append(y)


if __name__ == '__main__':

    N_WORKERS = mp.cpu_count()
    X_MAX = 100000000

    batch_sizes = calc_batch_sizes(X_MAX, n_workers=N_WORKERS)
    batch_ranges = build_batch_ranges(batch_sizes)
    print(batch_ranges)

    with Manager() as manager:
        lock = manager.Lock()
        return_dict = manager.dict()
        return_dict['x'] = manager.list()
        return_dict['y'] = manager.list()

        tasks = [(batch_range, return_dict, lock)
                 for batch_range in batch_ranges]

        with context_timer():

            pool = [Process(target=worker, args=args)
                    for args in tasks]

            for p in pool:
                p.start()
            for p in pool:
                p.join()

        # Create standard container with data from manager before exiting
        # the manager.
        result = {k: list(v) for k, v in return_dict.items()}

    print(result)


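Pool

Most often a multiprocessing.Pool will just do the job. Your example has an additional challenge, since you want to distribute the iteration over a range. Your chunker function does not manage to divide the range evenly so that every process has about the same amount of work to do: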

chunker((0, 21), 4)
# Out: [[0, 4], [5, 9], [10, 14], [15, 21]]  # 4, 4, 4, 6!


For the code below, please grab the mp_utils.py snippet from my answer here; it provides two functions to chunk ranges as evenly as possible.
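The linked mp_utils.py is not reproduced in this post. As a stand-in, here is a minimal sketch of two helpers with the same names and call signatures as used in the code below. The bodies are my assumption of what such helpers could look like, not the linked code itself.

```python
from itertools import accumulate


def calc_batch_sizes(n_tasks, n_workers):
    """Split n_tasks into n_workers batch sizes that differ by at most 1."""
    base, extra = divmod(n_tasks, n_workers)
    # The first `extra` batches take one additional task each.
    return [base + 1] * extra + [base] * (n_workers - extra)


def build_batch_ranges(batch_sizes):
    """Turn batch sizes into consecutive, gap-free range objects from 0."""
    upper_bounds = list(accumulate(batch_sizes))
    lower_bounds = [0] + upper_bounds[:-1]
    return [range(lo, hi) for lo, hi in zip(lower_bounds, upper_bounds)]


if __name__ == '__main__':
    sizes = calc_batch_sizes(21, n_workers=4)
    print(sizes)
    print(build_batch_ranges(sizes))
```

For 21 tasks and 4 workers this gives sizes of 6, 5, 5 and 5, and the resulting ranges cover 0 to 20 without the gaps that chunker leaves between chunks when its bounds are passed to range().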

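With multiprocessing.Pool your worker function just has to return the result, and Pool takes care of transporting it back to the parent process over internal queues. The result will be a list, so you have to rearrange it into the shape you want. Your example could then look like this: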

import time
import multiprocessing as mp
from multiprocessing import Pool
from contextlib import contextmanager
from itertools import chain

from mp_utils import calc_batch_sizes, build_batch_ranges

@contextmanager
def context_timer():
    start_time = time.perf_counter()
    yield
    end_time = time.perf_counter()
    total_time   = end_time-start_time
    print(f'\nEach iteration took: {total_time / X_MAX:.4f} s')
    print(f'Total time:          {total_time:.4f} s\n')


def worker(batch_range):
    """worker function"""
    result = []
    for x in batch_range:
        y = ((x+5)**2+x-40)
        if y <= 0xf+1:
            print('Condition met at: ', y, x)
            result.append((x, y))
    return result


if __name__ == '__main__':

    N_WORKERS = mp.cpu_count()
    X_MAX = 100000000

    batch_sizes = calc_batch_sizes(X_MAX, n_workers=N_WORKERS)
    batch_ranges = build_batch_ranges(batch_sizes)
    print(batch_ranges)

    with context_timer():
        with Pool(N_WORKERS) as pool:
            results = pool.map(worker, iterable=batch_ranges)

    print(f'results: {results}')
    x, y = zip(*chain.from_iterable(results))  # filter and sort results
    print(f'results sorted: x: {x}, y: {y}')

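One caveat about the rearranging step: zip(*chain.from_iterable(results)) raises a ValueError on unpacking if no worker found any match. A small sketch of a helper that also handles the empty case could look like this (split_xy is a hypothetical name, not part of the code above):

```python
from itertools import chain


def split_xy(results):
    """Flatten per-batch lists of (x, y) pairs into two tuples, x and y."""
    pairs = list(chain.from_iterable(results))
    if not pairs:
        # zip(*[]) yields nothing, so unpacking into x, y would fail.
        return (), ()
    x, y = zip(*pairs)
    return x, y


if __name__ == '__main__':
    print(split_xy([[(0, -15), (1, -3), (2, 11)], [], []]))
    print(split_xy([[], [], []]))
```

Because pool.map returns the batch results in the order of the input iterable, the flattened pairs are already ordered by x and no extra sorting is needed here.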

Example output:

[range(0, 12500000), range(12500000, 25000000), range(25000000, 37500000), 
range(37500000, 50000000), range(50000000, 62500000), range(62500000, 75000000), range(75000000, 87500000), range(87500000, 100000000)]
Condition met at:  -15 0
Condition met at:  -3 1
Condition met at:  11 2

Each iteration took: 0.0000 s
Total time:          8.2408 s

results: [[(0, -15), (1, -3), (2, 11)], [], [], [], [], [], [], []]
results sorted: x: (0, 1, 2), y: (-15, -3, 11)

Process finished with exit code 0


If you had multiple arguments for your worker, you would build a "tasks" list of argument tuples and replace pool.map(...) with pool.starmap(..., iterable=tasks). See the docs for further details.
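As a rough sketch of the starmap variant, assume the worker takes a second argument. The threshold parameter and the run helper are made up for illustration; the tiny ranges just keep the example quick.

```python
from multiprocessing import Pool


def worker(batch_range, threshold):
    """Collect (x, y) pairs whose y does not exceed threshold."""
    result = []
    for x in batch_range:
        y = (x + 5) ** 2 + x - 40
        if y <= threshold:
            result.append((x, y))
    return result


def run(tasks, n_workers=2):
    """Run worker(*task) for every argument tuple in tasks."""
    with Pool(n_workers) as pool:
        # starmap unpacks each tuple into the worker's positional arguments
        # and returns the results in the order of the tasks.
        return pool.starmap(worker, iterable=tasks)


if __name__ == '__main__':
    tasks = [(range(0, 5), 0xf + 1), (range(5, 10), 0xf + 1)]
    print(run(tasks))
```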

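Process & Queue

If you can't use multiprocessing.Pool for some reason, you have to take care of inter-process communication (IPC) yourself, by passing a multiprocessing.Queue as an argument to the worker functions in the child processes and letting them enqueue their results to be sent back to the parent.

You will also have to build your own Pool-like structure so that you can iterate over it to start and join the processes, and you have to get() the results back from the queue. I've written up more about Queue.get usage here.

A solution with this approach could look like this: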

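import multiprocessing as mp
from multiprocessing import Process
from itertools import chain

from mp_utils import calc_batch_sizes, build_batch_ranges

# def context_timer ... see code snippet in "Pool" section above


def worker(result_queue, batch_range):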
    """worker function"""
    result = []
    for x in batch_range:
        y = ((x+5)**2+x-40)
        if y <= 0xf+1:
            print('Condition met at: ', y, x)
            result.append((x, y))
    result_queue.put(result)  # <--


if __name__ == '__main__':

    N_WORKERS = mp.cpu_count()
    X_MAX = 100000000

    result_queue = mp.Queue()  # <--
    batch_sizes = calc_batch_sizes(X_MAX, n_workers=N_WORKERS)
    batch_ranges = build_batch_ranges(batch_sizes)
    print(batch_ranges)

    with context_timer():

        pool = [Process(target=worker, args=(result_queue, batch_range))
                for batch_range in batch_ranges]

        for p in pool:
            p.start()

        results = [result_queue.get() for _ in batch_ranges]

        for p in pool:
            p.join()

    print(f'results: {results}')
    x, y = zip(*chain.from_iterable(results))  # filter and sort results
    print(f'results sorted: x: {x}, y: {y}')

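Note that, unlike pool.map, a queue hands results back in the order the workers finish, not in the order the batches were created. If the order matters, one option is to tag each result with its batch index and sort in the parent. The following is only a sketch with small ranges; batch_index and run are hypothetical names.

```python
import multiprocessing as mp


def worker(result_queue, batch_index, batch_range):
    """Put (batch_index, matches) on the queue so the parent can reorder."""
    matches = []
    for x in batch_range:
        y = (x + 5) ** 2 + x - 40
        if y <= 0xf + 1:
            matches.append((x, y))
    result_queue.put((batch_index, matches))


def run(batch_ranges):
    """Start one process per batch and return the matches in batch order."""
    result_queue = mp.Queue()
    procs = [mp.Process(target=worker, args=(result_queue, i, r))
             for i, r in enumerate(batch_ranges)]
    for p in procs:
        p.start()
    # Drain the queue before joining; results arrive in completion order.
    tagged = [result_queue.get() for _ in procs]
    for p in procs:
        p.join()
    # Sort by batch index to restore submission order, then drop the tag.
    return [matches for _, matches in sorted(tagged, key=lambda t: t[0])]


if __name__ == '__main__':
    print(run([range(0, 5), range(5, 10), range(10, 15)]))
```

As in the solution above, the results are fetched before join(): a child that has put data on a queue may not terminate until that data has been consumed, so joining first can deadlock.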
