Manager shutdown error "AttributeError: 'ForkAwareLocal' object has no attribute 'connection'" when using a Namespace and a shared-memory dict

Soc*_*key 4 python multiprocessing python-3.x

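I am trying to:

  1. Share a dataframe between processes
  2. Update a shared dict based on calculations performed on (but not changing) that dataframe

I am using a multiprocessing.Manager() to create a dict in shared memory (to store the results) and a Namespace to store/share the dataframe I want to read from.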

import multiprocessing

import pandas as pd
import numpy as np


def add_empty_dfs_to_shared_dict(shared_dict, key):
    shared_dict[key] = pd.DataFrame()


def edit_df_in_shared_dict(shared_dict, namespace, ind):
    row_to_insert = namespace.df.loc[ind]
    df = shared_dict[ind]
    df[ind] = row_to_insert
    shared_dict[ind] = df


if __name__ == '__main__':
    manager = multiprocessing.Manager()
    shared_dict = manager.dict()
    namespace = manager.Namespace()

    n = 100
    dataframe_to_be_shared = pd.DataFrame({
        'player_id': list(range(n)),
        'data': np.random.random(n),
    }).set_index('player_id')

    namespace.df = dataframe_to_be_shared

    for i in range(n):
        add_empty_dfs_to_shared_dict(shared_dict, i)

    jobs = []
    for i in range(n):
        p = multiprocessing.Process(
            target=edit_df_in_shared_dict,
            args=(shared_dict, namespace, i)
        )
        jobs.append(p)
        p.start()

    for p in jobs:
        p.join()

    print(shared_dict[1])


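When I run the code above, shared_dict is written to correctly, since my print statement executes and shows some data. I also get an error regarding the manager: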

Process Process-88:
Traceback (most recent call last):
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/managers.py", line 788, in _callmethod
    conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/henrysorsky/Library/Preferences/PyCharm2019.2/scratches/scratch_13.py", line 34, in edit_df_in_shared_dict
    row_to_insert = namespace.df.loc[ind]
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/managers.py", line 1099, in __getattr__
    return callmethod('__getattribute__', (key,))
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/managers.py", line 792, in _callmethod
    self._connect()
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/managers.py", line 779, in _connect
    conn = self._Client(self._token.address, authkey=self._authkey)
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/connection.py", line 492, in Client
    c = SocketClient(address)
  File "/Users/henrysorsky/.pyenv/versions/3.7.3/lib/python3.7/multiprocessing/connection.py", line 619, in SocketClient
    s.connect(address)
ConnectionRefusedError: [Errno 61] Connection refused

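I understand that this comes from the manager, and it seems to be caused by the manager not being shut down properly. The only similar question I could find online:

Share list between process in python server

suggests joining all the child processes, which I am already doing.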

Dan*_*iel 5

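The problem is probably in how you create the shared dict. If you forget to call process.join() in the main process (or to keep it alive with an infinite loop), the main process may finish before the other processes that use the dict. The dict is then destroyed and those processes can no longer connect to it.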
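A minimal sketch of that ordering, not the asker's code: the names `square_into` and `run_workers` and the `start_method` parameter are hypothetical and introduced only for illustration. It assumes the manager is used as a context manager, so it is shut down when the `with` block exits, and every child is joined before that happens.

```python
import multiprocessing


def square_into(shared_dict, key):
    # Hypothetical worker: this write goes through the manager's
    # server process, so the manager must still be running here.
    shared_dict[key] = key * key


def run_workers(n, start_method=None):
    # start_method=None keeps the platform's default start method.
    ctx = multiprocessing.get_context(start_method)
    # Leaving the with-block shuts the manager down, so every process
    # that touches shared_dict has to finish inside it.
    with ctx.Manager() as manager:
        shared_dict = manager.dict()
        jobs = [
            ctx.Process(target=square_into, args=(shared_dict, i))
            for i in range(n)
        ]
        for p in jobs:
            p.start()
        # Join every child before the manager goes away.
        for p in jobs:
            p.join()
        # Copy into a plain dict while the manager is still alive.
        return dict(shared_dict)
```

Call `run_workers(...)` from under the `if __name__ == '__main__':` guard, as the script in the question does. If the joins are dropped, the manager can be shut down while children are still trying to connect, which is the situation described above.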

The number of processes should not be a problem. You should be able to use the dict however you need.