Multiprocessing crashes Python with the error "may have been in progress in another thread when fork() was called"

Sri*_*rri 37 python multithreading python-3.x

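I am new to Python and am trying to use the multiprocessing module to parallelize my for loop.

I have an array of image URLs stored in img_urls. I need to download each image and run some Google Vision processing on it.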

import time

start_time = time.time()

if __name__ == '__main__':
    img_urls = [ALL_MY_Image_URLS]  # placeholder for the list of image URLs
    runAll(img_urls)
    print("--- %s seconds ---" % (time.time() - start_time))

Here is my runAll() method:

import multiprocessing
import os
import time

import requests

# vision_client, types, report and full_matching_pages come from the rest of
# the script (presumably the Google Cloud Vision setup), which is not shown.


def runAll(img_urls):
    num_cores = multiprocessing.cpu_count()

    print("Image URLs: {}".format(len(img_urls)))
    if len(img_urls) > 2:
        numberOfImages = 0
    else:
        numberOfImages = 1

    start_timeProcess = time.time()

    # On macOS with Python < 3.8 this pool forks its workers
    pool = multiprocessing.Pool()
    pool.map(annotate, img_urls)
    end_timeProcess = time.time()
    print('\n Time to complete ', end_timeProcess - start_timeProcess)

    print(full_matching_pages)


def annotate(img_path):
    """Returns web annotations given the path to an image."""
    file = requests.get(img_path).content
    print("file is", file)
    print('Process Working under ', os.getpid())
    image = types.Image(content=file)
    web_detection = vision_client.web_detection(image=image).web_detection
    report(web_detection)

When I run it, Python crashes and I get this warning:

objc[67570]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67570]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[67567]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67567]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[67568]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67568]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[67569]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67569]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[67571]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67571]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[67572]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[67572]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

jon*_*les 101

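This error occurs because of security added in macOS High Sierra that restricts how multithreaded processes may call fork(). I know this answer is a bit late, but I solved the problem as follows:

Set an environment variable in .bash_profile to allow multithreaded applications or scripts under the new macOS High Sierra security rules.

Open a terminal: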

$ nano .bash_profile

Add the following line to the end of the file:

export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

Save, exit, then close and reopen the terminal. Check that the environment variable is set:

$ env

You should see output similar to:

TERM_PROGRAM=Apple_Terminal
SHELL=/bin/bash
TERM=xterm-256color
TMPDIR=/var/folders/pn/vasdlj3ojO#OOas4dasdffJq/T/
Apple_PubSub_Socket_Render=/private/tmp/com.apple.launchd.E7qLFJDSo/Render
TERM_PROGRAM_VERSION=404
TERM_SESSION_ID=NONE
OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

You should now be able to run your Python script with multiprocessing.

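  • This actually fixed it. I wanted to iterate over a large pandas DataFrame across multiple threads and ran into the same problem the OP describes. This answer solved it for me. The only difference is that I set the env variable together with the script I run: `OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES python my-script.py` (6 upvotes)
  • I had to add "export" in front: OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES (5 upvotes)
  • This does not work on macOS Mojave (4 upvotes)
  • Thanks a lot! For those interested, this worked for me on macOS Mojave. (3 upvotes)
  • This environment variable solved my problem running ansible locally on a Mac (Catalina) (2 upvotes)
  • By editing ~/.zshrc, this also works on macOS Catalina :) (2 upvotes)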
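For anyone who prefers the per-invocation approach from the first comment over editing a shell profile, here is a minimal sketch of the same idea done from Python. It launches a command with the flag set only in that child's environment. The helper name `run_with_fork_safety_disabled` is made up for this example, and the demo command just echoes the variable back. It is not a claim that the flag fixes every setup, since the comments above report mixed results on Mojave.

```python
import os
import subprocess
import sys

FLAG = "OBJC_DISABLE_INITIALIZE_FORK_SAFETY"


def run_with_fork_safety_disabled(args):
    """Run a command with the flag set only in that child's environment.

    Hypothetical helper for illustration; the parent's own environment
    and your shell profile are left untouched.
    """
    env = dict(os.environ)  # copy, so os.environ itself is not modified
    env[FLAG] = "YES"
    return subprocess.run(args, env=env, capture_output=True, text=True, check=True)


# Demo: the child process simply reports the value it sees for the flag.
result = run_with_fork_safety_disabled(
    [sys.executable, "-c", f"import os; print(os.environ.get('{FLAG}'))"]
)
print(result.stdout.strip())
```

To run a real script, the argument list would be something like `[sys.executable, "my-script.py"]`. The variable is passed through the child's environment at launch, which mirrors what the shell one-liner in the comment does.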

Bra*_*iac 44

I'm running a Mac with zsh, and I added this to my .zshrc file:

export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

Then on the command line:

source ~/.zshrc

And then it worked.


Tho*_*ger 25

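The other answers tell you to set OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES, but don't do that! You are just putting tape over the warning light. You may need to set it on a case-by-case basis for some legacy software, but definitely do not put it in your .bash_profile!

This was fixed in https://bugs.python.org/issue33725 (Python 3.8+, where spawn became the default start method on macOS), but it is still best practice to use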

with multiprocessing.get_context("spawn").Pool() as pool:
    pool.map(annotate, img_urls)
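To make the spawn suggestion easier to try in isolation, here is a small self-contained sketch. It assumes a stand-in worker, `annotate_stub`, in place of the question's `annotate()`, so nothing here calls requests or Google Vision. `annotate_stub` and `run_all` are hypothetical names used only for this example. With spawn, each worker starts as a fresh interpreter instead of a forked copy, so the worker function has to be defined at module top level and the pool has to be created under the `if __name__ == "__main__":` guard.

```python
import multiprocessing
import os


def annotate_stub(img_url):
    """Hypothetical stand-in for annotate(): returns the URL and the worker's PID."""
    return img_url, os.getpid()


def run_all(img_urls, processes=2):
    """Map annotate_stub over img_urls using workers started with 'spawn'."""
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=processes) as pool:
        return pool.map(annotate_stub, img_urls)


if __name__ == "__main__":
    # Placeholder URLs; nothing is downloaded in this sketch.
    urls = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
    print("default start method:", multiprocessing.get_start_method())
    for url, pid in run_all(urls):
        print(url, "handled by process", pid)
```

One trade-off to be aware of: spawned workers do not inherit module-level state set up in the parent after import, so a global results list such as `full_matching_pages` would not be filled in by the workers. Returning values from the worker and collecting them from `pool.map`, as above, avoids that.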


Anonymous 8

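The OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES solution did not work for me. Another possible solution is to set no_proxy=* in the script's environment, as described here.

Besides the causes others have mentioned, this error message can also be network related. My script has a TCP server. I don't even use a pool, just os.fork and multiprocessing.Queue for message passing. The forks worked fine until I added the queue.

In my case, setting no_proxy by itself fixed it. If your script has a network component, try this fix, perhaps in combination with OBJC_DISABLE_INITIALIZE_FORK_SAFETY.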
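If you would rather set no_proxy from inside the script than in the shell, a rough sketch could look like the following. It assumes the assignment runs at the very top of the script, before any forking or network calls, and I have not verified that this is enough on every macOS and Python combination. `proxies_bypassed` is a hypothetical helper that only shows how urllib's environment-based proxy rules read the setting.

```python
import os
import urllib.request

# Assumption: this runs first in the script, before any fork() or network
# request, so child processes inherit the setting.
os.environ["no_proxy"] = "*"


def proxies_bypassed(host):
    """Return a truthy value if urllib's environment rules skip the proxy for host."""
    return urllib.request.proxy_bypass_environment(host)


if __name__ == "__main__":
    # With no_proxy set to "*", every host is treated as bypassing the proxy.
    print(bool(proxies_bypassed("example.com")))
```

As far as I understand it, the point is that with no_proxy=* the proxy decision is made from the environment alone. Whether that is the mechanism behind the fix in the answer above is my assumption, not something the answer states.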