multiprocessing.Queue deadlocks after "reader" process death

Mic*_*ael 5 python queue multiprocessing python-2.7

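I've been playing with the multiprocessing package and noticed that a queue can become deadlocked for reading when:

  1. The "reader" process is calling get with timeout > 0:

    self.queue.get(timeout=3)

  2. The "reader" dies while get is blocking on that timeout.

After that, the queue stays locked forever.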
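The lock behaviour behind the two steps above can be reproduced without a queue. This is a minimal sketch, not the queue's actual code path: it assumes that get holds the queue's read lock while it waits, and stands in a plain multiprocessing lock for it. The names hold_lock_forever and lock_is_free_after_kill are hypothetical. It is written for Python 3 on a Unix system, since get_context("fork") does not exist in Python 2.7.

```python
import multiprocessing
import time


def hold_lock_forever(lock, acquired):
    # Stand-in for a reader blocked inside get(timeout=...):
    # take the lock, report that we hold it, then block.
    lock.acquire()
    acquired.set()
    time.sleep(60)


def lock_is_free_after_kill():
    # Assumption: the "fork" start method is available (Unix only).
    ctx = multiprocessing.get_context("fork")
    lock = ctx.Lock()
    acquired = ctx.Event()

    child = ctx.Process(target=hold_lock_forever, args=(lock, acquired))
    child.start()
    acquired.wait(5)      # make sure the child really holds the lock

    child.terminate()     # sends SIGTERM, as in the console output
    child.join()

    # The dead child never released the lock, so this should time out.
    free = lock.acquire(timeout=1)
    if free:
        lock.release()
    return free
```

Calling lock_is_free_after_kill() should return False: nothing releases a lock whose owner was killed while holding it, which matches the `owner=SomeOtherProcess` state that the Receiver prints after the restart.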

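Application demonstrating the problem

I create two child processes: "Worker" (puts into the queue) and "Receiver" (gets from the queue). The parent process also periodically checks whether its children are still alive and starts a new child if needed.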

self.queue.get(timeout=3)

Process tree in ps

bash
 \_ python queuetest.py
     \_ Worker
     \_ Receiver

Console output

$ python queuetest.py
Worker: putting msg, Queue size: ~0
<<< `msg from Worker`, queue rlock: <Lock(owner=None)>
Worker: putting msg, Queue size: ~0
<<< `msg from Worker`, queue rlock: <Lock(owner=None)>
Restarting receiver                        <-- killed Receiver with SIGTERM
Worker: putting msg, Queue size: ~0
Worker: putting msg, Queue size: ~1
Worker: putting msg, Queue size: ~2
<<< EMPTY, Queue rlock: <Lock(owner=SomeOtherProcess)>
Worker: putting msg, Queue size: ~3
Worker: putting msg, Queue size: ~4
Worker: putting msg, Queue size: ~5
<<< EMPTY, Queue rlock: <Lock(owner=SomeOtherProcess)>
Worker: putting msg, Queue size: ~6
Worker: putting msg, Queue size: ~7

Is there any way to get around this? Combining get_nowait with sleep seems to work as a workaround, but it does not read the data "as it comes".
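For reference, the get_nowait plus sleep idea could look roughly like the sketch below. The function name poll_messages and its parameters are hypothetical, and idle_limit exists only so the sketch terminates; a real Receiver would loop forever. It uses the Python 3 module name queue (Queue in Python 2.7).

```python
import multiprocessing
import queue  # named "Queue" in Python 2.7
import time


def poll_messages(q, poll_interval=0.1, idle_limit=1.0):
    """Collect messages until none has arrived for idle_limit seconds."""
    messages = []
    idle = 0.0
    while idle < idle_limit:
        try:
            # Non-blocking read: the process does not sit inside get()
            # waiting for a timeout to expire.
            messages.append(q.get_nowait())
            idle = 0.0
        except queue.Empty:
            # Nothing available yet; wait outside of get() and retry.
            time.sleep(poll_interval)
            idle += poll_interval
    return messages
```

With this pattern the process spends almost all of its waiting time in time.sleep rather than inside get, so a kill is much less likely to land while the read lock is held. It narrows the window rather than closing it, and each message can be delayed by up to poll_interval, which is the "not as it comes" drawback.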

System info

$ uname -sr
Linux 3.11.8-200.fc19.x86_64

$ python -V
Python 2.7.5

In [3]: multiprocessing.__version__
Out[3]: '0.70a1'

"Just good enough" solution

While writing this question, I came up with a somewhat silly modification to the Receiver class:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import multiprocessing
import procname
import time

class Receiver(multiprocessing.Process):
    ''' Reads from queue with 3 secs timeout '''

    def __init__(self, queue):
        multiprocessing.Process.__init__(self)
        self.queue = queue

    def run(self):
        procname.setprocname('Receiver')
        while True:
            try:
                msg = self.queue.get(timeout=3)
                print '<<< `{}`, queue rlock: {}'.format(
                    msg, self.queue._rlock)
            except multiprocessing.queues.Empty:
                print '<<< EMPTY, Queue rlock: {}'.format(
                    self.queue._rlock)
                pass


class Worker(multiprocessing.Process):
    ''' Puts into queue with 1 sec sleep '''

    def __init__(self, queue):
        multiprocessing.Process.__init__(self)
        self.queue = queue

    def run(self):
        procname.setprocname('Worker')
        while True:
            time.sleep(1)
            print 'Worker: putting msg, Queue size: ~{}'.format(
                self.queue.qsize())
            self.queue.put('msg from Worker')


if __name__ == '__main__':
    queue = multiprocessing.Queue()

    worker = Worker(queue)
    worker.start()

    receiver = Receiver(queue)
    receiver.start()

    while True:
        time.sleep(1)
        if not worker.is_alive():
            print 'Restarting worker'
            worker = Worker(queue)
            worker.start()
        if not receiver.is_alive():
            print 'Restarting receiver'
            receiver = Receiver(queue)
            receiver.start()

But this does not seem like a very good solution to me.
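Another direction, for the SIGTERM case only, might be to let the dying process release the lock itself. The sketch below is an assumption-laden illustration, not a tested fix for the queue: it uses a plain multiprocessing lock in place of the queue's read lock, and assumes that get releases that lock in a finally or with block, so that an exception raised from a signal handler would unwind through it. The helper names are hypothetical, and the code targets Python 3 on Unix.

```python
import multiprocessing
import signal
import sys
import time


def _exit_on_sigterm(signum, frame):
    # Raising SystemExit (instead of dying immediately) lets pending
    # `with` and `finally` blocks run before the process exits.
    sys.exit(0)


def hold_lock_politely(lock, acquired):
    signal.signal(signal.SIGTERM, _exit_on_sigterm)
    with lock:
        acquired.set()
        time.sleep(60)  # stand-in for blocking inside get(timeout=...)


def lock_is_free_after_sigterm():
    # Assumption: the "fork" start method is available (Unix only).
    ctx = multiprocessing.get_context("fork")
    lock = ctx.Lock()
    acquired = ctx.Event()

    child = ctx.Process(target=hold_lock_politely, args=(lock, acquired))
    child.start()
    acquired.wait(5)

    child.terminate()   # SIGTERM now triggers the handler in the child
    child.join()

    free = lock.acquire(timeout=1)
    if free:
        lock.release()
    return free
```

Here lock_is_free_after_sigterm() should return True, because the `with` block releases the lock while SystemExit propagates. This cannot help against SIGKILL, which a process has no chance to handle.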