subprocess.Popen closes stdout/stderr file descriptors used in another thread when Popen fails

jan*_*and 11 python multithreading python-2.7

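After we upgraded from Python 2.7.3 to Python 2.7.5, an internal library that makes heavy use of subprocess.Popen() started failing its automated tests. The library is used in a threaded environment. After debugging the problem, I was able to create a short Python script that reproduces the error seen in the failing tests.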

Here is the script (called "threadedsubprocess.py"):

import time
import threading
import subprocess

def subprocesscall():
    p = subprocess.Popen(
        ['ls', '-l'],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        )
    time.sleep(2) # simulate the Popen call takes some time to complete.
    out, err = p.communicate()
    print 'succeeding command in thread:', threading.current_thread().ident

def failingsubprocesscall():
    try:
        p = subprocess.Popen(
            ['thiscommandsurelydoesnotexist'],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            )
    except Exception as e:
        print 'failing command:', e, 'in thread:', threading.current_thread().ident

print 'main thread is:', threading.current_thread().ident

subprocesscall_thread = threading.Thread(target=subprocesscall)
subprocesscall_thread.start()
failingsubprocesscall()
subprocesscall_thread.join()

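Note: when run with Python 2.7.3, this script never fails with an IOError. When run with Python 2.7.5 (on the same Ubuntu 12.04 64-bit VM), it fails at least 50% of the time.

The error raised on Python 2.7.5 looks like this: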

/opt/python/2.7.5/bin/python ./threadedsubprocess.py 
main thread is: 139899583563520
failing command: [Errno 2] No such file or directory 139899583563520
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/python/2.7.5/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/opt/python/2.7.5/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "./threadedsubprocess.py", line 13, in subprocesscall
    out, err = p.communicate()
  File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 806, in communicate
    return self._communicate(input)
  File "/opt/python/2.7.5/lib/python2.7/subprocess.py", line 1379, in _communicate
    self.stdin.close()
IOError: [Errno 9] Bad file descriptor

close failed in file object destructor:
IOError: [Errno 9] Bad file descriptor

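Comparing the subprocess module from Python 2.7.3 with the one from Python 2.7.5, I see that Popen()'s __init__() now explicitly closes the stdin, stdout and stderr file descriptors when executing the command fails for some reason. This appears to be an intentional fix, applied in Python 2.7.4, to prevent leaking file descriptors (http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS#l629).

The difference between Python 2.7.3 and Python 2.7.5 that seems relevant to this problem is in Popen's __init__():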

@@ -671,12 +702,33 @@
          c2pread, c2pwrite,
          errread, errwrite) = self._get_handles(stdin, stdout, stderr)

-        self._execute_child(args, executable, preexec_fn, close_fds,
-                            cwd, env, universal_newlines,
-                            startupinfo, creationflags, shell,
-                            p2cread, p2cwrite,
-                            c2pread, c2pwrite,
-                            errread, errwrite)
+        try:
+            self._execute_child(args, executable, preexec_fn, close_fds,
+                                cwd, env, universal_newlines,
+                                startupinfo, creationflags, shell,
+                                p2cread, p2cwrite,
+                                c2pread, c2pwrite,
+                                errread, errwrite)
+        except Exception:
+            # Preserve original exception in case os.close raises.
+            exc_type, exc_value, exc_trace = sys.exc_info()
+
+            to_close = []
+            # Only close the pipes we created.
+            if stdin == PIPE:
+                to_close.extend((p2cread, p2cwrite))
+            if stdout == PIPE:
+                to_close.extend((c2pread, c2pwrite))
+            if stderr == PIPE:
+                to_close.extend((errread, errwrite))
+
+            for fd in to_close:
+                try:
+                    os.close(fd)
+                except EnvironmentError:
+                    pass
+
+            raise exc_type, exc_value, exc_trace
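To make it concrete how a cleanup path that closes descriptors by number can hit a descriptor owned by another thread, here is a minimal single-threaded sketch. It is an illustration of descriptor-number reuse in general, not a confirmed trace of what Popen does; the function name stale_close_demo and the "thread A" / "thread B" labels are made up for this example, and it assumes a POSIX system where pipe() returns the lowest free descriptor numbers.

```python
import os


def stale_close_demo():
    """Show that closing an fd number twice can close someone else's fd."""
    # "Thread A" creates a pipe, closes it, but still remembers the numbers.
    old_r, old_w = os.pipe()
    os.close(old_r)
    os.close(old_w)

    # "Thread B" creates its own pipe. The kernel hands out the lowest free
    # numbers, so it normally receives the numbers that were just released.
    new_r, new_w = os.pipe()
    reused = {new_r, new_w} == {old_r, old_w}

    # "Thread A" closes by number a second time, as an error-cleanup path
    # might do, and ignores any error from the close.
    try:
        os.close(old_r)
    except OSError:
        pass

    # Count how many of thread B's descriptors are still valid.
    survivors = []
    for fd in (new_r, new_w):
        try:
            os.fstat(fd)
            survivors.append(fd)
        except OSError:
            pass  # EBADF: this descriptor was closed behind thread B's back

    # Clean up whatever is still open.
    for fd in survivors:
        os.close(fd)
    return reused, len(survivors)


if __name__ == "__main__":
    print(stale_close_demo())
```

If the numbers are reused, one of the two descriptors of the second pipe is gone after the stale close, which is the same kind of "Bad file descriptor" situation as in the traceback above.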

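I guess I have three questions:

1) Is it really supposed to be possible to use subprocess.Popen, with PIPE for stdin, stdout and stderr, in a threaded environment?

2) How do I prevent the file descriptors for stdin, stdout and stderr from being closed when Popen() fails in one of the threads?

3) Am I doing something wrong here?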

nic*_*kie 7

To answer your questions:

  1. Yes.
  2. You should not have to.
  3. No.

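The error also occurs in Python 2.7.4.

I think this is a bug in the library code. If you add a lock to the program and make sure that the two calls to subprocess.Popen are executed atomically, the error does not occur.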

@@ -1,32 +1,40 @@
 import time
 import threading
 import subprocess

+lock = threading.Lock()
+
 def subprocesscall():
+    lock.acquire()
     p = subprocess.Popen(
         ['ls', '-l'],
         stdin=subprocess.PIPE,
         stdout=subprocess.PIPE,
         stderr=subprocess.PIPE,
         )
+    lock.release()
     time.sleep(2) # simulate the Popen call takes some time to complete.
     out, err = p.communicate()
     print 'succeeding command in thread:', threading.current_thread().ident

 def failingsubprocesscall():
     try:
+        lock.acquire()
         p = subprocess.Popen(
             ['thiscommandsurelydoesnotexist'],
             stdin=subprocess.PIPE,
             stdout=subprocess.PIPE,
             stderr=subprocess.PIPE,
             )
     except Exception as e:
         print 'failing command:', e, 'in thread:', threading.current_thread().ident
+    finally:
+        lock.release()
+

 print 'main thread is:', threading.current_thread().ident

 subprocesscall_thread = threading.Thread(target=subprocesscall)
 subprocesscall_thread.start()
 failingsubprocesscall()
 subprocesscall_thread.join()

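This suggests that the failure is most likely caused by a data race in the implementation of Popen. I will venture a guess: the bug may be in the implementation of pipe_cloexec, which is called by _get_handles and which (in 2.7.4) is: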

def pipe_cloexec(self):
    """Create a pipe with FDs set CLOEXEC."""
    # Pipes' FDs are set CLOEXEC by default because we don't want them
    # to be inherited by other subprocesses: the CLOEXEC flag is removed
    # from the child's FDs by _dup2(), between fork() and exec().
    # This is not atomic: we would need the pipe2() syscall for that.
    r, w = os.pipe()
    self._set_cloexec_flag(r)
    self._set_cloexec_flag(w)
    return r, w

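The comment explicitly warns that this is not atomic... That certainly allows a data race, but without experimenting I cannot tell whether it is what causes the problem.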
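For comparison, this is roughly what the atomic variant mentioned in that comment could look like. It is a sketch under assumptions: os.pipe2 only exists in Python 3.3+ on platforms that provide the pipe2() syscall (so not in 2.7), the function name pipe_cloexec_atomic is made up here, and nothing in this thread confirms that the missing atomicity is the cause of the reported failure.

```python
import fcntl
import os


def pipe_cloexec_atomic():
    """Return (r, w) pipe descriptors with FD_CLOEXEC set on both ends."""
    if hasattr(os, "pipe2"):
        # pipe2() creates the pipe and sets the flag in a single syscall,
        # so no other thread can fork between creation and flag setting.
        return os.pipe2(os.O_CLOEXEC)

    # Fallback: the same non-atomic two-step approach as the quoted code.
    r, w = os.pipe()
    for fd in (r, w):
        flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
    return r, w
```

Note that this only closes the window in which a child forked by another thread could inherit the pipe; it does not change how descriptors are closed when Popen fails.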