Python: read from subprocess stdout and stderr separately while preserving order

Question


Luk*_*pan 29 python subprocess stdout stderr

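I have a Python subprocess that I'm trying to read the output and error streams from. Currently I have it working, but I'm only able to read from stderr after I've finished reading from stdout. Here's what it looks like: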

import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout_iterator = iter(process.stdout.readline, b"")
stderr_iterator = iter(process.stderr.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print(line)

for line in stderr_iterator:
    # Do stuff with line
    print(line)


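As you can see, the stderr for loop can't start until the stdout loop completes. How can I modify this so that I can read from both streams in the order the lines were written?

To clarify: I still need to be able to tell whether a line came from stdout or stderr, because they are handled differently in my code.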

Answer 1

jfs*_*jfs 22

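The code in the question may deadlock if the child process produces enough output on stderr (about 100 KB on my Linux machine).

There is a communicate() method that allows reading from stdout and stderr separately: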

from subprocess import Popen, PIPE

process = Popen(command, stdout=PIPE, stderr=PIPE)
output, err = process.communicate()

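As a rough illustration of the deadlock point, the sketch below runs a child that writes about 200 KB to stderr before writing a single line to stdout, and collects both with communicate(). The child script and the helper name run_noisy_child are made up for this example, and the size is arbitrary, chosen only to exceed a typical pipe buffer.

```python
import subprocess
import sys

# Hypothetical child: fills stderr with ~200 KB, then writes one line to stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('e' * 200000)\n"
    "sys.stdout.write('done\\n')\n"
)


def run_noisy_child():
    """Run the child and return (stdout_bytes, stderr_bytes, returncode)."""
    process = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() drains both pipes concurrently, so neither one can fill
    # up and block the child while the parent waits on the other.
    output, err = process.communicate()
    return output, err, process.returncode


if __name__ == "__main__":
    output, err, code = run_noisy_child()
    print(len(output), len(err), code)
```

The trade-off is that communicate() only returns once the child has exited, so it gives separate streams but no interleaving information.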

If you need to read the streams while the child process is still running, the portable solution is to use threads (not tested):

from subprocess import Popen, PIPE
from threading import Thread
from queue import Queue  # Python 3 (the module is named Queue in Python 2)

def reader(pipe, queue):
    # Forward every line to the shared queue, tagged with its source pipe.
    try:
        with pipe:
            for line in iter(pipe.readline, b''):
                queue.put((pipe, line))
    finally:
        queue.put(None)  # sentinel: this pipe is exhausted

process = Popen(command, stdout=PIPE, stderr=PIPE)
q = Queue()
Thread(target=reader, args=[process.stdout, q]).start()
Thread(target=reader, args=[process.stderr, q]).start()
for _ in range(2):  # wait for one sentinel per reader thread
    for source, line in iter(q.get, None):
        name = "stdout" if source is process.stdout else "stderr"
        print("%s: %s" % (name, line.decode()), end="")


See:

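@LukeSapan Since the two FDs are independent of each other, a message arriving on one of them may be delayed, so there is no notion of "before" and "after" in this case... (3 upvotes)
Unfortunately this answer doesn't preserve the order of the lines coming from `stdout` and `stderr`. It is very close to what I need, though! It matters to me when a `stderr` line was piped relative to a `stdout` line. (2 upvotes)
@LukeSapan: I don't see any way to preserve the order *and* capture stdout/stderr separately. You can easily get one or the other. On Unix you could try a select loop, which can make the effect less apparent. It is starting to look like an [XY problem](http://meta.stackexchange.com/a/66378/137096): edit your question and provide some context on what you are trying to do. (2 upvotes)
@LukeSapan Why preserve the order? Just add timestamps and sort at the end. (2 upvotes)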
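
The timestamp idea from the last comment could be sketched roughly as follows. This is only an illustration: the names tag_lines and run_with_timestamps are invented here, and the timestamps record when the parent *read* each line, not when the child wrote it, so the caveat from the earlier comments about delayed delivery still applies.

```python
import subprocess
import sys
import time
from threading import Thread


def tag_lines(pipe, name, sink):
    """Append (arrival_time, name, line) to sink for each line read from pipe."""
    with pipe:
        for line in iter(pipe.readline, b""):
            sink.append((time.monotonic(), name, line))


def run_with_timestamps(command):
    """Run command and return (name, line) pairs sorted by arrival time."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    records = []  # list.append is atomic in CPython, so no lock is used here
    threads = [
        Thread(target=tag_lines, args=(process.stdout, "stdout", records)),
        Thread(target=tag_lines, args=(process.stderr, "stderr", records)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    process.wait()
    records.sort(key=lambda record: record[0])
    return [(name, line) for _, name, line in records]


if __name__ == "__main__":
    # Hypothetical child command, used only to exercise the helper.
    demo = [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"]
    for name, line in run_with_timestamps(demo):
        print(name, line)
```

When the child writes to both streams in quick succession, the arrival order seen by the two threads may still differ from the write order.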

Answer 2

Dee*_*dav 10

This works for Python 3 (3.6):

import selectors
import subprocess
import sys

p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, universal_newlines=True)
# Read both stdout and stderr simultaneously
sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)
# Loop until both streams have reached EOF and been unregistered
while sel.get_map():
    for key, _ in sel.select():
        line = key.fileobj.readline()
        if not line:
            # EOF on this stream: stop watching it, keep draining the other
            sel.unregister(key.fileobj)
            continue
        if key.fileobj is p.stdout:
            print(f"STDOUT: {line}", end="")
        else:
            print(f"STDERR: {line}", end="", file=sys.stderr)


Answer 3

Dev*_*wal 8

Here's a solution based on selectors, but one that preserves order, and streams variable-length characters (even single chars).

The trick is to use read1(), instead of read().

import selectors
import subprocess
import sys

p = subprocess.Popen(
    ["python", "random_out.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

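# Loop until both streams have reached EOF and been unregistered
while sel.get_map():
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            # EOF on this stream: stop watching it instead of exiting the program
            sel.unregister(key.fileobj)
            continue
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)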


If you want a test program, use this.

import sys
from time import sleep


for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)

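To see why read1() matters here, the small sketch below uses a plain os.pipe() rather than a subprocess. It relies on the documented behaviour of buffered binary readers: read1() returns whatever a single underlying read yields, while read() with no argument keeps reading until EOF. The function name is made up for this illustration.

```python
import os


def read1_returns_partial_data():
    """Show that read1() returns available bytes without waiting for EOF."""
    r, w = os.pipe()
    reader = os.fdopen(r, "rb")  # a buffered binary reader over the pipe
    os.write(w, b"abc")
    # The write end is still open, so read() would block here waiting for
    # EOF; read1() returns the bytes that are already in the pipe.
    chunk = reader.read1(1024)
    os.close(w)
    rest = reader.read()  # safe now: the writer is closed, so this hits EOF
    reader.close()
    return chunk, rest


if __name__ == "__main__":
    print(read1_returns_partial_data())
```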

Answer 4

小智 6

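The order in which a process writes data to different pipes is lost once the data has been written.

There is no way to tell whether stdout was written before stderr.

You can try to read from multiple file descriptors at the same time, in a non-blocking way, as soon as data is available, but this only minimizes the probability that the order is wrong.

This program should demonstrate that: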

#!/usr/bin/env python3

import os
import select
import subprocess
import sys

# Child programs that write 'aaa', 'bbb...', 'ccc' to fd 1, fd 2, fd 1.
testapps = {
    'slow': '''
import os
import time
os.write(1, b'aaa')
time.sleep(0.01)
os.write(2, b'bbb')
time.sleep(0.01)
os.write(1, b'ccc')
''',
    'fast': '''
import os
os.write(1, b'aaa')
os.write(2, b'bbb')
os.write(1, b'ccc')
''',
    'fast2': '''
import os
os.write(1, b'aaa')
os.write(2, b'bbbbbbbbbbbbbbb')
os.write(1, b'ccc')
'''
}

def readfds(fds, maxread):
    # Yield (fd, chunk) as data becomes available, until every fd hits EOF.
    while fds:
        fdsin, _, _ = select.select(fds, [], [])
        for fd in fdsin:
            s = os.read(fd, maxread)
            if len(s) == 0:
                fds.remove(fd)
                continue
            yield fd, s

def readfromapp(app, rounds=10, maxread=1024):
    with open('testapp.py', 'w') as f:
        f.write(testapps[app])

    # Count how often each interleaving of the two streams is observed.
    results = {}
    for i in range(rounds):
        p = subprocess.Popen([sys.executable, 'testapp.py'],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        data = b''
        for fd, s in readfds([p.stdout.fileno(), p.stderr.fileno()], maxread):
            data += s
        p.wait()
        results[data] = results.get(data, 0) + 1

    print('running %i rounds %s with maxread=%i' % (rounds, app, maxread))
    for data, count in sorted(results.items()):
        print('%03i x %s' % (count, data.decode()))


print()
print("=> if output is produced slowly this should work as whished")
print("   and should return: aaabbbccc")
readfromapp('slow',  rounds=100, maxread=1024)

print()
print("=> now mostly aaacccbbb is returnd, not as it should be")
readfromapp('fast',  rounds=100, maxread=1024)

print()
print("=> you could try to read data one by one, and return")
print("   e.g. a whole line only when LF is read")
print("   (b's should be finished before c's)")
readfromapp('fast',  rounds=100, maxread=1)

print()
print("=> but even this won't work ...")
readfromapp('fast2', rounds=100, maxread=1)


and outputs something like this:

=> if output is produced slowly this should work as whished
   and should return: aaabbbccc
running 100 rounds slow with maxread=1024
100 x aaabbbccc

=> now mostly aaacccbbb is returnd, not as it should be
running 100 rounds fast with maxread=1024
006 x aaabbbccc
094 x aaacccbbb

=> you could try to read data one by one, and return
   e.g. a whole line only when LF is read
   (b's should be finished before c's)
running 100 rounds fast with maxread=1
003 x aaabbbccc
003 x aababcbcc
094 x abababccc

=> but even this won't work ...
running 100 rounds fast2 with maxread=1
003 x aaabbbbbbbbbbbbbbbccc
001 x aaacbcbcbbbbbbbbbbbbb
008 x aababcbcbcbbbbbbbbbbb
088 x abababcbcbcbbbbbbbbbb


Answer 5

Ita*_*nco 5

From https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module:

If you wish to capture and combine both streams into one, use stdout=PIPE and stderr=STDOUT instead of capture_output.

So the simplest solution would be:

import subprocess

process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout_iterator = iter(process.stdout.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print(line)

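A small self-contained sketch of this merged approach follows. The child script and the helper name read_merged are invented for the illustration. Because stderr=STDOUT makes both descriptors share one pipe, the lines come back in the order the child wrote them, but the parent can no longer tell which stream a line came from, which the question asks for.

```python
import subprocess
import sys

# Hypothetical child: alternates unbuffered writes between fd 1 and fd 2.
CHILD = (
    "import os\n"
    "os.write(1, b'out 1\\n')\n"
    "os.write(2, b'err 1\\n')\n"
    "os.write(1, b'out 2\\n')\n"
)


def read_merged(command):
    """Return the lines of stdout and stderr combined into a single stream."""
    process = subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    with process.stdout:
        lines = list(iter(process.stdout.readline, b""))
    process.wait()
    return lines


if __name__ == "__main__":
    for line in read_merged([sys.executable, "-c", CHILD]):
        print(line.decode(), end="")
```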

Archived:	10 years, 10 months ago
Views:	20936
Last activity:	6 years, 11 months ago