How do I feed a subprocess's standard input from a Python iterator?

Rya*_*son 8 python io subprocess

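I am trying to use Python's subprocess module to communicate with a process that reads standard input and writes standard output in a streaming fashion. I want the subprocess to read lines from an iterator that produces the input, and then I want to read output lines from the subprocess. There may not be a one-to-one correspondence between input and output lines. How can I feed a subprocess from an arbitrary iterator that returns strings?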

Here is some example code that gives a simple test case, along with some methods I have tried that don't work for one reason or another:

#!/usr/bin/python
from subprocess import *
# A really big iterator
input_iterator = ("hello %s\n" % x for x in xrange(100000000))

# I thought that stdin could be any iterable, but it actually wants a
# filehandle, so this fails with an error.
subproc = Popen("cat", stdin=input_iterator, stdout=PIPE)

# This works, but it first sends *all* the input at once, then returns
# *all* the output as a string, rather than giving me an iterator over
# the output. This uses up all my memory, because the input is several
# hundred million lines.
subproc = Popen("cat", stdin=PIPE, stdout=PIPE)
output, error = subproc.communicate("".join(input_iterator))
output_lines = output.split("\n")

So how can I have the subprocess read from an iterator line by line while I read from its stdout line by line?

Rya*_*son 5

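The easy way seems to be to fork and feed the input handle from the child process. Can anyone elaborate on any possible downsides of doing this? Or are there Python modules that make it easier and safer?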

#!/usr/bin/python
from subprocess import *
import os

def fork_and_input(input, handle):
    """Send input to handle in a child process."""
    # Make sure input is iterable before forking
    input = iter(input)
    if os.fork():
        # Parent
        handle.close()
    else:
        # Child
        try:
            handle.writelines(input)
            handle.close()
        # An IOError here means some *other* part of the program
        # crashed, so don't complain here.
        except IOError:
            pass
        # os._exit() requires an exit status argument
        os._exit(0)

# A really big iterator
input_iterator = ("hello %s\n" % x for x in xrange(100000000))

subproc = Popen("cat", stdin=PIPE, stdout=PIPE)
fork_and_input(input_iterator, subproc.stdin)

for line in subproc.stdout:
    print line,
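
One possible alternative to forking, offered only as a sketch: feed the pipe from a background thread, so the main thread stays free to read stdout as it arrives. This assumes Python 3.7 or later (the code above is Python 2), and `feed_stdin` and `stream_through` are hypothetical helper names made up for illustration, not part of the subprocess module.

```python
import subprocess
import threading


def feed_stdin(lines, handle):
    """Write each line to handle, then close it so the child sees EOF."""
    try:
        for line in lines:
            handle.write(line)
    except BrokenPipeError:
        # The child exited before consuming all of the input.
        pass
    finally:
        try:
            handle.close()
        except BrokenPipeError:
            # Flushing buffered data on close can hit the same error.
            pass


def stream_through(cmd, lines):
    """Yield output lines from cmd while a thread feeds it from an iterator."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,  # str in and out; assumes Python 3.7+
    )
    feeder = threading.Thread(
        target=feed_stdin, args=(lines, proc.stdin), daemon=True
    )
    feeder.start()
    try:
        # Reading here while the thread writes keeps both pipes draining,
        # which is what avoids the deadlock of writing everything first.
        for out_line in proc.stdout:
            yield out_line
    finally:
        proc.stdout.close()
        feeder.join()
        proc.wait()
```

With this sketch, `for line in stream_through(["cat"], input_iterator): ...` would consume the iterator lazily, so neither the full input nor the full output is held in memory at once. Unlike the fork version, it does not duplicate the parent process, but the feeder thread shares the interpreter with the reader, so a very CPU-heavy input iterator may slow the reading loop.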