Python subprocess communicate() yields None when a list of numbers is expected

Question

When I run the following code

from subprocess import call, check_output, Popen, PIPE

gr = Popen(["grep", "'^>'", myfile], stdout=PIPE)
sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout)
gr.stdout.close()
out = sd.communicate()[0]
print out

where myfile looks like this:

>name len=345
sometexthere
>name2 len=4523
someothertexthere
...
...

I get

None

whereas the expected output is a list of numbers:

345
4523
...
...

The corresponding command that I run in the terminal is

grep "^>" myfile | sed "s/.*len=//" > outfile

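So far I have tried escaping and quoting in different ways, such as escaping the slashes in the sed expression or adding extra quotes for grep, but the number of possible combinations is large.

I have also considered reading the file in and writing Python equivalents of grep and sed, but the file is very large (although I could always read it line by line), the code will always run on UNIX-based systems, and I am still curious about where I went wrong.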

Could it be that

sd.communicate()[0]

returns some kind of object (rather than a list of integers)?

I know that in simple cases I can get the output with check_output:

sam = check_output(["samn", "stats", myfile])

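But I am not sure how to make it work in more complex cases where output is being piped from one process to another.

What are efficient ways to get the expected result with subprocess?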

Answer 1

Pad*_*ham 4

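As suggested, you need stdout=PIPE on the second process, and you need to remove the single quotes from "'^>'". Without stdout=PIPE, sed writes straight to the terminal and communicate() returns None for stdout. The single quotes are passed to grep literally, because no shell is involved to strip them: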

gr = Popen(["grep", "^>", myfile], stdout=PIPE)
Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
......

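For completeness, the two fixes above can be put together roughly as follows. This is only a sketch under a few assumptions: Python 3, grep and sed available on PATH, and a hypothetical helper name `extract_lengths` plus a temporary sample file that are not part of the original question.

```python
import os
import tempfile
from subprocess import PIPE, Popen


def extract_lengths(path):
    """Run the equivalent of `grep '^>' path | sed 's/.*len=//'` and return its output lines."""
    # No shell is involved, so the pattern is passed to grep as-is, without extra quotes.
    gr = Popen(["grep", "^>", path], stdout=PIPE)
    # stdout=PIPE on the last process is what makes communicate() return the output.
    sd = Popen(["sed", "s/.*len=//"], stdin=gr.stdout, stdout=PIPE)
    gr.stdout.close()  # lets grep receive SIGPIPE if sed exits first
    out = sd.communicate()[0]  # bytes in Python 3
    gr.wait()
    return out.decode().splitlines()


# Hypothetical sample file that mirrors the format shown in the question
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(">name len=345\nsometexthere\n>name2 len=4523\nsomeothertexthere\n")

lengths = extract_lengths(tmp.name)
os.remove(tmp.name)
print(lengths)
```

Note that communicate() returns raw output rather than a list of integers, so the sketch decodes it and splits it into lines; converting each entry with int() would be a further step.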
But this can be done simply in pure Python using re:

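import re

# A raw string avoids the invalid "\>" escape; ">" needs no escaping in a regex.
r = re.compile(r"^>.*len=(.*)$")
with open("test.txt") as f:
    for line in f:
        m = r.search(line)
        if m:
            # group(1) is everything after "len=" on a header line
            print(m.group(1))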

This outputs:

345
4523

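If the lines starting with > always contain a number, and the number always comes at the end of the line after len=, then you do not actually need a regex either: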

with open("test.txt") as f:
    for line in f:
        if line.startswith(">"):
            # rstrip() drops the trailing newline so print() does not emit blank lines
            print(line.rsplit("len=", 1)[1].rstrip())

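Since the question asks for a list of numbers, the line-by-line approach could also collect integers instead of printing strings. The following is a minimal sketch, assuming Python 3; the generator name `read_lengths` and the in-memory sample are hypothetical and only stand in for the real file.

```python
import io


def read_lengths(lines):
    """Yield the integer after 'len=' for every header line starting with '>'."""
    for line in lines:
        if line.startswith(">") and "len=" in line:
            # int() ignores surrounding whitespace, including the trailing newline
            yield int(line.rsplit("len=", 1)[1])


# Hypothetical in-memory sample standing in for the real (large) file
sample = io.StringIO(">name len=345\nsometexthere\n>name2 len=4523\nsomeothertexthere\n")
lengths = list(read_lengths(sample))
print(lengths)
```

Because the generator consumes one line at a time, passing an open file object instead of the sample should keep memory use low even for a very large file.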