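I have a program that generates output indefinitely. I want to sample that output for one second and pipe it into gzip. I'm using the timeout utility to limit execution, but the problem is that gzip gets killed as well.
For example: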
$ /usr/bin/timeout 1 bash -c "echo asdf; sleep 5" | gzip > /tmp/foo.gz; ls -lah /tmp/foo.gz
Terminated
-rw-rw-r-- 1 haizaar haizaar 0 Jul 22 15:05 /tmp/foo.gz
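As you can see, the gzip command is Terminated, so its output is an empty file (the buffered data is lost).
I don't understand how timeout manages to kill the process that reads its standard output, or how to fix it.
Even wrapping the whole thing in another bash gives the same result: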
$ bash -c '/usr/bin/timeout 1 bash -c "echo asdf; sleep 5"' | gzip > /tmp/foo.gz; ls -lah /tmp/foo.gz
Terminated
-rw-rw-r-- 1 haizaar haizaar 0 Jul 22 15:30 /tmp/foo.gz
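I can prefix timeout with setsid, and then it works, which makes me think the issue is related to process groups getting mixed up. Still, I find it hard to accept that the current behavior is "by design", because it makes the timeout command very tricky to use in shell pipelines.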
Environment:
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.2 LTS"
$ bash --version
GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$ timeout --version
timeout (GNU coreutils) 8.30
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Padraig Brady.
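Update
KamilCuk's strace analysis is spot on. It also explains why wrapping timeout in another bash doesn't help: bash seems to have an optimization where, if it has only a single command to run, it doesn't fork but execs, replacing itself. However, if you add another command to the wrapper, bash will fork, thereby creating a new process group and limiting the blast radius of the timeout command. I.e.: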
bash -c 'true; /usr/bin/timeout 1 bash -c "echo asdf; sleep 5"' | gzip > /tmp/foo.gz
(Note the leading true.)
I still think using timeout in pipelines is black magic, but that's another story.
$ strace -ff -e trace=setpgid,kill,exit_group,exit,execve,wait4 bash --norc --noprofile -ic "timeout -v 1 bash --norc --noprofile -c 'echo asdf ; sleep 5' | { sleep 2; echo 123; }"
execve("/usr/bin/bash", ["bash", "--norc", "--noprofile", "-ic", "timeout -v 1 bash --norc --nopro"...], 0x7ffeb8ef7ef8 /* 76 vars */) = 0
setpgid(0, 28995) = 0
strace: Process 28996 attached
[pid 28995] setpgid(28996, 28996) = 0
[pid 28996] setpgid(28996, 28996) = 0
strace: Process 28997 attached
[pid 28995] setpgid(28997, 28996) = 0
[pid 28995] wait4(-1, <unfinished ...>
[pid 28997] setpgid(28997, 28996) = 0
[pid 28996] execve("/usr/bin/timeout", ["timeout", "-v", "1", "bash", "--norc", "--noprofile", "-c", "echo asdf ; sleep 5"], 0x560da0ff57e0 /* 76 vars */strace: Process 28998 attached
) = 0
[pid 28997] wait4(-1, <unfinished ...>
[pid 28998] execve("/usr/bin/sleep", ["sleep", "2"], 0x560da0ff57e0 /* 76 vars */) = 0
[pid 28996] setpgid(0, 0) = 0
strace: Process 28999 attached
[pid 28996] wait4(28999, 0x7ffd7eb5e96c, WNOHANG, NULL) = 0
[pid 28999] execve("/usr/local/bin/bash", ["bash", "--norc", "--noprofile", "-c", "echo asdf ; sleep 5"], 0x7ffd7eb5ec10 /* 76 vars */) = -1 ENOENT (No such file or directory)
[pid 28999] execve("/usr/bin/bash", ["bash", "--norc", "--noprofile", "-c", "echo asdf ; sleep 5"], 0x7ffd7eb5ec10 /* 76 vars */) = 0
[pid 28999] execve("/usr/bin/sleep", ["sleep", "5"], 0x55a84be27270 /* 76 vars */) = 0
[pid 28996] --- SIGALRM {si_signo=SIGALRM, si_code=SI_TIMER, si_timerid=0, si_overrun=0, si_int=0, si_ptr=NULL} ---
timeout: sending signal TERM to command ‘bash’
[pid 28996] kill(28999, SIGTERM) = 0
[pid 28999] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=28996, si_uid=1000} ---
[pid 28996] kill(0, SIGTERM <unfinished ...>
[pid 28997] <... wait4 resumed>0x7ffc114a9600, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 28996] <... kill resumed>) = 0
[pid 28999] +++ killed by SIGTERM +++
[pid 28998] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=28996, si_uid=1000} ---
[pid 28997] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=28996, si_uid=1000} ---
[pid 28996] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=28996, si_uid=1000} ---
[pid 28996] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=28999, si_uid=1000, si_status=SIGTERM, si_utime=0, si_stime=0} ---
[pid 28997] +++ killed by SIGTERM +++
[pid 28995] <... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGTERM}], WSTOPPED|WCONTINUED, NULL) = 28997
[pid 28998] +++ killed by SIGTERM +++
[pid 28995] wait4(-1, <unfinished ...>
[pid 28996] kill(28999, SIGCONT) = 0
[pid 28996] kill(0, SIGCONT) = 0
[pid 28996] --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=28996, si_uid=1000} ---
[pid 28996] wait4(28999, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGTERM}], WNOHANG, NULL) = 28999
[pid 28996] exit_group(124) = ?
[pid 28996] +++ exited with 124 +++
<... wait4 resumed>[{WIFEXITED(s) && WEXITSTATUS(s) == 124}], WSTOPPED|WCONTINUED, NULL) = 28996
Terminated
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=28997, si_uid=1000, si_status=SIGTERM, si_utime=0, si_stime=0} ---
wait4(-1, 0x7ffc114a9710, WNOHANG|WSTOPPED|WCONTINUED, NULL) = -1 ECHILD (No child processes)
setpgid(0, 28992) = 0
exit_group(143) = ?
+++ exited with 143 +++
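So what happens is that timeout tries to be clever and kills the whole process group. As I understand it, the sequence is:
bash puts the pipeline into its own process group: setpgid(28996, 28996)
timeout makes itself a process group leader, which changes nothing here because it already leads the pipeline's group: setpgid(0, 0)
timeout kills the whole process group: kill(0, SIGTERM)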
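To illustrate the last step: a signal sent to PID 0 is delivered to every process in the sender's process group. The sketch below is not taken from the trace. The variable names and the use of setsid --wait (util-linux) to contain the demo in its own session are assumptions made for illustration.

```shell
# Run the demo in a separate session so the group-wide signal stays contained.
statuses=$(setsid --wait bash -c '
  trap ":" TERM          # handle TERM in the sender so it survives to report
  sleep 30 & a=$!        # two background children in the same process group
  sleep 30 & b=$!
  sleep 0.2              # give the children time to start
  kill -TERM 0           # PID 0 means: every process in my process group
  wait "$a"; sa=$?
  wait "$b"; sb=$?
  echo "a=$sa b=$sb"     # a status of 143 (128 + 15) indicates death by SIGTERM
')
echo "$statuses"
```

Both children die even though neither was signalled by PID, which mirrors what happens to gzip in the pipeline.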
You can use command grouping { ... } to make bash start a new process group for the left-hand side.
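A minimal sketch of the grouping idea, applied to the command from the question. It assumes bash forks a child for the command inside the braces, so that timeout is no longer the first process of the pipeline and ends up in a process group of its own. Whether bash forks here may depend on the bash version, so treat this as something to verify on your system.

```shell
# The braces turn the left-hand side into a compound command; timeout is then
# expected to run as a forked child, so its group-wide signal misses gzip.
{ timeout 1 bash -c "echo asdf; sleep 5"; } | gzip > /tmp/foo.gz
ls -lah /tmp/foo.gz
```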
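You can use timeout --foreground, but then timeout will kill only the foreground process. So although bash will die, gzip will keep waiting for the sleep 5 that is still running in the background, because that process holds the pipe feeding gzip's stdin open.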
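One way to avoid that wait, sketched under the assumption that the long-running program can be started with exec: the inner bash is then replaced by the program itself, so the single signal sent by timeout --foreground ends the only process holding the pipe open. The exec is an addition for illustration and is not part of the original command.

```shell
# --foreground: timeout signals only the command it started, not the whole group.
# exec replaces the inner bash with sleep, so that one signal also ends the last
# writer on the pipe and gzip sees end-of-input after about one second.
timeout --foreground 1 bash -c "echo asdf; exec sleep 5" | gzip > /tmp/foo.gz
gzip -dc /tmp/foo.gz
```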
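My guess (also based on the commit message) is that this is intentional, so that timeout can kill a whole pipeline as if it were a magic shell builtin.
Also, the behavior differs depending on whether job control is enabled or disabled, and therefore between interactive and non-interactive shells.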