mar*_*ark 49 bash scripting job-control
为了最大化CPU使用率(我在EC2中的Debian Lenny上运行)我有一个简单的脚本来并行启动作业:
#!/bin/bash
for i in apache-200901*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200902*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200903*.log; do echo "Processing $i ..."; do_something_important; done &
for i in apache-200904*.log; do echo "Processing $i ..."; do_something_important; done &
...
Run Code Online (Sandbox Code Playgroud)
我对这个工作解决方案非常满意,但是我无法弄清楚如何编写进一步的代码,只有在所有循环完成后才执行.
有没有办法控制这个?
edu*_*ffy 78
有一个bash内置的命令.
wait [n ...]
Wait for each specified process and return its termination sta?
tus. Each n may be a process ID or a job specification; if a
job spec is given, all processes in that job’s pipeline are
waited for. If n is not given, all currently active child pro?
cesses are waited for, and the return status is zero. If n
specifies a non-existent process or job, the return status is
127. Otherwise, the return status is the exit status of the
last process or job waited for.
Run Code Online (Sandbox Code Playgroud)
Ole*_*nge 27
使用GNU Parallel将使您的脚本更短,可能更高效:
parallel 'echo "Processing "{}" ..."; do_something_important {}' ::: apache-*.log
Run Code Online (Sandbox Code Playgroud)
这将为每个CPU核心运行一个作业,并继续执行此操作,直到处理完所有文件.
您的解决方案基本上会在运行之前将作业分成组.这里有4组32个职位:

GNU Parallel会在完成后生成一个新进程 - 保持CPU处于活动状态,从而节省时间:

了解更多:
一个最小的例子wait $(jobs -p):
for i in {1..3}
do
(echo "process $i started" && sleep 5 && echo "process $i finished")&
done
sleep 0.1 # For sequential output
echo "Waiting for processes to finish"
wait $(jobs -p)
echo "All processes finished"
Run Code Online (Sandbox Code Playgroud)
示例输出:
process 1 started
process 2 started
process 3 started
Waiting for processes to finish
process 2 finished
process 1 finished
process 3 finished
All processes finished
Run Code Online (Sandbox Code Playgroud)
我最近不得不这样做,最终得到以下解决方案:
while true; do
wait -n || {
code="$?"
([[ $code = "127" ]] && exit 0 || exit "$code")
break
}
done;
Run Code Online (Sandbox Code Playgroud)
运作方式如下:
wait -n(可能有许多)后台作业之一退出后立即退出。它始终求值为true,并且循环一直进行到:
127:上一个后台作业成功退出。在这种情况下,我们将忽略退出代码,并退出代码为0的子外壳。使用set -e,这将确保脚本将尽早终止并传递任何失败的后台作业的退出代码。
| 归档时间: |
|
| 查看次数: |
29084 次 |
| 最近记录: |