How to write a process pool in a bash shell script

Sil*_*ili 33 bash shell sh multiprocessing multiprocess

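I have more than 10 tasks to execute, and the system allows at most 4 tasks to run at the same time.

My tasks can be started like this: myprog taskname

How can I write a bash shell script to run these tasks? Most importantly, when one task finishes, the script should start another one immediately, so that the number of running tasks always stays at 4.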

小智 48

Use xargs:

xargs -P <maximum-number-of-processes-at-a-time> -n <arguments-per-process> <command>

Details here.
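As a concrete sketch of the xargs approach applied to the question: the task names below are made up, and a small `sh -c` command stands in for `myprog` so the example runs anywhere. With the real program, the last part of the pipeline would simply be `xargs -P 4 -n 1 myprog`. Note that `-P` is an extension found in GNU and BSD xargs, not part of POSIX.

```shell
#!/bin/bash
# Ten illustrative task names (assumed), one per line on xargs' stdin.
# -P 4 : run at most four processes at a time
# -n 1 : pass one task name to each invocation
# The sh -c '...' command is a stand-in for myprog; the trailing "sh"
# fills $0 so that the task name arrives as $1.
output=$(printf 'task%d\n' 1 2 3 4 5 6 7 8 9 10 |
    xargs -P 4 -n 1 sh -c 'sleep 0.1; echo "finished $1"' sh)

# Tasks finish in whatever order they complete, so the line order varies.
echo "${output}"
```

As soon as one process exits, xargs starts the next one, so four tasks stay in flight until the input runs out.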


小智 28

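I stumbled upon this thread while looking into writing my own process pool. I particularly liked Brandon Horsley's solution, but I couldn't get the signals to work right, so I took inspiration from Apache and decided to try a pre-fork model with a FIFO as my job queue.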

The following is the function that the worker processes run once they are forked.

# \brief the worker function that is called when we fork off worker processes
# \param[in] id  the worker ID
# \param[in] job_queue  the fifo to read jobs from
# \param[in] result_log  the temporary log file to write exit codes to
function _job_pool_worker()
{
    local id=$1
    local job_queue=$2
    local result_log=$3
    local line=

    exec 7<> ${job_queue}
    while [[ "${line}" != "${job_pool_end_of_jobs}" && -e "${job_queue}" ]]; do
        # workers block on the exclusive lock to read the job queue
        flock --exclusive 7
        read line <${job_queue}
        flock --unlock 7
        # the worker should exit if it sees the end-of-job marker or run the
        # job otherwise and save its exit code to the result log.
        if [[ "${line}" == "${job_pool_end_of_jobs}" ]]; then
            # write it one more time for the next sibling so that everyone
            # will know we are exiting.
            echo "${line}" >&7
        else
            _job_pool_echo "### _job_pool_worker-${id}: ${line}"
            # run the job
            { ${line} ; } 
            # now check the exit code and prepend "ERROR" to the result log entry
            # which we will use to count errors and then strip out later.
            local result=$?
            local status=
            if [[ "${result}" != "0" ]]; then
                status=ERROR
            fi  
            # now write the error to the log, making sure multiple processes
            # don't trample over each other.
            exec 8<> ${result_log}
            flock --exclusive 8
            echo "${status}job_pool: exited ${result}: ${line}" >> ${result_log}
            flock --unlock 8
            exec 8>&-
            _job_pool_echo "### _job_pool_worker-${id}: exited ${result}: ${line}"
        fi  
    done
    exec 7>&-
}
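The function above only shows the worker side. As a rough, self-contained sketch of the whole pre-fork pattern (this is not the author's job_pool.sh; the names `workdir`, `queue`, `lock`, `results`, `end_marker`, and `worker` are all assumptions made for illustration), the parent can create the FIFO, fork the workers, write one command per line, and finish with an end-of-jobs marker. It assumes Linux, where opening a FIFO read-write does not block, and `flock` from util-linux, which the original also relies on.

```shell
#!/bin/bash
# Illustrative names only; none of these come from job_pool.sh.
workdir=$(mktemp -d)
queue="${workdir}/queue"        # FIFO that carries one command per line
lock="${workdir}/lock"          # lock file that serializes reads
results="${workdir}/results"    # one line per finished job
end_marker="__END_OF_JOBS__"

mkfifo "${queue}"
: > "${lock}"
: > "${results}"

worker() {
    id=$1
    exec 7<>"${queue}"          # read-write open, so open() never blocks
    exec 9>"${lock}"            # each worker gets its own lock descriptor
    while :; do
        flock 9                 # only one worker reads at a time
        IFS= read -r line <&7
        flock -u 9
        if [ "${line}" = "${end_marker}" ]; then
            # Put the marker back so the next sibling sees it too.
            echo "${end_marker}" >&7
            break
        fi
        sh -c "${line}"
        echo "worker ${id}: exited $?: ${line}" >> "${results}"
    done
}

# Pre-fork four workers; each runs in its own background subshell.
for i in 1 2 3 4; do
    worker "${i}" &
done

# Feed ten jobs, then the end-of-jobs marker.
exec 7<>"${queue}"
for n in 1 2 3 4 5 6 7 8 9 10; do
    echo "sleep 0.1" >&7
done
echo "${end_marker}" >&7

wait                            # returns once every worker has exited
exec 7>&-

completed=$(wc -l < "${results}")
echo "completed ${completed} jobs"
rm -rf "${workdir}"
```

Each worker opens the lock file itself rather than inheriting a descriptor from the parent, because `flock` locks belong to the open file description and a shared descriptor would not exclude siblings.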

You can get a copy of my solution on Github. Here is a sample program that uses my implementation.

#!/bin/bash

. job_pool.sh

function foobar()
{
    # do something
    true
}   

# initialize the job pool to allow 3 parallel jobs and echo commands
job_pool_init 3 0

# run jobs
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run sleep 3
job_pool_run foobar
job_pool_run foobar
job_pool_run /bin/false

# wait until all jobs complete before continuing
job_pool_wait

# more jobs
job_pool_run /bin/false
job_pool_run sleep 1
job_pool_run sleep 2
job_pool_run foobar

# don't forget to shut down the job pool
job_pool_shutdown

# check the $job_pool_nerrors for the number of jobs that exited non-zero
echo "job_pool_nerrors: ${job_pool_nerrors}"

Hope this helps!


Ole*_*nge 15

With GNU Parallel you can do:

cat tasks | parallel -j4 myprog

If you have 4 cores, you can even just do:

cat tasks | parallel myprog

From http://git.savannah.gnu.org/cgit/parallel.git/tree/README:

Full installation

A full installation of GNU Parallel is as simple as:

./configure && make && make install

Personal installation

If you are not root, you can add ~/bin to your path and install in ~/bin and ~/share:

./configure --prefix=$HOME && make && make install

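Or, if your system lacks 'make', you can simply copy src/parallel src/sem src/niceload src/sql to a directory in your path.

Minimal installation

If you just need parallel and do not have 'make' installed (maybe the system is old or Microsoft Windows):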

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
mv parallel sem dir-in-your-$PATH/bin/

Test the installation

After this you should be able to do:

parallel -j0 ping -nc 3 ::: foss.org.my gnu.org freenetproject.org

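This will send 3 ping packets to 3 different hosts in parallel and print the output when they complete.

Watch the intro video for a quick introduction: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1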


Zhe*_*Mao 5

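I would suggest writing four scripts, each of which runs a certain number of the tasks one after another. Then write another script that starts the four scripts in parallel. For example, if you have the scripts script1.sh, script2.sh, script3.sh, and script4.sh, you could have a script called headscript.sh like this.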

#!/bin/sh
./script1.sh & 
./script2.sh & 
./script3.sh & 
./script4.sh &
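The same static split can be sketched in a single file, without four separate scripts. This is only an illustration under assumptions: the `myprog` function is a stand-in for the real program, and the task names, `nslots`, `run_slot`, and the results file are made up. Each of the four background loops takes every fourth task and runs its share sequentially.

```shell
#!/bin/bash
results=$(mktemp)

# Stand-in for the real myprog (assumption): pretend to work, then report.
myprog() {
    sleep 0.1
    echo "finished $1" >> "${results}"
}

tasks=(task1 task2 task3 task4 task5 task6 task7 task8 task9 task10)
nslots=4

# Run every nslots-th task, starting at index $1, one after another.
run_slot() {
    local slot=$1 i
    for ((i = slot; i < ${#tasks[@]}; i += nslots)); do
        myprog "${tasks[i]}"
    done
}

# Start the four sequential loops in parallel, like script1.sh..script4.sh.
for ((slot = 0; slot < nslots; slot++)); do
    run_slot "${slot}" &
done
wait    # block until all four loops are done

finished=$(wc -l < "${results}")
echo "finished ${finished} tasks"
```

The assignment of tasks to slots is fixed before anything runs, so a slot that gets only short tasks sits idle while the others are still working.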

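  • This is the simplest solution, but it works best only when the four scripts have roughly equal workloads. If job lengths are unpredictable, two scripts may already be finished while the other two still have many tasks left. The other solutions rebalance the workload properly. (3 upvotes)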
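For comparison, here is a hedged sketch of a dynamically balanced loop in plain bash. It uses a different technique from the answers above: the `wait -n` builtin, which needs bash 4.3 or later. The `myprog` function, the task names, `max_jobs`, and the results file are assumptions for illustration. A new task starts as soon as any running one exits, so uneven job lengths do not leave a slot idle.

```shell
#!/bin/bash
results=$(mktemp)

# Stand-in for the real myprog (assumption): pretend to work, then report.
myprog() {
    sleep 0.1
    echo "finished $1" >> "${results}"
}

tasks=(task1 task2 task3 task4 task5 task6 task7 task8 task9 task10)
max_jobs=4

for task in "${tasks[@]}"; do
    # While the pool is full, block until any one background job exits.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n
    done
    myprog "${task}" &
done
wait    # wait for the jobs that are still running

finished=$(wc -l < "${results}")
echo "finished ${finished} tasks"
```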