使主管正确停止芹菜工人

Mit*_*ril 4 python celery-task supervisord django-celery django-supervisor

我在使用芹菜时遇到了很多奇怪的事情。比如,我更新tasks.py,supervisorctl reload(重启),但是tasks错了。有些任务似乎消失了等等。
今天我发现,因为supervisorctl stop all不能阻止所有的芹菜工人。并且只有 kill -9 'pgrep python' 可以将它们全部杀死。

情况:

    root@ubuntu12:/data/www/article_fetcher# supervisorctl
    celery_beat                      RUNNING    pid 29597, uptime 0:52:18
    celery_worker1                   RUNNING    pid 29556, uptime 0:52:20
    celery_worker2                   RUNNING    pid 29570, uptime 0:52:19
    celery_worker3                   RUNNING    pid 29557, uptime 0:52:20
    celery_worker4                   RUNNING    pid 29586, uptime 0:52:18
    uwsgi                            RUNNING    pid 29604, uptime 0:52:18
    supervisor> stop all
    celery_beat: stopped
    celery_worker2: stopped
    celery_worker4: stopped
    celery_worker3: stopped
    uwsgi: stopped
    celery_worker1: stopped
    supervisor> status
    celery_beat                      STOPPED    Aug 04 11:05 AM
    celery_worker1                   STOPPED    Aug 04 11:05 AM
    celery_worker2                   STOPPED    Aug 04 11:05 AM
    celery_worker3                   STOPPED    Aug 04 11:05 AM
    celery_worker4                   STOPPED    Aug 04 11:05 AM
    uwsgi                            STOPPED    Aug 04 11:05 AM
Run Code Online (Sandbox Code Playgroud)

流程:

root@ubuntu12:~# ps -aux|grep 'python'
Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html
root      8683  0.0  0.1  61420 11768 ?        Ss   Aug03   0:27 /usr/bin/python /usr/bin/supervisord
root     29310  0.1  0.1  57120 11344 pts/2    S+   11:05   0:00 /usr/bin/python /usr/bin/supervisorctl
nobody   29556  2.2  0.5 132484 45988 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody   29557  2.2  0.5 132480 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody   29570  2.4  0.5 132740 45996 ?        S    11:06   0:00 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
nobody   29571 26.9  1.4 217688 115804 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody   29572 33.7  0.7 158396 59808 ?        R    11:06   0:12 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
nobody   29573 29.6  1.4 215176 115928 ?       R    11:06   0:10 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W1 -Ofair --app=celery_worker:app
nobody   29574 27.2  1.4 218244 118180 ?       R    11:06   0:09 /data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W3 -Ofair --app=celery_worker:app
......
......
......
Run Code Online (Sandbox Code Playgroud)

我发现了这个问题:停止主管不会停止芹菜工人,但它在问不同的事情,接受的答案supervisorctl stop all实际上不起作用。所以我决定找到正确的方法。

Mit*_*ril 6

我查看主管文档并发现:

杀戮组

如果为 true,则在向程序发送 SIGKILL 以终止它时,将其发送到其整个进程组,同时照顾其子进程,这对于使用多处理的 Python 程序很有用。

默认值:假

要求:否。

引入:3.0a11

然后我认为每个工人创建 4 个子进程(由 cpu 内核)成为一个进程组,这就是为什么supervisorctl stop all不起作用。
所以我添加killasgroup到 supervisord.conf:

    [program:celery_worker1]
    ; Set full path to celery program if using virtualenv

    directory=/data/www/article_fetcher

    command=/data/www/article_fetcher/venv/bin/python /data/www/article_fetcher/manage.py celery worker -n W2 -Ofair --app=celery_worker:app
    user=nobody
    numprocs=1
    stdout_logfile=/data/www/article_fetcher/logs/celery.log
    stderr_logfile=/data/www/article_fetcher/logs/celery.log
    autostart=true
    autorestart=true
    startsecs=5
    killasgroup=true

    .....
    .....
Run Code Online (Sandbox Code Playgroud)

然后supervisorctl stop all真的停止芹菜工人!很好~