Group of chains with positional arguments in partial tasks using Celery

eri*_*rip 5 python asynchronous celery celery-task python-3.x

I am writing an application that will asynchronously execute a group of multiple synchronous chains of tasks.

In other words, I might have the pipeline foo(a,b,c) -> boo(a,b,c) for some list of bs.

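My understanding is that I should create a chain foo(a,b,c) | boo(a,b,c) for each b in this list. These chains then form a Celery group, which can be applied asynchronously.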

My code is as follows:

my_app.py

#!/usr/bin/env python3

import functools
import time

from celery import chain, group, Celery
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

app = Celery("my_app", broker='redis://localhost:6379/0', backend='redis://localhost:6379/0')

@app.task
def foo(a, b, c):
    logger.info("foo from {0}!".format(b))
    return b

@app.task
def boo(a, b, c):
    logger.info("boo from {0}!".format(b))
    return b

def break_up_tasks(tasks):
    try:
        first_task, *remaining_tasks = tasks
    except ValueError as e:
        first_task, remaining_tasks = [], []
    return first_task, remaining_tasks

def do_tasks(a, bs, c, opts):
    tasks = [foo, boo]

    # There should be an option for each task
    if len(opts) != len(tasks):
        raise ValueError("There should be {0} provided options".format(len(tasks)))

    # Create a list of tasks that should be included per the list of options' boolean values
    tasks = [task for opt, task in zip(opts, tasks) if opt]

    first_task, remaining_tasks = break_up_tasks(tasks)

    # If there are no tasks, we're done.
    if not first_task: return

    chains = (
        functools.reduce(
            # `a` should be provided by `apply_async`'s `args` kwarg
            # `b` should be provided by previous partials in chain
            lambda x, y: x | y.s(c),
            remaining_tasks, first_task.s(a, b, c)
        ) for b in bs
    )

    g = group(*chains)
    res = g.apply_async(args=(a,), queue="default")
    print("Applied async... waiting for termination.")

    total_tasks = len(tasks)

    while not res.ready():
        print("Waiting... {0}/{1} tasks complete".format(res.completed_count(), total_tasks))
        time.sleep(1)

if __name__ == "__main__":
    a = "whatever"
    bs = ["hello", "world"]
    c = "baz"

    opts = [
        # do "foo"
        True,
        # do "boo"
        True
    ]

    do_tasks(a, bs, c, opts)

Running Celery

celery worker -A my_app -l info -c 5 -Q default

However, when I run the above, my client loops forever because boo is missing an argument:

TypeError: boo() missing 1 required positional argument: 'c'

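My understanding is that apply_async will provide the args kwarg to each chain, and that previous links in the chain will provide their return value to subsequent links.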

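Why is boo not receiving its arguments properly? I'm sure these tasks aren't well written, as this is my first foray into Celery. If you have other suggestions, I'm happy to hear them.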

小智 4

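After debugging your code (I'm new to Celery too! :)) I learned that each function in a chain receives the result of the previous call in the chain as its first argument. With that in mind, I believe the solution to your problem is to add the missing argument (the second one, b) to y.s in the reduce: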

chains = (
    functools.reduce(
        # `a` should be provided by `apply_async`'s `args` kwarg
        # `b` should be provided by previous partials in chain
        lambda x, y: x | y.s(b,c), # <- here is the 'new guy'
        remaining_tasks, first_task.s(a, b, c)
    ) for b in bs
)
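To make the argument passing easier to see, here is a small pure-Python model of what I think is going on. It is only a sketch and does not use Celery at all: sig and run_chain are hypothetical helpers I made up, standing in for task.s(...) and for the chain running its links. It assumes, as in Celery chains, that the previous task's return value is put in front of the arguments stored in the next link.

```python
# Pure-Python model of chain argument passing (NOT Celery itself).
# Assumption: a partial signature stores some positional args, and when the
# chain runs a link, the previous link's return value is prepended to them.

def foo(a, b, c):
    return b

def boo(a, b, c):
    return b

def sig(func, *args):
    """Hypothetical stand-in for `task.s(*args)`: a (function, stored args) pair."""
    return (func, args)

def run_chain(links):
    """Call each link in order, prepending the previous result to its stored args."""
    func, args = links[0]
    result = func(*args)
    for func, args in links[1:]:
        result = func(result, *args)
    return result

a, b, c = "whatever", "hello", "baz"

# Question's version: boo.s(c) ends up called as boo(<foo's result>, c),
# so the third parameter `c` is never filled.
try:
    run_chain([sig(foo, a, b, c), sig(boo, c)])
    error = None
except TypeError as exc:
    error = str(exc)

# Answer's version: boo.s(b, c) ends up called as boo(<foo's result>, b, c).
fixed_result = run_chain([sig(foo, a, b, c), sig(boo, b, c)])

print(error)
print(fixed_result)
```

One thing this model makes visible: with the fix, boo's first parameter (a) receives foo's return value, not the original a. If boo really needs the original a, it would have to be passed some other way, for example by having foo return it.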

Hope that helps.