Start an asyncio event loop in a separate thread and consume queue items

use*_*734 1 python queue concurrency task python-asyncio

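I'm writing a Python program that concurrently runs tasks taken from a queue, as a way to learn asyncio.

Items are put onto the queue by interacting with the main thread (within the REPL). Whenever a task is put onto the queue, it should be consumed and executed immediately. My approach is to start a separate thread and pass the queue to the event loop running in that thread.

The tasks do run, but only sequentially, and it is not clear to me how to run them concurrently. My attempt is as follows: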

import asyncio
import time
import queue
import threading

def do_it(task_queue):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        while True:
            item = task_queue.get()
            if item == _sentinel:
                break
            await asyncio.create_task(process(*item))

    print('event loop start')
    asyncio.run(main())
    print('event loop end')


if __name__ == '__main__':
    tasks = queue.Queue()
    th = threading.Thread(target=do_it, args=(tasks,))
    th.start()

    tasks.put(('abc', 5))
    tasks.put(('def', 3))

Any advice pointing me towards running these tasks concurrently would be greatly appreciated!
Thanks

UPDATE
Thank you Frank Yellin and cynthi8! I reworked main() according to your advice:

  • Removed the await before asyncio.create_task - this fixed the concurrency
  • Added a waiting while loop so that main does not return prematurely
  • Used the non-blocking mode of Queue.get()

The program now runs as expected.
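
The reworked main() is not shown in the question. As a rough sketch only, the three changes listed above might look something like the following. The names `pending`, `log` and `run_demo`, the 0.01 s poll interval and the shortened sleep durations are assumptions made for illustration, not the asker's actual code.

```python
import asyncio
import queue

_SENTINEL = 'STOP'

async def process(name, total_time, log):
    # Simplified stand-in for the process() coroutine in the question.
    log.append(f'{name} start')
    await asyncio.sleep(total_time)
    log.append(f'{name} done')

async def main(task_queue, log):
    pending = set()
    while True:
        try:
            # Non-blocking get: raises queue.Empty instead of blocking the loop.
            item = task_queue.get_nowait()
        except queue.Empty:
            # Yield to the event loop so already-started tasks can make progress.
            await asyncio.sleep(0.01)
            continue
        if item == _SENTINEL:
            break
        # No await here, so the task runs in the background.
        pending.add(asyncio.create_task(process(*item, log)))
    # Polling wait loop so that main() does not return while tasks are running.
    while any(not t.done() for t in pending):
        await asyncio.sleep(0.01)

def run_demo():
    q = queue.Queue()
    q.put(('abc', 0.2))
    q.put(('def', 0.1))
    q.put(_SENTINEL)
    log = []
    asyncio.run(main(q, log))
    return log

if __name__ == '__main__':
    print(run_demo())
```

Both tasks start before either finishes, which is the concurrent behaviour the question was after. The cost is that the loop has to wake up every poll interval even when the queue is empty.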

UPDATE 2
user4815162342 offered further improvements; I have annotated his suggestions in the code below.

'''
Starts auxiliary thread which establishes a queue and consumes tasks within a
queue.
    
Allow enqueueing of tasks from within __main__ and termination of aux thread
'''
import asyncio
import time
import threading
import functools

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        print(f'{clock()} {name}_{total_time}:', 'Started.')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(f'{clock()} {name}_{total_time}:', 'Processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(f'{clock()} {name}_{total_time}:', 'Done.')

    async def main():
        # get_running_loop() returns the event loop running in the current OS
        # thread; store it on the Event so the __main__ thread can reach it
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                # task_done is used to tell join when the work in the queue is 
                # actually finished. A queue length of zero does not mean work
                # is complete.
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            # Add a callback to be run when the Task is done.
            # Indicate that a formerly enqueued task is complete. Used by queue 
            # consumer threads. For each get() used to fetch a task, a 
            # subsequent call to task_done() tells the queue that the processing
            # on the task is complete.
            task.add_done_callback(lambda _: task_queue.task_done())            

        # keep loop going until all the work has completed
        # When the count of unfinished tasks drops to zero, join() unblocks.
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    # started Event is used for communication with thread th
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    # started.wait() blocks until started.set(), ensuring that the tasks and
    # loop variables are available from the event loop thread
    started.wait()
    tasks, loop = started.queue, started.loop

    # call_soon schedules a callback to be called with the given arguments
    # at the next iteration of the event loop.
    # call_soon_threadsafe is required to schedule callbacks from another thread 
    
    # put_nowait enqueues items in non-blocking fashion, == put(block=False)
    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

use*_*342 6

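As others have pointed out, the problem with your code is that it uses a blocking queue, which halts the event loop while waiting for the next item. The problem with the proposed solution, however, is that it introduces latency, because it must occasionally sleep to allow other tasks to run. Besides introducing latency, it also prevents the program from ever going idle, even when there are no items in the queue.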

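An alternative is to switch to an asyncio queue, which is designed for use with asyncio. This queue must be created inside the running loop, so you cannot pass it to do_it; you have to retrieve it from there instead. Also, since it is an asyncio primitive, its put method must be invoked through call_soon_threadsafe to ensure that the event loop notices it.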
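
To make the difference concrete, here is a small sketch. It assumes a hypothetical `ticker` task that stands in for other work on the loop, and counts how often that task runs while the main coroutine waits on a blocking `queue.Queue` versus an `asyncio.Queue` fed through `call_soon_threadsafe`. The helper names and timings are illustrative assumptions, not part of the original code.

```python
import asyncio
import queue
import threading

async def ticker(ticks):
    # Stand-in for "other tasks": records one tick every 10 ms.
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)

async def demo():
    ticks = []
    ticker_task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0.05)  # let the ticker get going

    # Case 1: blocking queue. get() blocks the whole event loop thread.
    blocking_q = queue.Queue()
    threading.Timer(0.2, blocking_q.put, args=('item',)).start()
    before = len(ticks)
    blocking_q.get()
    ticks_while_blocking = len(ticks) - before

    # Case 2: asyncio queue, filled from another thread via the loop.
    async_q = asyncio.Queue()
    loop = asyncio.get_running_loop()
    threading.Timer(
        0.2, loop.call_soon_threadsafe, args=(async_q.put_nowait, 'item')
    ).start()
    before = len(ticks)
    await async_q.get()  # suspends only this coroutine
    ticks_while_awaiting = len(ticks) - before

    ticker_task.cancel()
    return ticks_while_blocking, ticks_while_awaiting

if __name__ == '__main__':
    print(asyncio.run(demo()))
```

In this sketch the first count is zero, because nothing else on the loop can run while `queue.Queue.get()` blocks the thread. The second count is positive, because `await async_q.get()` hands control back to the loop until the item arrives.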

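One final issue is that your main() function uses another busy loop to wait for all the tasks to complete. This can be avoided by using Queue.join, which is designed for exactly this use case.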

Here is your code, adapted to incorporate all of the above suggestions, with the process function left unchanged from your original:

import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            task.add_done_callback(lambda _: task_queue.task_done())
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    started.wait()
    tasks, loop = started.queue, started.loop

    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')

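Note: a separate issue with your code is that it awaits the result of create_task(), which defeats the purpose of create_task() because the task is never allowed to run in the background. (This is equivalent to immediately joining a thread you have just started: you can do it, but it doesn't make much sense.) This issue is fixed both in the code above and in your edit to the question.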
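
As a minimal illustration of that note, the sketch below times the two patterns side by side. The coroutine names `work`, `sequential` and `concurrent` and the 0.3 s delays are assumptions chosen for the example.

```python
import asyncio
import time

async def work(delay):
    await asyncio.sleep(delay)

async def sequential(delays):
    start = time.monotonic()
    for d in delays:
        # Awaiting the task right away: the next one cannot start
        # until this one has finished.
        await asyncio.create_task(work(d))
    return time.monotonic() - start

async def concurrent(delays):
    start = time.monotonic()
    # Create all tasks first so they run in the background together,
    # then wait for the whole batch.
    tasks = [asyncio.create_task(work(d)) for d in delays]
    await asyncio.gather(*tasks)
    return time.monotonic() - start

if __name__ == '__main__':
    print('sequential:', asyncio.run(sequential([0.3, 0.3])))
    print('concurrent:', asyncio.run(concurrent([0.3, 0.3])))
```

With two 0.3 s delays, the sequential version takes roughly the sum of the delays, while the concurrent version takes roughly the longest single delay.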