aiohttp日志记录:如何区分不同请求的日志消息?

Mes*_*ssa 5 python logging python-asyncio aiohttp

想象一下,我有一个基于 Aiohttp 的 Web 应用程序:

from aiohttp import web
import asyncio
import logging

logger = logging.getLogger(__name__)

async def hello(request):
    logger.info('Started processing request')
    await asyncio.sleep(1)
    logger.info('Doing something')
    await asyncio.sleep(1)
    return web.Response(text="Hello, world!\n")

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(name)-14s %(levelname)s: %(message)s')

app = web.Application()
app.add_routes([web.get('/', hello)])
web.run_app(app)
Run Code Online (Sandbox Code Playgroud)

它的输出是(例如):

2019-11-11 13:37:14,757 __main__       INFO: Started processing request
2019-11-11 13:37:14,757 __main__       INFO: Started processing request
2019-11-11 13:37:15,761 __main__       INFO: Doing something
2019-11-11 13:37:15,761 __main__       INFO: Doing something
2019-11-11 13:37:16,765 aiohttp.access INFO: 127.0.0.1 [11/Nov/2019:12:37:14 +0000] "GET / HTTP/1.1" 200 165 "-" "curl/7.66.0"
2019-11-11 13:37:16,768 aiohttp.access INFO: 127.0.0.1 [11/Nov/2019:12:37:14 +0000] "GET / HTTP/1.1" 200 165 "-" "curl/7.66.0"
Run Code Online (Sandbox Code Playgroud)

我如何知道哪些日志消息属于哪些请求?

我想在每条日志消息中看到一些“请求 ID”(类似于微服务中的“相关 ID”)...

Mes*_*ssa 10

在“经典”非异步 Web 应用程序中,这很简单 - 一个进程(或线程)一次仅处理一个请求,因此您只需记录进程/线程 ID(日志格式:)%(process)d %(thread)d

在异步 (asyncio) 程序中,通常在单个线程中的事件循环中运行多个不同的事物(在 Web 应用程序中:正在处理不同的请求),因此记录进程/线程 ID 是不够的。您需要以某种方式识别不是操作系统线程,而是相关异步任务的“线程”——这就是ContextVar的用途。

第 1 步:创建上下文变量

request_id = ContextVar('request_id')
Run Code Online (Sandbox Code Playgroud)

第 2 步:为每个请求设置这个 contextvar 值

@web.middleware
async def add_request_id_middleware(request, handler):
    '''
    Aiohttp middleware that sets request_id contextvar and request['request_id']
    to some random value identifying the given request.
    '''
    req_id = secrets.token_urlsafe(5).replace('_', 'x').replace('-', 'X')
    request['request_id'] = req_id
    token = request_id.set(req_id)
    try:
            return await handler(request)
    finally:
        request_id.reset(token)

app = web.Application(middlewares=[add_request_id_middleware])
Run Code Online (Sandbox Code Playgroud)

第 3 步:在每条日志消息中自动插入这个 contextvar 值

def setup_log_record_factory():
    '''
    Wrap logging request factory so that [{request_id}] is prepended to each message
    '''
    old_factory = logging.getLogRecordFactory()

    def new_factory(*args, **kwargs):
        record = old_factory(*args, **kwargs)
        req_id = request_id.get(None)
        if req_id:
            record.msg = f'[{req_id}] {record.msg}'
        return record

    logging.setLogRecordFactory(new_factory)

setup_log_record_factory()
Run Code Online (Sandbox Code Playgroud)

第 4 步:由于 aiohttp 请求访问日志消息记录在我们设置上下文变量的范围之外,我们需要定义自己的 AccessLogger 来解决这个问题:

from aiohttp.web_log import AccessLogger

class CustomAccessLogger (AccessLogger):

    def log(self, request, response, time):
        token = request_id.set(request['request_id'])
        try:
            super().log(request, response, time)
        finally:
            request_id.reset(token)

web.run_app(app, access_log_class=CustomAccessLogger)
Run Code Online (Sandbox Code Playgroud)

完成示例输出:

2019-11-11 13:49:34,167 __main__       INFO: [cNniXu8] Started processing request
2019-11-11 13:49:34,168 __main__       INFO: [oWzMYds] Started processing request
2019-11-11 13:49:35,169 __main__       INFO: [cNniXu8] Doing something
2019-11-11 13:49:35,169 __main__       INFO: [oWzMYds] Doing something
2019-11-11 13:49:36,172 aiohttp.access INFO: [cNniXu8] 127.0.0.1 [11/Nov/2019:12:49:34 +0000] "GET / HTTP/1.1" 200 165 "-" "curl/7.66.0"
2019-11-11 13:49:36,174 aiohttp.access INFO: [oWzMYds] 127.0.0.1 [11/Nov/2019:12:49:34 +0000] "GET / HTTP/1.1" 200 165 "-" "curl/7.66.0"
Run Code Online (Sandbox Code Playgroud)

完整源代码在这里:gist.github.com/messa/c538fc267550ec67a1fed244183dcf1e

更新:我为此创建了一个库:) github.com/messa/aiohttp-request-id-logging