Using FastAPI, how do I add a charset to the Content-Type (media type) of the request header in the OpenAPI (Swagger) docs?

Alu*_*uko 3 python-3.x swagger openapi fastapi

Using FastAPI, how do I add a charset to the Content-Type (media type) of the request header in the OpenAPI (Swagger) docs?

@app.post("/")
def post_hello(username: str = Form(...)):
   return {"Hello": username}

The OpenAPI docs (http:///docs) show "application/x-www-form-urlencoded".

[Screenshot: OpenAPI docs]

I tried changing it as follows:

def post_hello(username: str = Form(..., media_type="application/x-www-form-urlencoded; charset=cp932")):
   return {"Hello": "World!", "userName": username}

But charset=cp932 is not added.

I want to set "application/x-www-form-urlencoded; charset=cp932" as the Content-Type on the request, and I want to receive the username decoded with that charset.

Mat*_*son 5

With FastAPI, how do I add a charset to the Content-Type request header in the automatically generated OpenAPI (Swagger) docs?

@app.post("/")
def post_hello(username: str = Form(...)):
   return {"Hello": username}

With the above route, the OpenAPI docs generated at http://<root_path>/docs show application/x-www-form-urlencoded.

[Screenshot of the OpenAPI docs, with an annotation on the Content-Type dropdown: 'I want to add a charset, like "application/x-www-form-urlencoded;charset=cp932"']

I tried this:

@app.post("/")
def post_hello(username: str = Form(..., media_type="application/x-www-form-urlencoded; charset=cp932")):
   return {"Hello": "World!", "userName": username}

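But the docs still show only application/x-www-form-urlencoded.

I want to set application/x-www-form-urlencoded; charset=cp932 as the value of Content-Type in the response from this endpoint/path function. I want the form data that is received to be decoded using that encoding scheme.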

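Short answer

In the general case, this does not seem like a good idea; I don't think there is an easy, built-in way to do it; and it may not be necessary.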

Not a good idea (aka not standards-compliant)

This GitHub issue discusses why appending ;charset=UTF-8 to application/json isn't a good idea, and the same points raised there apply in this case.

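The HTTP/1.1 specification says that the Content-Type header lists the media type.

Note: HTTP/2 shares these components with HTTP/1.1.

The IANA manages the registry of all commonly used media types (MIME).

The entry for application/x-www-form-urlencoded says: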

Media type name: application
Media subtype name: x-www-form-urlencoded

Required parameters: No parameters

Optional parameters:
No parameters

Encoding considerations: 7bit

Compare that with the entry for text/html:

MIME media type name : Text

MIME subtype name : Standards Tree - html

Required parameters : No required parameters

Optional parameters :
charset
The charset parameter may be provided to definitively specify the document's character encoding, overriding any character encoding declarations in the document. The parameter's value must be one of the labels of the character encoding used to serialize the file.

Encoding considerations : 8bit

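The entry for application/x-www-form-urlencoded does not allow a charset parameter to be added. So how should it be decoded from bytes? The URL specification states:

"Let nameString and valueString be the result of running UTF-8 decode without BOM on the percent-decoding of name and value, respectively."

It sounds like, regardless of what the encoding is, UTF-8 should always be used when decoding.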

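The decoding rule quoted above can be tried out with Python's standard library. This is only a sketch: `urllib.parse.parse_qsl` is used as a stand-in for a spec-following form parser (it is not what FastAPI uses internally), and the cp932 sender is hypothetical.

```python
from urllib.parse import parse_qsl, quote_plus

text = "cp932文字コード"

# Default behaviour: percent-encode the UTF-8 bytes of the value.
utf8_body = "username=" + quote_plus(text)
# Hypothetical legacy sender: percent-encode the cp932 bytes instead.
cp932_body = "username=" + quote_plus(text, encoding="cp932")

# parse_qsl percent-decodes, then decodes as UTF-8 unless told otherwise.
utf8_fields = parse_qsl(utf8_body)
cp932_fields = parse_qsl(cp932_body, encoding="cp932")

# Decoding cp932 bytes as UTF-8 does not raise (errors="replace" is the
# default); the Japanese text is replaced by U+FFFD characters instead.
misread_fields = parse_qsl(cp932_body)

print(utf8_fields)
print(cp932_fields)
print(misread_fields)
```

A receiver that always decodes as UTF-8 only recovers the text when the sender percent-encoded UTF-8 bytes, which is the interoperability point the specification is making.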
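The current HTML/URL specification also has this note about application/x-www-form-urlencoded:

"The application/x-www-form-urlencoded format is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversions between character encodings and byte sequences. Unfortunately the format is in widespread use due to the prevalence of HTML forms."

So it sounds like doing something different is not a good idea.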

No built-in way

Note: the built-in way to implement the solutions below is to use a custom Request class.

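When building the /openapi.json object, the current version of FastAPI checks whether a dependency is an instance of Form, and then uses an empty Form instance to build the schema, even if the actual dependency is a subclass of Form.

The default value of the media_type parameter of Form.__init__ is application/x-www-form-urlencoded, so every endpoint/path function with a Form() dependency will show the same media type in the docs, even though the class's __init__ has a media_type parameter.

There are a few ways to change what is listed in /openapi.json, which is what is used to generate the docs, and the FastAPI documentation lists one official way.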

For the example in the question, this would work:

from fastapi import FastAPI, Form
from fastapi.openapi.utils import get_openapi

app = FastAPI()


@app.post("/")
def post_hello(username: str = Form(...)):
    return {"Hello": username}


def custom_openapi():
    # Reuse the cached schema if it has already been generated.
    if app.openapi_schema:
        return app.openapi_schema

    app.openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        openapi_version=app.openapi_version,
        description=app.description,
        terms_of_service=app.terms_of_service,
        contact=app.contact,
        license_info=app.license_info,
        routes=app.routes,
        tags=app.openapi_tags,
        servers=app.servers,
    )

    # Replace the media type key of the request body with one that
    # carries the charset parameter.
    requestBody = app.openapi_schema["paths"]["/"]["post"]["requestBody"]
    content = requestBody["content"]
    new_content = {
        "application/x-www-form-urlencoded;charset=cp932": content[
            "application/x-www-form-urlencoded"
        ]
    }
    requestBody["content"] = new_content

    return app.openapi_schema


app.openapi = custom_openapi

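The schema edit itself is plain dictionary manipulation, so it can be tried without FastAPI. The sketch below assumes a hand-written dictionary shaped like the relevant part of /openapi.json; the helper name `rename_media_type` and the placeholder field schema are made up for illustration and are not part of FastAPI.

```python
import copy

OLD = "application/x-www-form-urlencoded"
NEW = "application/x-www-form-urlencoded;charset=cp932"


def rename_media_type(schema, path, method, old, new):
    """Return a copy of `schema` with one requestBody media type renamed."""
    patched = copy.deepcopy(schema)
    content = patched["paths"][path][method]["requestBody"]["content"]
    # Move the existing entry to the new key; the field schema is untouched.
    content[new] = content.pop(old)
    return patched


# Hand-written stand-in for the generated schema (placeholder field schema).
schema = {
    "paths": {
        "/": {
            "post": {
                "requestBody": {
                    "content": {OLD: {"schema": {"type": "object"}}}
                }
            }
        }
    }
}

patched = rename_media_type(schema, "/", "post", OLD, NEW)
print(list(patched["paths"]["/"]["post"]["requestBody"]["content"]))
```

Working on a deep copy keeps the original dictionary unchanged, which makes the before/after difference easy to inspect.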
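It is worth noting that, with this change, the docs user interface changes how it presents the "Try it out" section:

[Screenshot showing a text box input pre-filled with the JSON object {"username": "string"}]

Compare that with how application/x-www-form-urlencoded is shown when no charset is specified:

[Screenshot showing a form input with a single field named "username"]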

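The above change only changes the media type listed in the docs. Any form data sent to the endpoint/path function will still be:

  1. parsed by python-multipart (roughly following the same steps described in the specification)
  2. decoded by starlette using Latin-1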

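So even if starlette were changed to use a different encoding scheme to decode the form data, python-multipart would still follow the steps outlined in the specification, for example using hard-coded byte values for & and ;.

Luckily, most* of the first 128 characters/code points map to the same byte sequences in cp932 and UTF-8, so &, ; and = all come out the same.

*Except 0x5C, which is sometimes ¥.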

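The claim about the delimiters can be checked with Python's built-in codecs. This is a small sketch using Python's own cp932 codec, which may differ in edge cases (such as 0x5C) from other cp932 implementations; the sample characters are chosen for illustration.

```python
# Characters a form parser treats specially in a urlencoded body.
delimiters = "&;=%+"

# For each delimiter, compare the bytes produced by the two encodings.
same_bytes = {
    ch: ch.encode("cp932") == ch.encode("utf-8") for ch in delimiters
}
print(same_bytes)

# Outside the ASCII range the two encodings diverge.
katakana = "コ"
print(katakana.encode("cp932"), katakana.encode("utf-8"))
```

Because the delimiters are byte-identical, splitting the body into names and values works the same way under either encoding; only the decoding of the resulting pieces differs.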
One way to make starlette use the cp932 encoding is to use a middleware:

import typing
from unittest.mock import patch
from urllib.parse import unquote_plus

import multipart
from fastapi import FastAPI, Form, Request, Response
from fastapi.openapi.utils import get_openapi
from multipart.multipart import parse_options_header
from starlette.datastructures import FormData, UploadFile

# MultiPartParser is used in CustomRequest.form below, so it is imported here.
from starlette.formparsers import FormMessage, FormParser, MultiPartParser

app = FastAPI()

form_path = "/"


@app.post(form_path)
async def post_hello(username: str = Form(...)):
    return {"Hello": username}


def custom_openapi():
    if app.openapi_schema:
        return app.openapi_schema

    app.openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        openapi_version=app.openapi_version,
        description=app.description,
        terms_of_service=app.terms_of_service,
        contact=app.contact,
        license_info=app.license_info,
        routes=app.routes,
        tags=app.openapi_tags,
        servers=app.servers,
    )

    requestBody = app.openapi_schema["paths"]["/"]["post"]["requestBody"]
    content = requestBody["content"]
    new_content = {
        "application/x-www-form-urlencoded;charset=cp932": content[
            "application/x-www-form-urlencoded"
        ]
    }
    requestBody["content"] = new_content

    return app.openapi_schema


app.openapi = custom_openapi


class CP932FormParser(FormParser):
    async def parse(self) -> FormData:
        """
        copied from:
        https://github.com/encode/starlette/blob/0.17.1/starlette/formparsers.py#L72-L110
        """
        # Callbacks dictionary.
        callbacks = {
            "on_field_start": self.on_field_start,
            "on_field_name": self.on_field_name,
            "on_field_data": self.on_field_data,
            "on_field_end": self.on_field_end,
            "on_end": self.on_end,
        }

        # Create the parser.
        parser = multipart.QuerystringParser(callbacks)
        field_name = b""
        field_value = b""

        items: typing.List[typing.Tuple[str, typing.Union[str, UploadFile]]] = []

        # Feed the parser with data from the request.
        async for chunk in self.stream:
            if chunk:
                parser.write(chunk)
            else:
                parser.finalize()
            messages = list(self.messages)
            self.messages.clear()
            for message_type, message_bytes in messages:
                if message_type == FormMessage.FIELD_START:
                    field_name = b""
                    field_value = b""
                elif message_type == FormMessage.FIELD_NAME:
                    field_name += message_bytes
                elif message_type == FormMessage.FIELD_DATA:
                    field_value += message_bytes
                elif message_type == FormMessage.FIELD_END:
                    name = unquote_plus(field_name.decode("cp932"))  # changed line
                    value = unquote_plus(field_value.decode("cp932"))  # changed line
                    items.append((name, value))

        return FormData(items)


class CustomRequest(Request):
    async def form(self) -> FormData:
        """
        copied from
        https://github.com/encode/starlette/blob/0.17.1/starlette/requests.py#L238-L253
        """
        if not hasattr(self, "_form"):
            assert (
                parse_options_header is not None
            ), "The `python-multipart` library must be installed to use form parsing."
            content_type_header = self.headers.get("Content-Type")
            content_type, options = parse_options_header(content_type_header)
            if content_type == b"multipart/form-data":
                multipart_parser = MultiPartParser(self.headers, self.stream())
                self._form = await multipart_parser.parse()
            elif content_type == b"application/x-www-form-urlencoded":
                form_parser = CP932FormParser(
                    self.headers, self.stream()
                )  # use the custom parser above
                self._form = await form_parser.parse()
            else:
                self._form = FormData()
        return self._form


@app.middleware("http")
async def custom_form_parser(request: Request, call_next) -> Response:
    if request.scope["path"] == form_path:
        # starlette creates a new Request object for each middleware/app
        # invocation:
        # https://github.com/encode/starlette/blob/0.17.1/starlette/routing.py#L59
        # this temporarily patches the Request object starlette
        # uses with our modified version
        with patch("starlette.routing.Request", new=CustomRequest):
            return await call_next(request)
    # Every other path is passed through unchanged; without this the
    # middleware would return None for them.
    return await call_next(request)

Then, the data would have to be encoded manually:

>>> import sys
>>> from urllib.parse import quote_plus
>>> name = quote_plus("username").encode("cp932")
>>> value = quote_plus("cp932文字コード").encode("cp932")
>>> with open("temp.txt", "wb") as file:
...     file.write(name + b"=" + value)
...
59

And sent as binary data:

$ curl -X 'POST' \
  'http://localhost:8000/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded;charset=cp932' \
  --data-binary "@temp.txt" \
  --silent \
| jq -C .

{
  "Hello": "cp932文字コード"
}

May not be necessary

In the manual encoding step, the output will look like this:

username=cp932%E6%96%87%E5%AD%97%E3%82%B3%E3%83%BC%E3%83%89

Part of the percent-encoding step replaces any bytes that represent characters higher than 0x7E (~ in ASCII) with bytes in the reduced ASCII range. Since cp932 and UTF-8 both map these bytes to the same code points (except 0x5C, which may be \ or ¥), the byte sequence will decode to the same string:

$ curl -X 'POST' \
  'http://localhost:8000/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded;charset=cp932' \
  --data-urlencode "username=cp932文字コード" \
  --silent \
| jq -C .

{
  "Hello": "cp932文字コード"
}

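The reason the encoding label stops mattering after percent-encoding can be seen directly in Python. A minimal sketch, assuming the sender percent-encodes the UTF-8 bytes of the value (the default of `urllib.parse.quote_plus`):

```python
from urllib.parse import quote_plus

value = "cp932文字コード"

# quote_plus percent-encodes the UTF-8 bytes of every non-ASCII character,
# so the result contains ASCII characters only.
encoded = quote_plus(value)
print(encoded)

# An ASCII-only string produces the same bytes under both encodings.
as_cp932 = encoded.encode("cp932")
as_utf8 = encoded.encode("utf-8")
print(as_cp932 == as_utf8)
```

The printed string matches the `username=` value shown in the manual encoding step above.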
This only works for percent-encoded data.

Any data sent without percent-encoding will be processed and interpreted differently from how the sender intended. For example, in the OpenAPI (Swagger) docs, the "Try it out" section gives an example using curl -d (the same as --data), which does not URL-encode the data:

$ curl -X 'POST' \
  'http://localhost:8000/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data "username=cp932文字コード" \
  --silent \
| jq -C .
{
  "Hello": "cp932æ–‡å­—ã‚³ãƒ¼ãƒ‰"
}

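The garbled output comes from raw UTF-8 bytes being decoded one byte at a time. The sketch below reproduces the effect with Latin-1, under the assumption that the client sent unencoded UTF-8 bytes; the exact characters a terminal displays may differ from what Python prints.

```python
original = "cp932文字コード"

# Bytes a UTF-8 client puts on the wire when it does not percent-encode.
raw = original.encode("utf-8")

# Latin-1 maps every byte to exactly one character, so each three-byte
# UTF-8 sequence turns into three unrelated characters.
garbled = raw.decode("latin-1")
print(len(original), len(garbled))

# No information is lost: reversing the wrong step restores the text.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == original)
```

The ASCII prefix `cp932` survives untouched, which is why only the Japanese part of the response looks wrong.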
It might still be a good idea to use cp932 only for requests from senders that are configured in a similar way to the server.

One way to do this is to modify the middleware function so that it only processes data this way if the sender has specified that the data was encoded with cp932:

import typing
from unittest.mock import patch
from urllib.parse import unquote_plus

import multipart
from fastapi import FastAPI, Form, Request, Response
from fastapi.openapi.utils import get_openapi
from multipart.multipart import parse_options_header
from starlette.datastructures import FormData, UploadFile

# MultiPartParser is used in CustomRequest.form below, so it is imported here.
from starlette.formparsers import FormMessage, FormParser, MultiPartParser

app = FastAPI()

form_path = "/"


@app.post(form_path)
async def post_hello(username: str = Form(...)):
    return {"Hello": username}


def custom_openapi():
    if app.openapi_schema:
        return app.openapi_schema

    app.openapi_schema = get_openapi(
        title=app.title,
        version=app.version,
        openapi_version=app.openapi_version,
        description=app.description,
        terms_of_service=app.terms_of_service,
        contact=app.contact,
        license_info=app.license_info,
        routes=app.routes,
        tags=app.openapi_tags,
        servers=app.servers,
    )

    requestBody = app.openapi_schema["paths"]["/"]["post"]["requestBody"]
    content = requestBody["content"]
    new_content = {
        "application/x-www-form-urlencoded;charset=cp932": content[
            "application/x-www-form-urlencoded"
        ]
    }
    requestBody["content"] = new_content

    return app.openapi_schema


app.openapi = custom_openapi


class CP932FormParser(FormParser):
    async def parse(self) -> FormData:
        """
        copied from:
        https://github.com/encode/starlette/blob/0.17.1/starlette/formparsers.py#L72-L110
        """
        # Callbacks dictionary.
        callbacks = {
            "on_field_start": self.on_field_start,
            "on_field_name": self.on_field_name,
            "on_field_data": self.on_field_data,
            "on_field_end": self.on_field_end,
            "on_end": self.on_end,
        }

        # Create the parser.
        parser = multipart.QuerystringParser(callbacks)
        field_name = b""
        field_value = b""

        items: typing.List[typing.Tuple[str, typing.Union[str, UploadFile]]] = []

        # Feed the parser with data from the request.
        async for chunk in self.stream:
            if chunk:
                parser.write(chunk)
            else:
                parser.finalize()
            messages = list(self.messages)
            self.messages.clear()
            for message_type, message_bytes in messages:
                if message_type == FormMessage.FIELD_START:
                    field_name = b""
                    field_value = b""
                elif message_type == FormMessage.FIELD_NAME:
                    field_name += message_bytes
                elif message_type == FormMessage.FIELD_DATA:
                    field_value += message_bytes
                elif message_type == FormMessage.FIELD_END:
                    name = unquote_plus(field_name.decode("cp932"))  # changed line
                    value = unquote_plus(field_value.decode("cp932"))  # changed line
                    items.append((name, value))

        return FormData(items)


class CustomRequest(Request):
    async def form(self) -> FormData:
        """
        copied from
        https://github.com/encode/starlette/blob/0.17.1/starlette/requests.py#L238-L253
        """
        if not hasattr(self, "_form"):
            assert (
                parse_options_header is not None
            ), "The `python-multipart` library must be installed to use form parsing."
            content_type_header = self.headers.get("Content-Type")
            content_type, options = parse_options_header(content_type_header)
            if content_type == b"multipart/form-data":
                multipart_parser = MultiPartParser(self.headers, self.stream())
                self._form = await multipart_parser.p
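The check described above (only use the cp932 path when the sender declares it) can be sketched on its own with the standard library. This is not the answer's middleware: the helper name `wants_cp932` is hypothetical, and `email.message.Message` is used here only as a convenient Content-Type parser.

```python
from email.message import Message


def wants_cp932(content_type_header: str) -> bool:
    """Return True if the header declares a urlencoded body in cp932."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()  # lower-cased, parameters removed
    charset = msg.get_param("charset")   # None when no charset is given
    return (
        media_type == "application/x-www-form-urlencoded"
        and isinstance(charset, str)
        and charset.lower() == "cp932"
    )


print(wants_cp932("application/x-www-form-urlencoded;charset=cp932"))
print(wants_cp932("application/x-www-form-urlencoded"))
```

A middleware could call a helper like this on the request's Content-Type header and only switch to the custom parser when it returns True, leaving every other request on the default path.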