如何从HTTP头响应中解析Content-Type的值？

Question

如何从HTTP头响应中解析Content-Type的值？

Ghe*_*Ace 5 python content-type mime-types python-requests

我正在开发一个应用程序,以从互联网上获取所有类型的东西.我希望不要为此编写RegExp模式的路径,因此,我如何解析Content-Type标题中的值:在示例中:

text/html; charset=UTF-8

Run Code Online (Sandbox Code Playgroud)

为了给出上下文,这是我在互联网上获取内容的代码:

from requests import head

foo = head("http://www.example.com")

Run Code Online (Sandbox Code Playgroud)

*编辑*

我期待的输出类似于mimetools中的方法.例如:

x = magic("text/html; charset=UTF-8")

Run Code Online (Sandbox Code Playgroud)

将输出:

x.getparam('charset')  # UTF-8
x.getmaintype()  # text
x.getsubtype()  # html

Run Code Online (Sandbox Code Playgroud)

Answer 1

Owe*_* S. 9

requests不幸的是,并没有给你一个解析内容类型的界面,这个东西上的标准库有点混乱.所以我看到两个选择:

选项1:使用python-mimeparse第三方库.

选项2:要将mime类型与类似的选项分开charset,您可以使用与requests内部解析类型/编码相同的技术:use cgi.parse_header.

response = requests.head('http://example.com')
mimetype, options = cgi.parse_header(response.headers['Content-Type'])

Run Code Online (Sandbox Code Playgroud)

其余的应该足够简单,以处理split:

maintype, subtype = mimetype.split('/')

Run Code Online (Sandbox Code Playgroud)

Answer 2

Phi*_*ing 6

Python 有这个内置函数。 它在email模块中。

MIME和 mime 类型是已在其他上下文中采用的电子邮件标准：“多用途Internet邮件扩展”（请参阅 RFC 2045）。

可靠地做到这一点的最简单方法是使用电子邮件解析器：

from email.message import Message

_CONTENT_TYPE = "content-type"

def parse_content_type(content_type: str) -> tuple[str, dict[str,str]]:
    email = Message()
    email[_CONTENT_TYPE] = content_type
    params = email.get_params()
    # The first param is the mime-type the later ones are the attributes like "charset"
    return params[0][0], dict(params[1:])

Run Code Online (Sandbox Code Playgroud)

Answer 3

lip*_*nen -1

你的问题有点不清楚。我假设您正在使用某种 Web 应用程序框架，例如 Django 或 Flask？

以下是如何使用 Flask 读取 Content-Type 的示例：

from flask import Flask, request
app = Flask(__name__)

@app.route("/")
def test():
  request.headers.get('Content-Type')


if __name__ == "__main__":
  app.run()

Run Code Online (Sandbox Code Playgroud)

看起来他正在做某种爬虫而不是网络应用程序。 (2认同)

归档时间：	10 年，5 月前
查看次数：	4589 次
最近记录：	9 年，9 月前