Django: parse HTML (containing a form) into a dictionary

Question

I create an HTML form on the server side.

<form action="." method="POST">
 <input type="text" name="foo" value="bar">
 <textarea name="area">long text</textarea>
 <select name="your-choice">
  <option value="a" selected>A</option>
  <option value="b">B</option>
 </select>
</form>


Desired result:

{
 "foo": "bar",
 "area": "long text",
 "your-choice": "a",
}


The method I am looking for (parse_form()) could be used like this:

response = client.get('/foo/')

# response contains <form> ...</form>

data = parse_form(response.content)

data['my-input']='bar'

response = client.post('/foo/', data)


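How can parse_form() be implemented in Python?

This is not really specific to Django. Nevertheless, there was a feature request for it in Django, which was rejected several years ago: https://code.djangoproject.com/ticket/11797

Update

I wrote a small Python library around the basic lxml answer: html_form_to_dict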

Answer 1

And*_*ker 5

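This has nothing to do with Django; it is purely an HTML parsing problem. The standard tool for that is the BeautifulSoup (bs4) library.

It can parse arbitrary HTML and is often used for web scrapers (including my own). This question covers parsing HTML forms: Python beautiful soup form input parsing. Almost everything you need can be found in the answers there :)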

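from bs4 import BeautifulSoup


def selected_option(select):
    # Return the value of the option marked "selected", or None if there is none.
    option = select.find("option", selected=True)
    if option is not None:
        # An <option> without a value attribute submits its text instead.
        return option.get("value", option.text)


# tag name => how to extract its value
tags = {
    "input": lambda t: t.get("value", ""),  # an input may lack a value attribute
    "textarea": lambda t: t.text,
    "select": selected_option,
}


def parse_form(html):
    soup = BeautifulSoup(html, "html.parser")
    form = soup.find("form")
    if form is None:
        raise ValueError("no <form> element found")
    # Elements without a name attribute are never submitted, so skip them.
    return {
        e["name"]: tags[e.name](e)
        for e in form.find_all(list(tags))
        if e.has_attr("name")
    }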


For your input, this gives the following output:

{
    "foo": "bar",
    "area": "long text",
    "your-choice": "a"
}


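For production use you will need to add a lot of error checking, e.g. for a missing form, inputs without a name, and so on. How much depends on what exactly you need.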
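To give an idea of what such checks might look like, here is a minimal sketch. It deliberately swaps BeautifulSoup for the standard library's html.parser so that it runs without extra dependencies. The names FormNotFound and _FormParser are made up for this illustration, and the rules it applies are assumptions about what a caller would want, not a complete model of browser form submission. Those rules are: skip unnamed or disabled controls, skip unchecked checkboxes and radio buttons, skip buttons and file inputs, and fall back to the first option of a select.

```python
from html.parser import HTMLParser


class FormNotFound(ValueError):
    """Raised when the HTML contains no <form> element (hypothetical name)."""


class _FormParser(HTMLParser):
    """Collects name/value pairs from the first <form> in a document."""

    # Input types that are not submitted as plain values (assumption: skip them).
    SKIPPED_TYPES = {"submit", "button", "reset", "image", "file"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.data = {}
        self.found_form = False
        self._in_form = False
        self._textarea = None  # name of the <textarea> currently being read
        self._buffer = []      # text chunks of that <textarea>
        self._select = None    # name of the <select> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self.found_form:
            self.found_form = self._in_form = True
            return
        if not self._in_form:
            return
        if tag == "option":
            if self._select is not None:
                # Simplification: an option without a value attribute yields "".
                value = attrs.get("value") or ""
                # A selected option wins; otherwise the first option is the default.
                if "selected" in attrs or self._select not in self.data:
                    self.data[self._select] = value
            return
        name = attrs.get("name")
        if not name or "disabled" in attrs:
            return  # unnamed and disabled controls are never submitted
        if tag == "input":
            kind = (attrs.get("type") or "text").lower()
            if kind in self.SKIPPED_TYPES:
                return
            if kind in ("checkbox", "radio"):
                if "checked" in attrs:
                    self.data[name] = attrs.get("value") or "on"
                return
            self.data[name] = attrs.get("value") or ""
        elif tag == "textarea":
            self._textarea, self._buffer = name, []
        elif tag == "select":
            self._select = name

    def handle_data(self, data):
        if self._textarea is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "textarea" and self._textarea is not None:
            self.data[self._textarea] = "".join(self._buffer)
            self._textarea = None
        elif tag == "select":
            self._select = None
        elif tag == "form":
            self._in_form = False


def parse_form(html, encoding="utf-8"):
    """Return the submittable fields of the first form in `html` as a dict."""
    if isinstance(html, bytes):  # e.g. response.content from the test client
        html = html.decode(encoding)
    parser = _FormParser()
    parser.feed(html)
    parser.close()
    if not parser.found_form:
        raise FormNotFound("no <form> element in the document")
    return parser.data
```

Called on the form from the question, this should return the same dictionary as the BeautifulSoup version. Fields that share a name (multi-selects, checkbox groups) keep only the last value here; if you need those, collecting lists of values would be the natural extension.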
