use*_*862 6 python beautifulsoup
How can I wrap the contents of the html body with <div data-role="content"></div> using Beautiful Soup?
I tried starting with the following, but could not make any progress:
from bs4 import BeautifulSoup
soup = BeautifulSoup(u"%s" % response)
wrapper = soup.new_tag('div', **{"data-role":"content"})
soup.body.append(wrapper)
for content in soup.body.contents:
    wrapper.append(content)
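I also tried using body.children, with no luck.
This appends the wrapper to the body, but it does not wrap the body contents the way I need.
- Edit -
I have gotten this far, but now I end up with duplicated body elements, like this: <body><div data-role="content"><body>content here</body></div></body>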
from bs4 import BeautifulSoup
soup = BeautifulSoup(u"%s" % response)
wrapper = soup.new_tag('div', **{"data-role":"content"})
new_body = soup.new_tag('body')
contents = soup.body.replace_with(new_body)
wrapper.append(contents)
new_body.append(wrapper)
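A likely reason for the nested body in the edited attempt: `replace_with()` returns the element that was replaced, so `contents` is the old `<body>` tag itself rather than its children. The sketch below illustrates this under assumptions: the sample markup and the `old_body` name are made up for the example, and `html.parser` is chosen only to keep it self-contained.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for `response`.
soup = BeautifulSoup("<html><body><p>content here</p></body></html>", "html.parser")
wrapper = soup.new_tag("div", **{"data-role": "content"})
new_body = soup.new_tag("body")

# replace_with() hands back the replaced element: the old <body> tag.
old_body = soup.body.replace_with(new_body)
print(old_body.name)  # body

# Move the old body's children, not the old body itself, into the wrapper.
# list(...) takes a snapshot, because append() removes each node from old_body.
for child in list(old_body.contents):
    wrapper.append(child)
new_body.append(wrapper)

print(soup)
```

With this change only one `body` element should remain in the tree, with the `div` as its single child.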
This is a perfect use case for BeautifulSoup's wrap():
from bs4 import BeautifulSoup
response = """
<body>
<p>test1</p>
<p>test2</p>
</body>
"""
soup = BeautifulSoup(response, 'html.parser')
wrapper = soup.new_tag('div', **{"data-role": "content"})
soup.body.wrap(wrapper)
print(soup.prettify())
Prints:
<div data-role="content">
 <body>
  <p>
   test1
  </p>
  <p>
   test2
  </p>
 </body>
</div>
Update:
from bs4 import BeautifulSoup
response = """<html>
<head>
<title>test</title>
</head>
<body>
<p>test</p>
</body>
</html>
"""
soup = BeautifulSoup(response, 'html.parser')
wrapper = soup.new_tag('div', **{"data-role": "content"})
soup.body.wrap(wrapper)
print(soup.prettify())
Produces:
<html>
 <head>
  <title>
   test
  </title>
 </head>
 <div data-role="content">
  <body>
   <p>
    test
   </p>
  </body>
 </div>
</html>
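Note that `wrap()` places the `div` around the `body` tag itself. If the goal is the one stated in the question, a `div` inside `body` that holds the body's contents, one possible approach is to move the children into the wrapper in place. This is only a sketch: the helper name `wrap_body_contents` and the sample markup are hypothetical, not part of the original posts.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup for illustration.
html = (
    "<html><head><title>test</title></head>"
    "<body><p>test1</p><p>test2</p></body></html>"
)

def wrap_body_contents(markup):
    """Return a soup whose <body> children sit inside one wrapper <div>."""
    soup = BeautifulSoup(markup, "html.parser")
    wrapper = soup.new_tag("div", **{"data-role": "content"})
    # Iterate over a copy of the child list: append() moves each node out of
    # body, which would otherwise change the list while it is being iterated.
    for child in list(soup.body.contents):
        wrapper.append(child)
    # Attach the wrapper only after the move, so it is never one of the
    # children being moved.
    soup.body.append(wrapper)
    return soup

soup = wrap_body_contents(html)
print(soup.prettify())
```

Attaching the wrapper last is a deliberate choice here: in the first attempt from the question, the wrapper was appended to `body` before the loop, so it was itself part of the contents being iterated.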