gar*_*heg 5 html python tags parsing beautifulsoup
I need to create an <img /> tag. With the code below, BeautifulSoup creates the image tag like this:
soup = BeautifulSoup(text, "html5")
tag = Tag(soup, name='img')
tag.attrs = {'src': '/some/url/here'}
text = soup.renderContents()
print text
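Output: <img src="/some/url/here"></img>
How can I get this instead: <img src="/some/url/here" />
Of course this could be done with a regex or similar trickery, but I would like to know whether there is a standard way to produce tags like this.
Don't use Tag() to create new elements. Use the soup.new_tag() method: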
soup = BeautifulSoup(text, "html5")
new_tag = soup.new_tag('img', src='/some/url/here')
some_element.append(new_tag)
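The soup.new_tag() method passes the correct builder to the Tag() object, and it is the builder that is responsible for recognizing <img/> as an empty tag.
Demo: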
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<div></div>', "html5")
>>> new_tag = soup.new_tag('img', src='/some/url/here')
>>> new_tag
<img src="/some/url/here"/>
>>> soup.div.append(new_tag)
>>> print(soup.prettify())
<html>
 <head>
 </head>
 <body>
  <div>
   <img src="/some/url/here"/>
  </div>
 </body>
</html>
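To check the builder's role directly, one option is to inspect the new tag's is_empty_element attribute. This is only a minimal sketch under two assumptions that are not part of the answer: it uses Python 3, and it swaps in the bundled "html.parser" so that html5lib does not have to be installed.

```python
from bs4 import BeautifulSoup

# Assumption: "html.parser" (bundled with Python) is used here instead of
# "html5", so the sketch does not depend on html5lib being installed.
soup = BeautifulSoup("<div></div>", "html.parser")

# new_tag() hands the soup's tree builder to the new Tag, so the tag
# knows that <img> is an empty (void) element.
new_tag = soup.new_tag("img", src="/some/url/here")
soup.div.append(new_tag)

print(new_tag.is_empty_element)  # whether the tag will be rendered as an empty tag
print(str(new_tag))              # serialized form of the tag on its own
print(soup.decode())             # serialized form of the whole document
```

With this parser no <html> or <body> wrapper is added, so the last line prints only the <div> and the tag inside it.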