rez*_*eza 0 python beautifulsoup
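I want to use Beautiful Soup to collect the attribute values of every tag in an HTML page into a single array.
For example, given the HTML page below, I want a list of strings containing all of the tags' attribute values: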
<div att0="content1">
<a href="link1">link data</a>
</div>
The result should be: [content1, link1]
Find all elements and collect the values from each element's .attrs attribute:
attrs = []
for elm in soup(): # soup() is equivalent to soup.find_all()
    attrs += list(elm.attrs.values())
print(attrs)
Demo:
>>> from bs4 import BeautifulSoup
>>>
>>> data = """
... <div att0="content1">
... <a href="link1">link data</a>
... </div>
... """
>>>
>>> soup = BeautifulSoup(data, 'lxml')
>>>
>>> attrs = []
>>> for elm in soup():
...     attrs += list(elm.attrs.values())
...
>>> print(attrs)
['content1', 'link1']
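One caveat with the loop above: Beautiful Soup returns multi-valued attributes such as class as a list rather than a string, so list(elm.attrs.values()) can yield nested lists. The following is a minimal sketch of one way to flatten them. The helper name collect_attr_values and the class attribute added to the sample HTML are illustrative assumptions, not part of the original question, and the built-in 'html.parser' is used here instead of 'lxml' only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup


def collect_attr_values(html, parser="html.parser"):
    """Return every attribute value, in document order, as a flat list of strings.

    Hypothetical helper for illustration; not part of Beautiful Soup itself.
    """
    soup = BeautifulSoup(html, parser)
    values = []
    for elm in soup.find_all(True):  # True matches every tag, like soup()
        for value in elm.attrs.values():
            if isinstance(value, list):
                # Multi-valued attributes (e.g. class) arrive as a list
                values.extend(value)
            else:
                values.append(value)
    return values


# Sample input: the question's HTML plus an assumed class attribute
data = """
<div att0="content1" class="box wide">
<a href="link1">link data</a>
</div>
"""

result = collect_attr_values(data)
print(result)  # ['content1', 'box', 'wide', 'link1']
```

For the original HTML, which has no multi-valued attributes, this should give the same ['content1', 'link1'] as the loop in the answer.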