dbr*_*rrt 5 beautifulsoup siblings python-3.x
How can I retrieve all children (non-recursively) using BeautifulSoup (bs4)?
<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>
I want to get blocks like this:
block1 : <span>A</span>
block2 : <span><span>B</span></span>
block3 : <span>C</span>
This is how I'm doing it:
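from bs4 import NavigableString, Tag

tags = []
# soup is assumed to be an already-parsed BeautifulSoup document
for j in soup.find_all(True)[:1]:
    if isinstance(j, NavigableString):
        continue
    if isinstance(j, Tag):
        tags.append(j.name)
        # Get siblings
        for k in j.find_next_siblings():
            # k is a sibling of the first element
            print(k)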
Is there a cleaner way?
t.m*_*dam 12
If you want to select only the direct children, you can set the recursive argument to False.
With the HTML example you provided:
from bs4 import BeautifulSoup
html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
soup = BeautifulSoup(html, "lxml")
for j in soup.div.find_all(recursive=False):
    print(j)
<span>A</span>
<span><span>B</span></span>
<span>C</span>
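To make the difference visible, here is a minimal sketch that compares the default recursive search with recursive=False on the same HTML. It assumes the standard-library "html.parser" backend instead of lxml, and the helper name direct_children is made up for illustration. Because find_all() only returns Tag objects, the isinstance check against NavigableString from the question should not be needed here.

```python
from bs4 import BeautifulSoup

html = "<div class='body'><span>A</span><span><span>B</span></span><span>C</span></div>"
# Assumption: "html.parser" is used so that no extra parser package is required.
soup = BeautifulSoup(html, "html.parser")


def direct_children(tag):
    """Hypothetical helper: return only the immediate child tags of `tag`."""
    return tag.find_all(recursive=False)


body = soup.find("div", class_="body")

# Direct children only: the nested <span>B</span> is not listed on its own.
blocks = [str(child) for child in direct_children(body)]

# Default (recursive) search: every <span> at any depth is returned.
all_spans = [str(span) for span in body.find_all("span")]

for i, block in enumerate(blocks, start=1):
    print(f"block{i} : {block}")
```

The recursive search also returns the inner <span>B</span> as a separate result, which is why it yields one more element than the non-recursive one.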