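How can I extract all of the text below a specific heading? In this case, I need to extract the text under Topic 2.

Edit: On other web pages, "Topic 2" sometimes appears as the third heading or as the first heading. "Topic 2" is not always in the same position, and it does not always have the same ID number.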
# import library
from bs4 import BeautifulSoup
# dummy webpage text
body = '''
<h2 id="1">Topic 1</h2>
<p> This is the first sentence.</p>
<p> This is the second sentence.</p>
<p> This is the third sentence.</p>
<h2 id="2">Topic 2</h2>
<p> This is the fourth sentence.</p>
<p> This is the fifth sentence.</p>
<h2 id="3">Topic 3</h2>
<p> This is the sixth sentence.</p>
<p> This is the seventh sentence.</p>
<p> This is the …
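
Since the heading can move around and its ID changes, one possible approach is to locate the `<h2>` by its visible text and then collect the sibling elements that follow it until the next `<h2>` is reached. The sketch below is only an illustration under that assumption: the helper name `text_under_heading` and the `sample` markup are hypothetical (the sample deliberately puts "Topic 2" first), and it assumes the paragraphs are direct siblings of the heading, as in the snippet above.

```python
from bs4 import BeautifulSoup


def text_under_heading(html, heading_text, heading_tag="h2"):
    """Return the text of each element between the matching heading and the next one.

    Hypothetical helper: matches the heading by its visible text, not by
    position or id, and assumes the content elements are its siblings.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Find the first heading tag whose stripped text equals heading_text.
    heading = soup.find(
        lambda tag: tag.name == heading_tag
        and tag.get_text(strip=True) == heading_text
    )
    if heading is None:
        return []

    collected = []
    # Passing True restricts the iteration to tags (no bare text nodes).
    for sibling in heading.find_next_siblings(True):
        if sibling.name == heading_tag:
            break  # reached the next heading, so this section is over
        text = sibling.get_text(strip=True)
        if text:
            collected.append(text)
    return collected


# Made-up sample in which "Topic 2" is the first heading and has a different id.
sample = """
<h2 id="7">Topic 2</h2>
<p> This is the fourth sentence.</p>
<p> This is the fifth sentence.</p>
<h2 id="8">Topic 1</h2>
<p> This is the first sentence.</p>
"""

print(text_under_heading(sample, "Topic 2"))
# ['This is the fourth sentence.', 'This is the fifth sentence.']
```

If the pages nest the paragraphs inside wrapper elements instead of leaving them as siblings of the heading, the sibling walk would need to be adapted to that layout.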