Posts by PSe*_*ode

Using Beautiful Soup to find the next occurrence of a tag and its enclosed text

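I am trying to parse the text between <blockquote> tags. When I type soup.blockquote.get_text(), I get the result I want for the first blockquote in the HTML file.

How do I find the next <blockquote> tag in the file, and each one after that, in order? Maybe I'm just tired, but I can't find it in the documentation.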

Example HTML file:

<html>
<head>header
</head>
<blockquote>I can get this text
</blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next
</blockquote>
<p></p><strong>do not capture this</strong>
<blockquote>
capture this too but separately after "capture this next"
</blockquote>
</html>


Simple Python code:

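from bs4 import BeautifulSoup

# Parse the example file, naming the parser explicitly
with open("example.html") as html_doc:
    soup = BeautifulSoup(html_doc, "html.parser")

# Prints the text of the first <blockquote> only
print(soup.blockquote.get_text())
# how to get the next blockquote???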
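
For reference, here is a minimal sketch of two ways this might be approached. It assumes the example HTML above is inlined as a string rather than read from example.html, and that the built-in "html.parser" is used. The names quotes, first, second and third are illustrative only. find_all returns every matching tag in document order, while find_next steps from one tag to the next match after it.

```python
from bs4 import BeautifulSoup

# Inline copy of the example HTML so the sketch runs without example.html
html_doc = """<html>
<head>header
</head>
<blockquote>I can get this text
</blockquote>
<p>eiaoiefj</p>
<blockquote>trying to capture this next
</blockquote>
<p></p><strong>do not capture this</strong>
<blockquote>
capture this too but separately after "capture this next"
</blockquote>
</html>"""

soup = BeautifulSoup(html_doc, "html.parser")

# Option 1: collect the text of every <blockquote> in document order
quotes = [bq.get_text().strip() for bq in soup.find_all("blockquote")]

# Option 2: step from one <blockquote> to the next one after it
first = soup.blockquote
second = first.find_next("blockquote")
third = second.find_next("blockquote")

for text in quotes:
    print(text)
```

find_all is probably the simpler choice when every blockquote is wanted. find_next may fit better when only the tag following a known position matters, and it returns None once there are no further matches.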


html python beautifulsoup python-2.7

PSe*_*ode

2018-08-06

12 votes, 1 answer, 20k views