Posts by use*_*289

Getting the text between two h2 headers with BeautifulSoup

I want to get the text that comes after "Description" and before "Next header".

I know that:

In [8]: soup.findAll('h2')[6]
Out[8]: <h2>Description</h2>


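However, I don't know how to get the actual text. The problem is that I need to do this for many links. Some of the pages wrap the text in p tags: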

                                         <h2>Description</h2>

  <p>This is the text I want </p>
<p>This is the text I want</p>   
                                        <h2>Next header</h2>


However, some of them do not:

                                         <h2>Description</h2>
                          This is the text I want

                                         <h2>Next header</h2>


Also, on the pages that do use p tags, I can't just do soup.findAll('p')[22], because on some pages the index of the 'p' I want is 21 or 20.
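One way to handle both layouts is to start at the "Description" header and walk its following siblings until the next h2, collecting text from tags and from bare strings alike. The following is only a minimal sketch: the helper name `text_between_headers` is hypothetical, the sample HTML is taken from the snippets above, and it assumes both h2 headers share the same parent element (if they are nested at different levels, the sibling walk will not reach the second header).

```python
from bs4 import BeautifulSoup, Tag


def text_between_headers(soup, start_text, header="h2"):
    """Return the text after the header whose text is start_text,
    stopping at the next header of the same kind (hypothetical helper)."""
    start = soup.find(header, string=start_text)
    if start is None:
        return ""
    parts = []
    for node in start.next_siblings:
        # Stop as soon as the next header is reached.
        if isinstance(node, Tag) and node.name == header:
            break
        # Tags (such as <p>) yield their text; bare strings are used as-is.
        text = node.get_text() if isinstance(node, Tag) else str(node)
        text = text.strip()
        if text:
            parts.append(text)
    return "\n".join(parts)


# Layout with <p> tags between the headers.
with_p = """<h2>Description</h2>
<p>This is the text I want </p>
<p>This is the text I want</p>
<h2>Next header</h2>"""

# Layout with bare text between the headers.
without_p = """<h2>Description</h2>
   This is the text I want
<h2>Next header</h2>"""

for html in (with_p, without_p):
    soup = BeautifulSoup(html, "html.parser")
    print(repr(text_between_headers(soup, "Description")))
```

Because the lookup goes by the header text rather than by position, it does not depend on whether the wanted 'p' happens to be number 20, 21 or 22 on a given page.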

python beautifulsoup

use*_*289

lucky-day

2
votes

1
answer

2447
views
