Finding the position of a tag in an HTML page using Python 2.7 with BeautifulSoup

Ran*_*jan 5 python beautifulsoup python-2.7

I am trying to parse an HTML page with the following format:

<img class="outer" id="first" />
<div class="content" .../>
<div class="content" .../>
<div class="content" />
<img class="outer" id="second" />
<div class="content" .../>
<div class="content" .../>
<img class="outer" id="third" />
<div class="content" .../>
<div class="content" .../>

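While iterating over the div tags, I want to determine whether the current div tag falls under the img tag with id 'first', 'second', or 'third'. Is there a way to do this? I have lists of the img blocks and the div blocks: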

img_blocks = soup.find_all('img', attrs={'class':'outer'})
div_Blocks = soup.find_all('div', attrs={'class':'content'})

Ter*_*ryA 5

Use .find_previous_sibling, which returns the nearest preceding sibling that matches the given tag:

>>> for divtag in div_Blocks:
...     print divtag.find_previous_sibling('img')
... 
<img class="outer" id="first"/>
<img class="outer" id="first"/>
<img class="outer" id="first"/>
<img class="outer" id="second"/>
<img class="outer" id="second"/>
<img class="outer" id="third"/>
<img class="outer" id="third"/>
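If the goal is to group the divs by the img they follow, the same call can feed a dictionary keyed on the img id. This is only a minimal sketch: the sample markup and the div text (a through g) are invented to mirror the structure in the question, the name groups is hypothetical, and it assumes the img and div tags are siblings under one parent, as the output above suggests.

```python
from __future__ import print_function  # keeps print() working on Python 2.7 too

from collections import OrderedDict

from bs4 import BeautifulSoup

# Hypothetical sample markup mirroring the structure in the question;
# the div text is invented purely for illustration.
html = """
<body>
<img class="outer" id="first" />
<div class="content">a</div>
<div class="content">b</div>
<div class="content">c</div>
<img class="outer" id="second" />
<div class="content">d</div>
<div class="content">e</div>
<img class="outer" id="third" />
<div class="content">f</div>
<div class="content">g</div>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each img id to the text of the divs that follow it, in document order.
groups = OrderedDict()
for divtag in soup.find_all("div", attrs={"class": "content"}):
    # Nearest preceding sibling <img class="outer">, or None if there is none.
    img = divtag.find_previous_sibling("img", attrs={"class": "outer"})
    key = img["id"] if img is not None else None
    groups.setdefault(key, []).append(divtag.get_text())

for img_id, texts in groups.items():
    print(img_id, texts)
```

Any div that appears before the first img ends up under the key None, so it is worth checking for that case if the real page can start with a div.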