How do I use Beautiful Soup to get the value between two different tags?

Question

utk*_*thi 6 python beautifulsoup html-parsing

I need to extract the data between the closing </b> tag and the <br> tag in the snippet below:

<td><b>First Type :</b>W<br><b>Second Type :</b>65<br><b>Third Type :</b>3</td>

What I need is: W, 65, 3

The problem is that these values can also be empty, for example:

<td><b>First Type :</b><br><b>Second Type :</b><br><b>Third Type :</b></td>

I want to get these values if they are present, or an empty string otherwise.

I tried using nextSibling and find_next('br'), but they returned

 <br><b>Second Type :</b><br><b>Third Type :</b></br></br>

and

<br><b>Third Type :</b></br>

when no value (W, 65, 3) is present between the tags

</b> and <br>

What I need is for it to return an empty string when there is nothing between these tags.

Answer 1

DMP*_*rre 5

I would go through the <b> tags one by one and look at what kind of content each tag's next_sibling holds.

I would simply check whether each next_sibling.string is not None and append to the list accordingly :)

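>>> from bs4 import BeautifulSoup
>>> # The first value (W) has been removed from the HTML to show the empty case
>>> html = """<td><b>First Type :</b><br><b>Second Type :</b>65<br><b>Third Type :</b>3</td>"""
>>> soup = BeautifulSoup(html, "html.parser")
>>> data = []
>>> for tag in soup.find_all("b"):
...     sibling = tag.next_sibling
...     # sibling is None after the last </b>; a <br> tag has .string == None
...     if sibling is None or sibling.string is None:
...         data.append("")
...     else:
...         data.append(str(sibling.string))
...
>>> data
['', '65', '3']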
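
For reference, the same idea can be wrapped in a small function and run against both snippets from the question. This is only a sketch: the function name values_after_b_tags is made up for illustration, and it checks the sibling's type with isinstance instead of testing .string, which is a variation on the check described above rather than the exact code from this answer. It assumes the html.parser backend, where <br> is treated as an empty tag.

```python
from bs4 import BeautifulSoup, NavigableString


def values_after_b_tags(html):
    """Return the text that follows each <b> tag, or "" when there is none.

    Hypothetical helper, named here only for illustration.
    """
    soup = BeautifulSoup(html, "html.parser")
    values = []
    for tag in soup.find_all("b"):
        sibling = tag.next_sibling
        # Only a text node counts as a value; a <br> tag or a missing
        # sibling (None, after the last </b>) means the value is empty.
        if isinstance(sibling, NavigableString):
            values.append(str(sibling).strip())
        else:
            values.append("")
    return values


filled = "<td><b>First Type :</b>W<br><b>Second Type :</b>65<br><b>Third Type :</b>3</td>"
empty = "<td><b>First Type :</b><br><b>Second Type :</b><br><b>Third Type :</b></td>"

print(values_after_b_tags(filled))  # ['W', '65', '3']
print(values_after_b_tags(empty))   # ['', '', '']
```

The type check also covers the last <b> in the all-empty snippet, where next_sibling is None and accessing .string on it directly would raise an AttributeError.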

Hope this helps!

Archived:	9 years, 2 months ago
Views:	1736
Last recorded:	9 years, 2 months ago