BeautifulSoup: find_all on a bs4.element.ResultSet object or a list?

Question

YJZ*_*YJZ 13 html python beautifulsoup html-parsing

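Hi, I called find_all on a BeautifulSoup object and got back a result that is a bs4.element.ResultSet object, or a list.

I want to call find_all again on that result, but a bs4.element.ResultSet object does not allow it. I can loop over the bs4.element.ResultSet object and call find_all on each element. But can I avoid the loop and convert the result back into a BeautifulSoup object?

Please see the code for details. Thanks.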

from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, 'html.parser')

type(soup) #bs4.BeautifulSoup

# do find_all on beautifulsoup object
th_all = soup.find_all('th')

# the result is of type bs4.element.ResultSet, which behaves like a list
type(th_all) #bs4.element.ResultSet
type(th_all[0:1]) #list

# now I want to further do find_all
th_all.find_all(text='A') #does not work: raises AttributeError

# can I avoid the need for this loop?
for th in th_all:
    th.find_all(text='A') #works

Answer 1

ale*_*cxe 17

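The ResultSet class is a subclass of list, not of the Tag class, which is where the find* methods are defined. Looping over the results of find_all() is the most common approach: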

th_all = soup.find_all('th')
result = []
for th in th_all:
    result.extend(th.find_all(text='A'))
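
The same loop can be written as a nested comprehension. This is a minimal sketch that assumes the table markup from the question; it uses `string=`, the newer name for the `text=` argument (available since bs4 4.4.0), and the names `is_list` and `matches` are only illustrative.

```python
from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, "html.parser")
th_all = soup.find_all("th")

# ResultSet subclasses list, so it supports iteration and slicing,
# but it has no find_all() of its own
is_list = isinstance(th_all, list)

# Call find_all() on each Tag and flatten the results into one plain list
matches = [node for th in th_all for node in th.find_all(string="A")]
```

The result is an ordinary list of the matching strings, not a BeautifulSoup object, so any further search still has to go through the individual elements.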

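CSS selectors can often solve this in one go, although not everything you can do with find_all() is possible with the select() method. For example, at the time of writing there was no "text" search in the bs4 CSS selectors. But if, for instance, you had to find all b elements inside th elements, you could do: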

soup.select("th b")
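
To make the descendant selector concrete, here is a small sketch. The markup in `html_2` is invented for illustration (the question's table has no b elements), and the text filter is done in plain Python rather than in the selector, which is one possible workaround and not the only one.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: some header cells wrap their label in <b>
html_2 = """
<table>
    <tr>
        <th><b>A</b></th>
        <th>B</th>
        <th><b>C</b></th>
    </tr>
</table>
"""
soup = BeautifulSoup(html_2, "html.parser")

# Descendant selector: every <b> nested anywhere inside a <th>
bold_in_th = soup.select("th b")

# Filter by text in Python instead of in the selector
only_a = [b for b in bold_in_th if b.get_text(strip=True) == "A"]
```

The result of select() is list-like as well, so it is iterated in the same way as the result of find_all().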