BeautifulSoup: get the class text

Mpi*_*ris 6 python beautifulsoup

Assuming the following code:

for data in soup.findAll('div',{'class':'value'}):
    print(data)

which gives the following output:

<div class="value">
<p class="name">Michael Jordan</p>
</div>


<div class="value">
<p class="team">Real Madrid</p>
</div>


<div class="value">
<p class="Sport">Ping Pong</p>
</div>

I want to create the following dictionary:

  Person = {'name': 'Michael Jordan', 'team': 'Real Madrid', 'Sport': 'Ping Pong'}

I can get the text with data.text, but how do I get the class name so I can use it as the dictionary key (Person[key1], Person[key2], ...)?

gtl*_*ert 12

You can use the following:

content = '''
<div class="value">
<p class="name">Michael Jordan</p>
</div>

<div class="value">
<p class="team">Real Madrid</p>
</div>

<div class="value">
<p class="Sport">Ping Pong</p>
</div>
'''

from bs4 import BeautifulSoup

# Name the parser explicitly; otherwise bs4 picks one and warns about it.
soup = BeautifulSoup(content, 'html.parser')

person = {}

# 'class' can hold several values, so bs4 returns it as a list; [0] takes the first name.
for div in soup.find_all('div', {'class': 'value'}):
    person[div.find('p').attrs['class'][0]] = div.text.strip()

print(person)

Output:

{'Sport': u'Ping Pong', 'name': u'Michael Jordan', 'team': u'Real Madrid'}
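A possible variation on the same idea, as a rough sketch rather than the only way to do it: select the p elements directly with a CSS selector and skip any that have no class at all. It assumes Python 3 and a bs4 version that provides select(); the selector 'div.value > p' and the variable name classes are choices made for this illustration and do not come from the answer above.

```python
from bs4 import BeautifulSoup

content = '''
<div class="value">
<p class="name">Michael Jordan</p>
</div>
<div class="value">
<p class="team">Real Madrid</p>
</div>
<div class="value">
<p class="Sport">Ping Pong</p>
</div>
'''

soup = BeautifulSoup(content, 'html.parser')

person = {}
# Pick every <p> that is a direct child of <div class="value">.
for p in soup.select('div.value > p'):
    # bs4 treats "class" as multi-valued: this is a list like ['name'],
    # or None when the tag has no class attribute.
    classes = p.get('class')
    if classes:
        # Use the first class name as the key and the stripped text as the value.
        person[classes[0]] = p.get_text(strip=True)

print(person)
# {'name': 'Michael Jordan', 'team': 'Real Madrid', 'Sport': 'Ping Pong'}
```

If a p tag carried several classes (for example class="name bold"), only the first one would become the key here; joining them with ' '.join(classes) would be one alternative, depending on what the keys should look like.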