Extracting items from element.ResultSet

Dee*_*Roy 4 python beautifulsoup web-scraping python-3.x

I found a neat Python script that scrapes player information from NFL rosters. However, I would like to add NFL Combine results to the data. Below is an example for one player.

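import urllib.request
from bs4 import BeautifulSoup

# urlopen needs a full URL, including the scheme
URL2 = 'https://www.nfl.com/player/deandrewwhite/2552657/combine'
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soupCombine = BeautifulSoup(urllib.request.urlopen(URL2), 'html.parser')
Combinestats = soupCombine.find_all("div", attrs={"class": "tp-title"})
Combinestats[0].contents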

This produces:

['3 Cone Drill', <span class="tp-results">6.97 secs</span>]

How can I get the following from Combinestats[0].contents?

DrillName = '3 Cone Drill'

DrillResult = 6.97

For reference, here are the items in Combinestats.

for item in Combinestats:
    print(item.contents)

['3 Cone Drill', <span class="tp-results">6.97 secs</span>]
['40 Yard Dash', <span class="tp-results">4.44 Secs</span>]
['Broad Jump', <span class="tp-results">118.0 inches</span>]
['20 Yard Shuttle', <span class="tp-results">4.18 secs</span>]
['Vertical Jump', <span class="tp-results">34.5 inches</span>]

cs9*_*s95 5

Just use a list comprehension.

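# soupCombine is the BeautifulSoup object built in the question
resultSet = soupCombine.find_all("div", attrs={"class": "tp-title"})
# contents[0] is the drill name, contents[1] is the <span> holding the result
stats = [
    (i.contents[0], i.contents[1].text) for i in resultSet
]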
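To see why `contents[0]` and `contents[1]` line up with the drill name and the result, here is a small offline sketch. The inline HTML is an assumption reconstructed from the output printed in the question, not the real page markup, and the names `SAMPLE_HTML`, `result_set`, `drill_name` and `drill_result` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Inline HTML standing in for the live page; the markup is assumed from the
# printed output in the question, not fetched from nfl.com.
SAMPLE_HTML = """
<div class="tp-title">3 Cone Drill<span class="tp-results">6.97 secs</span></div>
<div class="tp-title">40 Yard Dash<span class="tp-results">4.44 Secs</span></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
result_set = soup.find_all("div", attrs={"class": "tp-title"})

first = result_set[0]
# contents[0] is the text node before the span; contents[1] is the span Tag
drill_name = str(first.contents[0]).strip()
drill_result = first.contents[1].get_text(strip=True)

# Looking the span up by class avoids depending on the order of the children
stats = [
    (
        str(div.contents[0]).strip(),
        div.find("span", class_="tp-results").get_text(strip=True),
    )
    for div in result_set
]
print(stats)
```

If the real page puts whitespace or extra tags inside each div, the positional `contents[1]` lookup may point at something else, which is why the class-based `find` is probably the safer choice.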

Alternatively, use a for loop.

stats = []
for i in resultSet:
    # append takes a single argument, so wrap the pair in a tuple
    stats.append((i.contents[0], i.contents[1].text))

print(stats)
[
    ('40 Yard Dash', '4.44 Secs'),
    ('3 Cone Drill', '6.97 secs'),
    ('Broad Jump', '118.0 inches'),
    ('20 Yard Shuttle', '4.18 secs'),
    ('Vertical Jump', '34.5 inches')
]
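The tuples above still hold the result as a string such as '6.97 secs', while the question asks for the number 6.97. One possible way to split the value from the unit is a small regular expression. This is only a sketch: `split_result` is a hypothetical helper, and it assumes every result starts with a plain decimal number followed by a unit.

```python
import re

# The (name, result) pairs as printed above
stats = [
    ('40 Yard Dash', '4.44 Secs'),
    ('3 Cone Drill', '6.97 secs'),
    ('Broad Jump', '118.0 inches'),
    ('20 Yard Shuttle', '4.18 secs'),
    ('Vertical Jump', '34.5 inches'),
]


def split_result(text):
    """Split a string such as '6.97 secs' into (6.97, 'secs')."""
    match = re.match(r"\s*([-+]?\d+(?:\.\d+)?)\s*(.*)", text)
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    value, unit = match.groups()
    # Lower-case the unit so 'Secs' and 'secs' compare equal
    return float(value), unit.strip().lower()


# Map each drill name to its (value, unit) pair
parsed = {name: split_result(result) for name, result in stats}

DrillName = '3 Cone Drill'
DrillResult, unit = parsed[DrillName]
print(DrillName, DrillResult, unit)
```

Keying the dictionary by drill name means the lookup does not depend on the order in which the page lists the drills.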