How to extract value from span tag

zez*_*ima 5 html python beautifulsoup web-scraping

I am writing a simple web scraper to extract the game times for NCAA basketball games. The code doesn't need to be pretty; it just has to work. I have extracted the values from other span tags on the same page, but for some reason I cannot get this one working.

from bs4 import BeautifulSoup as soup
import requests

url = 'http://www.espn.com/mens-college-basketball/game/_/id/401123420'
response = requests.get(url)
soupy = soup(response.content, 'html.parser')

containers = soupy.findAll("div",{"class" : "team-container"})
for container in containers:
    spans = container.findAll("span")
    divs = container.find("div",{"class": "record"})
    ranks = spans[0].text
    team_name = spans[1].text
    team_mascot = spans[2].text
    team_abbr = spans[3].text
    team_record = divs.text
    time_container = soupy.find("span", {"class":"time game-time"})
    game_times = time_container.text
    refs_container = soupy.find("div", {"class" : "game-info-note__container"})
    refs = refs_container.text
    print(ranks)
    print(team_name)
    print(team_mascot)
    print(team_abbr)
    print(team_record)
    print(game_times)
    print(refs)

The specific code I am concerned about is this,

    time_container = soupy.find("span", {"class":"time game-time"})
    game_times = time_container.text

I included the rest of the code only to show that .text works on the other span tags. The time is the only data I actually need, but with the code as it stands I just get an empty string.

This is what I get when I print time_container:

<span class="time game-time" data-dateformat="time1" data-showtimezone="true"></span>

or just '' when I do game_times.

Here is the line of the HTML from the website:

<span class="time game-time" data-dateformat="time1" data-showtimezone="true">6:10 PM CT</span>

I don't understand why the 6:10 PM CT is gone when I run the script.

Aja*_*234 3

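The site is dynamic: the time is filled in by JavaScript after the page loads, so it is missing from the HTML that requests downloads. You therefore need to use selenium: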

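from bs4 import BeautifulSoup as soup
from selenium import webdriver
# Selenium 3 style call; recent Selenium 4 releases take the driver path via a Service object instead.
d = webdriver.Chrome('/path/to/chromedriver')
d.get('http://www.espn.com/mens-college-basketball/game/_/id/401123420')
# page_source reflects the DOM after the browser has run the page's scripts
game_time = soup(d.page_source, 'html.parser').find('span', {'class':'time game-time'}).text
d.quit()  # close the browser when done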

Output:

'7:10 PM ET'

See the full selenium documentation for details.
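
To illustrate why the original script sees an empty string, here is a minimal sketch that compares the two versions of the span quoted in the question: the one requests receives and the one visible in the browser. It uses only the standard library's html.parser as a stand-in for BeautifulSoup so that it runs without extra packages. The SpanText class and the span_text helper are hypothetical names made up for this illustration, and the class match is a plain string comparison rather than BeautifulSoup's class handling.

```python
from html.parser import HTMLParser


class SpanText(HTMLParser):
    """Collect the text inside the first <span> whose class attribute
    equals class_name exactly (hypothetical helper for illustration)."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0       # > 0 while inside the matching span
        self.found = False   # True once the matching span has been seen
        self.chunks = []     # text fragments found inside that span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            # A nested span inside the one we are collecting.
            self.depth += 1
        elif not self.found and dict(attrs).get("class") == self.class_name:
            self.found = True
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def span_text(html, class_name):
    """Return the span's text, '' if it is empty, or None if it is absent."""
    parser = SpanText(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks) if parser.found else None


# The span as received by requests (before any JavaScript has run).
static_html = '<span class="time game-time" data-dateformat="time1" data-showtimezone="true"></span>'
# The same span as shown in the browser's inspector (after JavaScript has run).
rendered_html = '<span class="time game-time" data-dateformat="time1" data-showtimezone="true">6:10 PM CT</span>'

print(repr(span_text(static_html, "time game-time")))    # ''
print(repr(span_text(rendered_html, "time game-time")))  # '6:10 PM CT'
```

The element is found in both cases, which is why the original script does not raise an error; it is only the text that is missing until a browser has executed the page's scripts.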