Web scraping with BeautifulSoup returns an empty list

Question

rez*_*ale 1 python beautifulsoup web-scraping

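I am trying to scrape basic weather data (such as the daily high/low temperature) from https://www.wunderground.com/ by searching for an arbitrary ZIP code.

I have tried several variations of my code, but it keeps returning an empty list where the temperatures should be. Honestly, I just don't know where I'm going wrong. Can anyone point me in the right direction?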

import requests
from bs4 import BeautifulSoup
response=requests.get('https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502')
response_data = BeautifulSoup(response.content, 'html.parser')
results=response_data.select("strong.high")

I also tried several other variations:

results = response_data.find_all('strong', class_ = 'high')
results = response_data.select('div.small_6 columns > strong.high' )

Answer 1

Vin*_*iar 5

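The data you want to parse is created dynamically by JavaScript, which requests cannot execute. You should use selenium together with PhantomJS or any other driver. Here is an example using selenium and Chromedriver: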

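from selenium import webdriver
from bs4 import BeautifulSoup

url = 'https://www.wunderground.com/cgi-bin/findweather/getForecast?query=76502'
driver = webdriver.Chrome()  # start a Chrome session driven by Chromedriver
driver.get(url)              # the browser executes the page's JavaScript
html = driver.page_source    # HTML of the DOM as the browser currently sees it
driver.quit()                # close the browser once the HTML is captured

soup = BeautifulSoup(html, 'html.parser')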
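
One caveat with the snippet above: page_source is read immediately after get(), so the temperature elements may not have been rendered yet on a slow connection. A minimal sketch of a variant that waits for an element first is shown below. The helper name fetch_rendered_html, the headless flag, and the 10-second default timeout are assumptions for illustration, not part of the original answer.

```python
def fetch_rendered_html(url, css_selector, timeout=10):
    """Load url in headless Chrome; return the HTML once css_selector is present."""
    # Imported lazily so that defining this helper does not itself require Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # assumption: no visible browser window is needed
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Blocks until the element exists in the DOM; raises
        # selenium.common.exceptions.TimeoutException if it never appears.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even when the wait times out.
        driver.quit()
```

Under these assumptions, html = fetch_rendered_html(url, 'strong.high') could stand in for the driver.get / driver.page_source lines before the HTML is handed to BeautifulSoup.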

Inspecting the elements, the low, high, and current temperatures can be retrieved with:

high = soup.find('strong', {'class':'high'}).text
low = soup.find('strong', {'class':'low'}).text
now = soup.find('span', {'data-variable':'temperature'}).find('span').text

>>> low, high, now
('25', '37', '36.5')
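
To make the difference between the two approaches concrete, here is a small offline sketch. The two HTML strings are hypothetical stand-ins (not the real wunderground.com markup) for what a plain HTTP fetch might receive versus what the browser's DOM might hold after the scripts have run. The helper extract_temperatures is also an assumption for illustration; it returns None for any element that is missing instead of raising AttributeError on .text.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: roughly what a plain HTTP fetch sees before any script runs...
STATIC_HTML = '<div id="forecast"><script src="app.js"></script></div>'
# ...and what the DOM might contain after the browser has rendered the page.
RENDERED_HTML = (
    '<div id="forecast">'
    '<strong class="high">37</strong>'
    '<strong class="low">25</strong>'
    '<span data-variable="temperature"><span>36.5</span></span>'
    '</div>'
)


def extract_temperatures(html):
    """Return (low, high, now) as strings; a value is None if its element is missing."""
    soup = BeautifulSoup(html, "html.parser")

    def text_of(tag):
        # select_one returns None when nothing matches, so guard before reading text.
        return tag.get_text(strip=True) if tag is not None else None

    high = soup.select_one("strong.high")
    low = soup.select_one("strong.low")
    now = soup.select_one('span[data-variable="temperature"] span')
    return text_of(low), text_of(high), text_of(now)


static_result = extract_temperatures(STATIC_HTML)
rendered_result = extract_temperatures(RENDERED_HTML)
print(static_result)    # (None, None, None)
print(rendered_result)  # ('25', '37', '36.5')
```

The first result mirrors the empty list in the question: the selectors are fine, but the elements are simply not in the HTML that was parsed.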
