Posts by Ste*_*ner

BeautifulSoup HTML table parsing

I am trying to parse information (an HTML table) from this site: http://www.511virginia.org/RoadConditions.aspx?j=All&r=1

Currently I am using BeautifulSoup, and my code looks like this:

# Python 2, BeautifulSoup 3
from mechanize import Browser
from BeautifulSoup import BeautifulSoup

mech = Browser()

# Fetch the page and parse it
url = "http://www.511virginia.org/RoadConditions.aspx?j=All&r=1"
page = mech.open(url)
html = page.read()
soup = BeautifulSoup(html)

# Take the fourth row of the first table on the page
table = soup.find("table")
rows = table.findAll('tr')[3]
cols = rows.findAll('td')

# Read the text of each cell
roadtype = cols[0].string
start = cols[1].string
end = cols[2].string
condition = cols[3].string
reason = cols[4].string
update = cols[5].string

entry = (roadtype, start, end, condition, reason, update)

print entry

The problem is with the start and end columns: they just print as None.

Output:

(u'Rt. 613N (Giles County)', None, None, u'Moderate', u'snow or ice', …
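For context, BeautifulSoup's `.string` returns None whenever a tag has more than one child, so one plausible explanation (an assumption, since the page's real markup is not shown here) is that the start and end cells contain nested tags rather than a single text node. The sketch below reproduces that behavior on a made-up HTML fragment. It uses the newer `bs4` package rather than the BeautifulSoup 3 import from the question, and `SAMPLE_HTML` and `cell_text` are hypothetical names introduced only for illustration.

```python
# Sketch only: bs4 (BeautifulSoup 4) on an invented fragment,
# not the real 511virginia.org markup.
from bs4 import BeautifulSoup

SAMPLE_HTML = """
<table>
  <tr>
    <td>Rt. 613N (Giles County)</td>
    <td><b>Start</b><br/>Rt. 100</td>
    <td>Moderate</td>
  </tr>
</table>
"""


def cell_text(cell):
    # get_text() joins every text node under the tag, so it still
    # works when the cell holds nested tags.
    return cell.get_text(" ", strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
cols = soup.find("table").find_all("tr")[0].find_all("td")

plain_string = cols[0].string    # cell with a single text child
nested_string = cols[1].string   # cell with several children: None
nested_text = cell_text(cols[1])

print(plain_string, nested_string, nested_text)
```

If the real cells do turn out to be nested like this, joining their text nodes (as `cell_text` does) in place of `.string` would be one way to get the values back.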

python html-table mechanize beautifulsoup html-parsing
