Getting <title> with Python

Question

xin*_*ron 5 python urllib2

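I want to get the title of a web page that I open with urllib2. What is the best way to do this: parse the HTML and find what I need (for now only the <title> tag, but I may need more in the future)?

Is there a good parsing library for this purpose?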

Answer 1

Rob*_*bbR 9

Yes, I would recommend BeautifulSoup.

If all you need is the title, it is simply:

soup = BeautifulSoup(html)
myTitle = soup.html.head.title

Or (this form returns a list of every matching tag):

myTitle = soup('title')

Taken from the documentation.

It is very robust and will parse HTML no matter how messy it is.

Answer 2

Dom*_*ger 5

Try Beautiful Soup:

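import urllib2
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3; with bs4 use: from bs4 import BeautifulSoup

# Download the page (Python 2, urllib2)
url = 'http://www.example.com'
response = urllib2.urlopen(url)
html = response.read()

# Parse the HTML and walk down to the <title> tag
soup = BeautifulSoup(html)
title = soup.html.head.title
print title.contents  # a list of the tag's children; title.string gives just the text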

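
For readers on Python 3, where urllib2 became urllib.request, here is a minimal sketch of the same fetch-then-extract flow. It is not the code from either answer: it swaps BeautifulSoup for the standard library's html.parser so that it runs with no third-party install, and the names TitleParser, extract_title and fetch_title are made up for illustration. With bs4 installed, the corresponding call would be BeautifulSoup(html, "html.parser").title.string.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TITLE> matches too.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        text = "".join(self._parts).strip()
        return text or None


def extract_title(html):
    """Return the page title from an HTML string, or None if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.title


def fetch_title(url):
    """Download a page and return its title (needs network access)."""
    with urlopen(url) as response:
        # Assumption: fall back to UTF-8 when the server declares no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)


# Offline demonstration on a literal string
print(extract_title("<html><head><title>Example</title></head></html>"))  # prints: Example
```

Calling fetch_title('http://www.example.com') would do the download step as well. This sketch only covers the title; if more of the page is needed later, as the question anticipates, a full parser such as BeautifulSoup is likely the more comfortable choice.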

Archived:	16 years, 7 months ago
Views:	5469
Last activity:	11 years, 6 months ago