PSr*_*raj -1 python beautifulsoup
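I'm trying to scrape some rating data from TripAdvisor, but when I try to fetch the data I get
'NoneType' object is not subscriptable
Can anyone help me figure out where I'm going wrong? Sorry, I'm very new to Python.
Here is my sample code: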
import requests
import re
from bs4 import BeautifulSoup
r = requests.get('http://www.tripadvisor.in/Hotels-g186338-London_England-Hotels.html')
data = r.text
soup = BeautifulSoup(data)
for rate in soup.find_all('div',{"class":"rating"}):
    print (rate.img['alt'])
The output is as follows:
4.5 of 5 stars
4.5 of 5 stars
4 of 5 stars
4.5 of 5 stars
4.5 of 5 stars
4 of 5 stars
4.5 of 5 stars
4.5 of 5 stars
4.5 of 5 stars
Traceback (most recent call last):
  File "<ipython-input-52-7460e8bfcb82>", line 3, in <module>
    print (rate.img['alt'])
TypeError: 'NoneType' object is not subscriptable
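Not all of your <div class="rating"> tags contain an <img /> tag, so for those rate.img is None, and indexing None with ['alt'] raises the TypeError.
Those divs look like this: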
<div class="rating">
    <span class="rate">4.5 out of 5, </span>
    <em>2,294 Reviews</em>
    <br/>
    <div class="posted">Last reviewed 25 Sep 2015</div>
</div>
You can test for this:
if rate.img is not None:
    # ...
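As a minimal sketch of that check in a complete loop: the snippet below parses a small inline HTML sample rather than the live page, so it runs without network access. The sample markup, the SAMPLE_HTML name, and the ratings_with_guard helper are assumptions made for illustration; the real TripAdvisor markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one rating div with an <img>, one without.
SAMPLE_HTML = """
<div class="rating"><img alt="4.5 of 5 stars" src="stars.png"/></div>
<div class="rating">
  <span class="rate">4.5 out of 5, </span>
  <em>2,294 Reviews</em>
</div>
"""


def ratings_with_guard(html):
    """Return the alt text of each rating image, skipping divs with no <img>."""
    soup = BeautifulSoup(html, "html.parser")
    ratings = []
    for rate in soup.find_all("div", {"class": "rating"}):
        # rate.img is None when the div has no <img> descendant,
        # so check before subscripting it.
        if rate.img is not None:
            ratings.append(rate.img["alt"])
    return ratings


print(ratings_with_guard(SAMPLE_HTML))  # ['4.5 of 5 stars']
```

The second div is skipped silently here; if you need to know which entries had no image, you could append a placeholder in an else branch instead.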
Or use a CSS selector to pick only the images under div.rating tags:
for img in soup.select('div.rating img[alt]'):
    print(img['alt'])
The selector here picks out <img/> tags that have an alt attribute and are nested inside a <div class="rating"> tag.
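To see what the selector does and does not match, here is a small sketch run against inline HTML. The SELECTOR_SAMPLE markup and the ratings_by_selector helper are hypothetical and exist only for this illustration; the sample covers an image without alt, a rating div with no image, and an image outside any rating div.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup covering the cases the selector must tell apart.
SELECTOR_SAMPLE = """
<div class="rating"><img alt="4 of 5 stars" src="a.png"/></div>
<div class="rating"><img src="no-alt.png"/></div>
<div class="rating"><span class="rate">4.5 out of 5, </span></div>
<div class="other"><img alt="logo" src="logo.png"/></div>
"""


def ratings_by_selector(html):
    """Collect alt text from <img alt=...> tags nested in div.rating."""
    soup = BeautifulSoup(html, "html.parser")
    # 'div.rating img[alt]' matches only images that have an alt attribute
    # and sit somewhere inside a <div class="rating">.
    return [img["alt"] for img in soup.select("div.rating img[alt]")]


print(ratings_by_selector(SELECTOR_SAMPLE))  # ['4 of 5 stars']
```

Because the selector already filters out divs without a suitable image, no None check is needed inside the loop.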