Sto*_*ace 9 html python beautifulsoup html-parsing
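I'm trying to get my feet wet with BS (Beautiful Soup). I tried to work through the documentation, but I ran into a problem at the very first step.
Here is my code: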
from bs4 import BeautifulSoup
soup = BeautifulSoup('https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5....1b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description')
print(soup.prettify())
This is the response I get:
Warning (from warnings module):
File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/bs4/__init__.py", line 189
'"%s" looks like a URL. Beautiful Soup is not an HTTP client. You should probably use an
HTTP client to get the document behind the URL, and feed that document to Beautiful Soup.' % markup)
UserWarning: "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5...b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description"
looks like a URL. Beautiful Soup is not an HTTP client. You should
probably use an HTTP client to get the document behind the URL, and feed that document
to Beautiful Soup.
https://api.flickr.com/services/rest/?method=flickr.photos.search&api;_key=5...b&per;_page=250&accuracy;=1&has;_geo=1&extras;=geo,tags,views,description
Is it because I'm trying to call http**s**, or is it a different problem? Thanks for your help!
ale*_*cxe 13
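You are passing the URL itself as a string, so Beautiful Soup parses the URL text as if it were the markup. Instead, you need to fetch the page source first, via urllib2 (urllib.request on Python 3) or requests: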
from urllib.request import urlopen  # Python 3 (as in the question); on Python 2: from urllib2 import urlopen
from bs4 import BeautifulSoup
URL = 'https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5....1b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description'
soup = BeautifulSoup(urlopen(URL))
Note that you don't need to call read() on the result of urlopen(): BeautifulSoup allows its first argument to be a file-like object, and urlopen() returns a file-like object.
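To make the difference concrete, here is a minimal offline sketch. It assumes the stdlib "html.parser" backend, and `SAMPLE` is a made-up response body standing in for whatever the server would return (it is not real Flickr output). An `io.BytesIO` object plays the role of the file-like object that urlopen() would hand back, so nothing here touches the network.

```python
import io
import warnings

from bs4 import BeautifulSoup

# Case 1: a URL passed as a plain string is parsed as markup, not fetched.
# example.com is a placeholder URL used only for illustration.
url_text = "https://example.com/?a=1"
with warnings.catch_warnings():
    # Silence the "looks like a URL" warning for this demonstration.
    warnings.simplefilter("ignore")
    soup_from_url = BeautifulSoup(url_text, "html.parser")

# There are no tags at all; the "document" is just the URL text.
no_tags_found = soup_from_url.find(True) is None
url_as_text = soup_from_url.get_text()

# Case 2: a file-like object is read and parsed.
# SAMPLE is a hypothetical response body, not actual API output.
SAMPLE = b'<rsp stat="ok"><photos><photo id="1" title="Sunset"></photo></photos></rsp>'
fake_response = io.BytesIO(SAMPLE)  # stands in for urlopen(URL)
soup_from_file = BeautifulSoup(fake_response, "html.parser")

first_photo = soup_from_file.find("photo")
photo_title = first_photo["title"]

print(no_tags_found, url_as_text)
print(photo_title)
```

With a real request, `fake_response` would simply be replaced by `urlopen(URL)`; naming the parser explicitly also avoids the separate warning that newer Beautiful Soup versions emit when no parser is specified.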