I have a free account on PythonAnywhere, where I am trying to run the following script. It works fine locally.
I am wondering whether the error I get has a technical cause, or whether PythonAnywhere simply forbids scraping certain websites from its platform.
Do you know of other free hosting sites where I would be allowed to scrape anything?
import requests
from bs4 import BeautifulSoup as bs

def scrapMarketwatch(address):
    # fetch the page and parse the raw HTML
    r = requests.get(address)
    c = r.content
    sup = bs(c, "html.parser")
    print(sup)

scrapMarketwatch('http://www.marketwatch.com/investing/future/sp%20500%20futures')
print('\n\n\n PARAGRAPH \n SPACE \n\n\n')
scrapMarketwatch('https://www.bloomberg.com/quote/USDJPY:CUR')
I get the following error:
  File "/usr/local/lib/python3.6/dist-packages/requests/packages/urllib3/util/retry.py", line 376, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bloomberg.com', port=443): Max retries exceeded with url: /quote/USDJPY:CUR (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden',)))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/sylvester83/scrapit/try2.py", line 20, in <module>
    scrapMarketwatch('https://www.bloomberg.com/quote/USDJPY:CUR')
  File "/home/sylvester83/scrapit/try2.py", line 10, in scrapMarketwatch
    r = requests.get(address)
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 70, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 56, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 485, in send
    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.bloomberg.com', port=443): Max retries exceeded with url: /quote/USDJPY:CUR (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden',)))
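The traceback says the request went through a proxy and that the proxy refused the tunnel with `403 Forbidden`. That looks more like an outbound restriction on the hosting side than a bug in the script. As a rough way to see which addresses are affected, the sketch below catches `requests.exceptions.ProxyError` separately from other request failures. The helper name `check_access` and its `fetch` parameter are hypothetical, introduced only so the check can be exercised without real network access.

```python
import requests


def check_access(urls, fetch=requests.get, timeout=10):
    """Try each URL and report whether a proxy refused the connection.

    `fetch` defaults to requests.get; it is a parameter only so that a
    stand-in function can be supplied when testing (an assumption of
    this sketch, not something from the original script).
    """
    results = {}
    for url in urls:
        try:
            response = fetch(url, timeout=timeout)
            results[url] = f"ok (HTTP {response.status_code})"
        except requests.exceptions.ProxyError:
            # Raised when the proxy itself rejects or cannot open the tunnel,
            # as in the "Tunnel connection failed: 403 Forbidden" case above.
            results[url] = "blocked by proxy"
        except requests.exceptions.RequestException as exc:
            # Any other requests failure (DNS, timeout, SSL, ...).
            results[url] = f"failed: {type(exc).__name__}"
    return results
```

Calling `check_access([...])` with the two addresses from the script should show whether only the Bloomberg URL is refused by the proxy or whether both are.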