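I set out to build a basic web scraper for Sneakersnstuff.com, but an error stopped me almost immediately. When I request the URL https://www.sneakersnstuff.com/, I do not get the site's HTML, or even the entry CAPTCHA. Instead, I am redirected to a Cloudflare page with the error message "enable cookies". My code and the response are shown below: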
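import requests
import cfscrape  # imported, but not used in this snippet

# Plain requests session; no Cloudflare handling is applied here
session = requests.session()
response = session.get('https://www.sneakersnstuff.com/')
# Print the response body, which is the HTML shown below
print(response.text)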
<!DOCTYPE html>
<!--[if lt IE 7]> <html class="no-js ie6 oldie" lang="en-US"> <![endif]-->
<!--[if IE 7]> <html class="no-js ie7 oldie" lang="en-US"> <![endif]-->
<!--[if IE 8]> <html class="no-js ie8 oldie" lang="en-US"> <![endif]-->
<!--[if gt IE 8]><!-->
<html class="no-js" lang="en-US">
<!--<![endif]-->
<head>
<title>Access denied | www.sneakersnstuff.com used Cloudflare to restrict access</title>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" …
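
To confirm programmatically that a response like the one above is Cloudflare's block page rather than the store's own HTML, a small heuristic check could look like the sketch below. The helper name `looks_like_cloudflare_block` is hypothetical, and treating any 4xx/5xx status as a sign of blocking is an assumption, since the status code is not shown in the output above. The sample title is copied from that output.

```python
import re


def looks_like_cloudflare_block(status_code: int, body: str) -> bool:
    """Heuristic (hypothetical helper): does this look like a Cloudflare block page?"""
    # Pull the <title> text out of the HTML, if there is one.
    match = re.search(r"<title>(.*?)</title>", body, re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else ""
    # Assumption: block pages carry an error status and mention Cloudflare in the title.
    return status_code >= 400 and "cloudflare" in title.lower()


# Offline sample built from the title in the response above.
sample_body = (
    "<html><head><title>Access denied | www.sneakersnstuff.com "
    "used Cloudflare to restrict access</title></head></html>"
)
blocked = looks_like_cloudflare_block(403, sample_body)
print(blocked)
```

With a live response, the call would be `looks_like_cloudflare_block(response.status_code, response.text)`. Checking the title instead of the whole body keeps the check from firing on ordinary pages that merely load assets from Cloudflare.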