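I'm trying to scrape the site "https://www.ticketweb.com/search?q=". I can see the HTML elements in the browser inspector, but when I download the page with Python requests, all I get back is an error page.
Here is what I have in my script: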
import requests
url_path = r'https://www.ticketweb.com/search?q='
HEADERS = {
    "Accept": "*/*",
    "Accept-Encoding": "utf-8",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36"
}
response = requests.get(url_path, headers=HEADERS)
content = response.text
print(content)
This is the response:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>506 Invalid request</title>
</head>
<body>
<h1>Error 506 Invalid request</h1>
<p>Invalid request</p>
<h3>Error 54113</h3>
<p>Details: cache-dfw-kdfw8210093-DFW 1678372070 120734701</p>
<hr>
<p>Varnish cache server</p>
</body>
</html>