I'm trying to use reqwest to perform an HTTP GET request and print the response body to STDOUT. This works for most websites, but for amazon.com it returns strange binary output:
    #[tokio::main]
    async fn main() {
        run().await;
    }

    async fn run() {
        let url = "https://www.amazon.com/PNY-GeForce-Gaming-Overclocked-Graphics/dp/B07GJ7TV8L/";
        let resp = reqwest::get(url).await.unwrap();
        let text = resp.text().await.unwrap();
        println!("{}", text);
    }
Why does resp.text().await.unwrap() return binary data, and how can I get the normal HTTP body from it?
Running curl against the same URL returns the HTML I expect.
If you run curl https://www.amazon.com/PNY-GeForce-Gaming-Overclocked-Graphics/dp/B07GJ7TV8L/ -I you will see:
server: Server
content-type: text/html
content-length: 2148
content-encoding: gzip
x-amz-rid: 2T9PBCY66S79SMC424V2
vary: Accept-Encoding
akamai-cache-status: Miss
date: Sat, 29 Feb 2020 22:23:54 GMT
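The content-encoding: gzip header makes it fairly clear what is going on: the body is gzip-compressed, so printing it without decoding looks like binary data. Check out the gzip support in reqwest. gzip is an optional Cargo feature (see the Cargo documentation on features); for reqwest you can enable it by putting reqwest = { version = "0.10.3", features = ["gzip"] } in your Cargo.toml. With that feature enabled, reqwest should decompress gzip-encoded response bodies automatically, so resp.text() returns the HTML.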
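To confirm that "binary" output like this really is a gzip stream, you can check the raw body against the gzip magic number (the bytes 0x1f 0x8b, per RFC 1952) and look for gzip in the Content-Encoding header. Below is a minimal std-only sketch of that check; the helper names looks_like_gzip and declares_gzip are hypothetical and are not part of reqwest:

```rust
/// Hypothetical helper: returns true if `body` starts with the gzip
/// magic number (0x1f, 0x8b) defined in RFC 1952.
fn looks_like_gzip(body: &[u8]) -> bool {
    body.starts_with(&[0x1f, 0x8b])
}

/// Hypothetical helper: returns true if any raw header line of the form
/// "name: value" is a Content-Encoding header that lists gzip.
/// Header names and encoding tokens are compared case-insensitively.
fn declares_gzip(header_lines: &[&str]) -> bool {
    header_lines.iter().any(|line| match line.split_once(':') {
        Some((name, value)) => {
            name.trim().eq_ignore_ascii_case("content-encoding")
                && value
                    .split(',')
                    .any(|token| token.trim().eq_ignore_ascii_case("gzip"))
        }
        // Lines without a colon are not headers; ignore them.
        None => false,
    })
}
```

One way to use this would be to pass the header lines printed by curl -I to declares_gzip, and the raw bytes of the response (for example from resp.bytes().await in reqwest) to looks_like_gzip. If both return true, the body needs gzip decoding before it can be read as text.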