Tags: perl6, web-scraping, cro, raku
I want to fetch the content of https://translate.google.cn, but Cro::HTTP::Client and HTTP::UserAgent just get stuck, while WWW does fetch the content; I don't know why. If I change $url to https://perl6.org, all three modules work fine:
my $url = "https://translate.google.cn";

use Cro::HTTP::Client;
my $resp = await Cro::HTTP::Client.new(
    headers => [
        User-agent => 'Cro'
    ]
).get($url);
say await $resp.body-text();
use HTTP::UserAgent;
my $ua = HTTP::UserAgent.new;
$ua.timeout = 30;

my $response = $ua.get($url);

if $response.is-success {
    say $response.content;
} else {
    die $response.status-line;
}
use WWW;
say get($url)
Am I missing something? Thanks for any suggestions.
For me, HTTP::UserAgent works and Cro::HTTP::Client gets stuck. If you want to debug further, both modules have a debug option:
perl6 -MHTTP::UserAgent -e 'my $ua = HTTP::UserAgent.new(:debug); say $ua.get("https://translate.google.cn").content'
CRO_TRACE=1 perl6 -MCro::HTTP::Client -e 'my $ua = Cro::HTTP::Client.new(); say $ua.get("https://translate.google.cn").result.body-text.result'
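Independently of the trace output, you can keep a hanging request from blocking the script forever by racing the response promise against a timer. This is a minimal sketch using core Raku's Promise.anyof and Promise.in; the 10-second budget is an arbitrary choice, not anything Cro requires:

use Cro::HTTP::Client;

my $request = Cro::HTTP::Client.new.get('https://translate.google.cn');

# Race the response promise against a 10-second timer so a hang
# surfaces as a message instead of blocking indefinitely.
await Promise.anyof($request, Promise.in(10));
if $request.status ~~ Planned {
    note 'still hanging after 10 seconds';
}
else {
    # This re-throws if the request promise was broken (e.g. a 4xx/5xx).
    my $resp = await $request;
    say await $resp.body-text;
}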
WWW works for me too. It's surprising that it works for you, since it is backed by HTTP::UserAgent (which doesn't work for you). Here is its get subroutine, showing how it uses HTTP::UserAgent:
sub get ($url, *%headers) is export(:DEFAULT, :extras) {
    CATCH { .fail }
    %headers<User-Agent> //= 'Rakudo WWW';
    with HTTP::UserAgent.new.get: $url, |%headers {
        .is-success or fail .&err;
        .decoded-content
    }
}
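If debugging points at the server treating unknown agents differently (an assumption, not a confirmed cause of the hang), a browser-like User-Agent string may be worth trying. HTTP::UserAgent ships shortcut names for common browser strings via HTTP::UserAgent::Common; a small sketch assuming chrome_linux is among them:

use HTTP::UserAgent;

# chrome_linux is one of the User-Agent shortcut names bundled with
# HTTP::UserAgent (see HTTP::UserAgent::Common).
my $ua = HTTP::UserAgent.new(:useragent<chrome_linux>);
$ua.timeout = 30;

my $response = $ua.get('https://translate.google.cn');
say $response.is-success ?? $response.decoded-content !! $response.status-line;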