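I want to collect some post titles from Reddit for analysis. After repeatedly debugging my code, I was able to retrieve some post titles. Then I suddenly started getting a 403 Forbidden error when trying to collect posts with PRAW. The explanation I found online is: "Access to the page or resource you were trying to reach is absolutely forbidden. In other words, a 403 error means that you don't have permission to view whatever it is you tried to view." Please tell me what I should do. Thanks.
Try adding some headers and using a time delay: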
import random
import urllib.request

url = "https://www.reddit.com"
# Pool of User-Agent strings; one is picked at random for each request.
my_headers = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.64 Safari/537.31",
]

def get_content(url, headers):
    """Fetch url with a randomly chosen User-Agent and return the raw bytes."""
    random_header = random.choice(headers)
    req = urllib.request.Request(url)  # urllib.request is the Python 3 home of Request/urlopen
    req.add_header("User-Agent", random_header)
    req.add_header("Host", "www.reddit.com")
    req.add_header("Referer", "https://www.reddit.com")
    content = urllib.request.urlopen(req).read()
    return content

print(get_content(url, my_headers))
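The answer mentions a time delay, but the snippet above does not include one. Here is a minimal sketch of what that could look like, assuming the 403 (or a 429) is a temporary rate-limit response rather than a permanent block. The function name fetch_with_delay, its parameters, and the backoff values are illustrative choices, not part of the original answer or of PRAW.

```python
import random
import time
import urllib.error
import urllib.request


def fetch_with_delay(url, user_agent, retries=3, base_delay=2.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, pausing and retrying when the server answers 403 or 429.

    opener and sleep are injectable (hypothetical parameters) so the
    function can be exercised without network access or real waiting.
    """
    for attempt in range(retries):
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        try:
            with opener(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Re-raise other HTTP errors, and give up after the last attempt.
            if err.code not in (403, 429) or attempt == retries - 1:
                raise
            # Exponential backoff (2 s, 4 s, ...) plus up to 1 s of jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

A call might look like fetch_with_delay("https://www.reddit.com", "my-title-collector/0.1"), where the User-Agent string is a made-up placeholder. With PRAW itself, it is probably worth first checking that the user_agent passed to praw.Reddit is a unique, descriptive string, since Reddit's API rules ask for one; whether that is the cause of this particular 403 is an assumption.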
I want to use take_while on an iterator and then count how many items are in the resulting iterator. Here is a simple toy program that demonstrates what I am trying to do:
fn main() {
    let v = vec![1, 2, 3, 4, 5, 4, 3];
    let num_before_five = v.iter().take_while(|&&x| x != 5).len();
    println!("There are {} items before 5 occurs.", num_before_five);
}
When I try to compile it, I get the following error:
error[E0599]: no method named `len` found for type `std::iter::TakeWhile<std::slice::Iter<'_, {integer}>, [closure@src/main.rs:3:47: 3:59]>` in the current scope
 --> src/main.rs:3:61
  |
3 |     let num_before_five = v.iter().take_while(|&&x| x != 5).len();
  |                                                             ^^^ method not found in `std::iter::TakeWhile<std::slice::Iter<'_, {integer}>, [closure@src/main.rs:3:47: 3:59]>`
The error says that std::iter::TakeWhile has no .len() method, which is true. An arbitrary iterator might never terminate, but since this iterator comes from a Vec …