Posts by Aou*_*000

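How to avoid HTTP error 429 (Too Many Requests) in Python

I am trying to use Python to log in to a website and collect information from several web pages, and I get the following error: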

Traceback (most recent call last):
  File "extract_test.py", line 43, in <module>
    response=br.open(v)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
    return self._mech_open(url, data, timeout=timeout)
  File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 255, in _mech_open
    raise response
mechanize._response.httperror_seek_wrapper: HTTP Error 429: Unknown Response Code

I used time.sleep() and it works, but it seems neither smart nor reliable. Is there any other way to avoid this error?

Here is my code:

import mechanize
import cookielib
import re
first=("example.com/page1")
second=("example.com/page2")
third=("example.com/page3")
fourth=("example.com/page4")
# I have seven URLs I want to open

urls_list=[first,second,third,fourth]

br = mechanize.Browser()
# Cookie Jar
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

# Browser options 
br.set_handle_equiv(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)

# …
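One common alternative to a fixed time.sleep() is to retry only when the server actually answers 429, waiting for the number of seconds given in the Retry-After header when it is present and backing off exponentially otherwise. The sketch below is a minimal illustration written for Python 3, not the original poster's code: `open_with_backoff`, `max_retries`, and `base_delay` are hypothetical names, and it assumes the opener raises `urllib.error.HTTPError` (or a subclass of it) on a 429 response.

```python
import time
from urllib.error import HTTPError


def open_with_backoff(open_url, url, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call open_url(url), retrying on HTTP 429 with growing pauses.

    open_url is any callable that raises HTTPError on failure (assumption).
    """
    for attempt in range(max_retries + 1):
        try:
            return open_url(url)
        except HTTPError as err:
            # Re-raise anything that is not a rate limit, or once retries run out.
            if err.code != 429 or attempt == max_retries:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after is not None and retry_after.strip().isdigit():
                # The server said how many seconds to wait.
                delay = float(retry_after)
            else:
                # No usable hint: wait 1x, 2x, 4x, ... the base delay.
                delay = base_delay * (2 ** attempt)
            sleep(delay)
```

With the code above, the call in the loop would become something like `response = open_with_backoff(br.open, v)`, assuming the mechanize error is an HTTPError subclass in the installed version. On Python 2.7 the import would come from urllib2 instead. Retry-After can also be an HTTP date; this sketch only handles the plain-seconds form and falls back to exponential backoff for anything else.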

python http mechanize http-status-code-429

75
Score
6
Answers
190k
Views

How to use itertools.groupby() to get the index and occurrence count of each item

Here is the story. I have two lists:

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]

I want to find the indices of the consecutive 9s in list_one so that I can get the corresponding strings from list_two. I tried:

group_list_one= [(k, sum(1 for i in g),pdn.index(k)) for k,g in groupby(list_one)]

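I was hoping to get the index of the first 9 in each tuple and then go from there, but that did not work.

What can I do here? PS: I have looked at the itertools documentation, but it seems vague to me. Thanks in advance.

Edit: the expected output is (key, occurrences, index_of_first_occurrence), something like: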

[(9, 3, 2), (9, 4, 7)]
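One way to get the start index of each run is to keep a running position while iterating over the groups, since groupby() itself yields only the key and the grouped items. The sketch below is one possible approach rather than an accepted answer; `runs_with_index` is a hypothetical helper name.

```python
from itertools import groupby


def runs_with_index(values):
    """Return (key, run_length, index_of_first_item) for each run of equal items."""
    result = []
    index = 0  # position in values where the current run starts
    for key, group in groupby(values):
        length = sum(1 for _ in group)
        result.append((key, length, index))
        index += length
    return result


list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]
list_two = ["A", "B", "C", "D", "A", "E", "F", "G", "H", "Word1", "Word2"]

# Keep only the runs of 9s.
nines = [run for run in runs_with_index(list_one) if run[0] == 9]
print(nines)  # [(9, 3, 2), (9, 4, 7)]

# Slice list_two with each run's start index and length.
words = [list_two[start:start + length] for _, length, start in nines]
print(words)  # [['C', 'D', 'A'], ['G', 'H', 'Word1', 'Word2']]
```

The original attempt cannot produce this with index(k), because list.index() always returns the position of the first matching element in the whole list, not the start of the current run.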

python zip functional-programming list-comprehension python-itertools

4
Score
1
Answers
1630
Views

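How to use the any function to check whether a variable matches any item in a list?

Edit: here is what I am trying to do. I ask the user to enter a month. The code then checks each item in months_list to see whether the month is valid. If it is not found, I want the user to enter the month again.

Here is the code: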

months_list=["January", "February", "March", "April", "May", "June", "July"]
answer=raw_input("Month? \n")
while any(item.lower() != answer.lower() for item in months_list):
    print("Sorry, didn't recognize your answer, try again")
    answer=raw_input("Type in Month\n")

However, this keeps looping whether or not the month is found in the list. I hope this clarifies things. Thanks in advance, everyone.
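The loop never ends because any(item != answer ...) is true as soon as at least one month differs from the answer, which is always the case when the list holds more than one distinct month. A likely fix is to test for equality and negate the result. The sketch below uses Python 3 (input() instead of raw_input()); `is_known_month`, `ask_month`, and the `read` parameter are hypothetical names added for illustration.

```python
months_list = ["January", "February", "March", "April", "May", "June", "July"]


def is_known_month(answer, months):
    """True if answer matches any month, ignoring case."""
    return any(item.lower() == answer.lower() for item in months)


def ask_month(months, read=input):
    """Keep prompting until the reply matches one of the months."""
    answer = read("Month? \n")
    while not is_known_month(answer, months):
        print("Sorry, didn't recognize your answer, try again")
        answer = read("Type in Month\n")
    return answer
```

Calling `ask_month(months_list)` prompts until a listed month is typed. An equivalent condition is `all(item.lower() != answer.lower() for item in months)`, and a membership test such as `answer.lower() in [m.lower() for m in months]` would avoid any() altogether.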

python python-2.x

3
Score
3
Answers
30k
Views