grequests with a pool of multiple requests.Session objects?

Mar*_*son 7 python sockets gevent grequests

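I want to make a lot of URL requests to a REST web service, typically between 75k and 90k. However, I need to limit the number of concurrent connections to the web service.

I started out playing with grequests in the following way, but it quickly started chewing through open sockets.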

import grequests

concurrent_limit = 30
urls = buildUrls()
hdrs = {'Host' : 'hostserver'}
g_requests = (grequests.get(url, headers=hdrs) for url in urls)
g_responses = grequests.map(g_requests, size=concurrent_limit)

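After running for a minute or so, I hit "maximum number of sockets reached" errors. As far as I can tell, each requests.get call in grequests uses its own session, which means a new socket is opened for every request.

I found a note on GitHub about how to make grequests use a single session. But that seems to effectively bottleneck all the requests into one shared pool, which seems to defeat the purpose of asynchronous HTTP requests.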

s = requests.session()
rs = [grequests.get(url, session=s) for url in urls]
grequests.map(rs)

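Is it possible to use grequests or gevent.Pool in a way that creates a number of sessions?

Put another way: how can I make many concurrent HTTP requests using either queuing or connection pooling?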

Mar*_*son 7

I ended up not using grequests to solve my problem. I'm still hopeful it might be possible.

I used threading:

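from queue import Queue
from threading import Thread

import requests

CONCURRENT = 30      # connection limit, same value as in the question
host = 'hostserver'  # Host header value, same as in the question

class MyAwesomeThread(Thread):
    """
    Threading wrapper to handle counting and processing of tasks
    """
    def __init__(self, session, q):
        Thread.__init__(self)
        self.q = q
        self.count = 0
        self.session = session
        self.response = None

    def run(self):
        """TASK RUN BY THREADING"""
        while True:
            url, host = self.q.get()
            httpHeaders = {'Host' : host}
            try:
                # use this thread's own session so its connection is reused
                self.response = self.session.get(url, headers=httpHeaders)
                # handle response here
                self.count += 1
            except requests.RequestException:
                pass  # a failed request must not kill the worker
            finally:
                self.q.task_done()

q = Queue()
threads = []
for i in range(CONCURRENT):
    session = requests.session()  # one session per thread
    t = MyAwesomeThread(session, q)
    t.daemon = True # allows us to send an interrupt
    threads.append(t)


## build urls and add them to the Queue
for url in buildurls():
    q.put_nowait((url, host))

## start the threads
for t in threads:
    t.start()

## block until every queued url has been processed
q.join()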
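For what it's worth, the same idea (a fixed number of workers, each holding its own session) can probably be written more compactly with the standard library's thread pool. This is only a minimal sketch, not something I ran against the real service: `fetch_all` and `session_factory` are names I made up for illustration, and the session object is assumed to expose a `get(url, headers=...)` method the way `requests.Session` does.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, host, concurrency, session_factory):
    """Fetch every URL with at most `concurrency` worker threads.

    `session_factory` is a hypothetical parameter: any zero-argument
    callable returning an object with a requests-style
    get(url, headers=...) method.
    """
    local = threading.local()

    def fetch(url):
        # Lazily create one session per worker thread and reuse it for
        # every request that thread handles.
        if not hasattr(local, "session"):
            local.session = session_factory()
        return local.session.get(url, headers={"Host": host})

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() keeps the results in the same order as `urls` and
        # blocks until all requests have finished.
        return list(pool.map(fetch, urls))
```

With requests installed, a call would look something like `fetch_all(buildurls(), host, CONCURRENT, requests.Session)`. Since no more than `concurrency` threads exist, no more than that many sessions get created, which should keep the number of open sockets bounded in the same way as the queue-based version above.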