Related questions and solutions (0)

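import grequests

def print_res(res, **kwargs):
    # requests 1.x and later pass extra keyword arguments to response hooks
    from pprint import pprint
    pprint(vars(res))

req = grequests.get('http://www.codehenge.net/blog', hooks=dict(response=print_res))
res = grequests.map([req])  # returns only after the response has arrived

for i in range(10):
    print(i)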


The code above produces the following output:

<...large HTTP response output...>

0
1
2
3
4
5
6
7
8
9


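The grequests.map() call obviously blocks until the HTTP response is available. It seems I misunderstood the "asynchronous" behavior here, and the grequests library is only meant for performing multiple HTTP requests concurrently and sending all the responses to a single callback. Is that accurate?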
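The behavior described above can be reproduced without grequests. The following is only a sketch under stated assumptions: it swaps gevent for the standard-library `concurrent.futures.ThreadPoolExecutor`, and `fake_fetch` is a hypothetical stand-in that sleeps instead of making an HTTP request. It illustrates that a map-style call can overlap the requests and still block the caller until every result is ready, while `submit` returns at once and lets other work run in the meantime.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.2):
    """Hypothetical stand-in for an HTTP GET: sleeps, then returns a fake body."""
    time.sleep(delay)
    return "response from " + url


urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

with ThreadPoolExecutor(max_workers=3) as pool:
    # Map-style: the three calls overlap, but list() waits for all of them,
    # much like grequests.map() waits for every response.
    start = time.monotonic()
    mapped = list(pool.map(fake_fetch, urls))
    map_elapsed = time.monotonic() - start

    # Submit-style: submit() returns a Future immediately, so this loop
    # runs while the fake requests are still in flight.
    futures = [pool.submit(fake_fetch, url) for url in urls]
    pending_during_loop = []
    for _ in range(3):
        pending_during_loop.append(any(not f.done() for f in futures))

    # Block only at the point where the results are actually needed.
    submitted = [f.result() for f in futures]
```

In this sketch the caller decides when to wait: with `pool.map` the wait happens right away, with `submit` it is deferred until `result()` is called.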

python gevent python-requests grequests

cac*_*ois

2018 09-15

39
votes

2
answers

30k
views

How do I enable asynchronous mode for requests?

For this code:

import sys

import gevent
from gevent import monkey

monkey.patch_all()

import requests
import urllib2

def worker(url, use_urllib2=False):
    if use_urllib2:
        content = urllib2.urlopen(url).read().lower()
    else:
        content = requests.get(url, prefetch=True).content.lower()
    title = content.split('<title>')[1].split('</title>')[0].strip()

urls = ['http://www.mail.ru']*5

def by_requests():
    jobs = [gevent.spawn(worker, url) for url in urls]
    gevent.joinall(jobs)

def by_urllib2():
    jobs = [gevent.spawn(worker, url, True) for url in urls]
    gevent.joinall(jobs)

if __name__=='__main__':
    from timeit import Timer
    t = Timer(stmt="by_requests()", setup="from __main__ import by_requests")  
    print 'by requests: %s seconds'%t.timeit(number=3)
    t = Timer(stmt="by_urllib2()", setup="from __main__ import …


python asynchronous urllib2 gevent python-requests

use*_*901

2012 11-25

17
votes

3
answers

20k
views

Downloading files asynchronously and efficiently with requests

I want to download files as fast as possible with Python. Here is my code:

import pandas as pd
import requests
from requests_futures.sessions import FuturesSession
import os
import pathlib
from timeit import default_timer as timer


class AsyncDownloader:
    """Download files asynchronously"""

    __urls = set()
    __dest_path = None
    __user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0'
    __read_timeout = 60
    __connection_timeout = 30
    __download_count = 0  # unlimited
    # http://www.browserscope.org/?category=network
    __worker_count = 17  # No of threads to spawn
    __chunk_size = 1024
    __download_time = -1
    __errors = []

    # TODO Fetch only content of …


python performance python-3.x python-requests requests-futures

use*_*277

2018 02-06

14
votes

2
answers

2894
views

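What is the ideal way to send multiple HTTP requests from Python?

Possible duplicate:
Multiple (asynchronous) connections with urllib2 or another HTTP library?

I am developing a Linux web server that runs Python code to fetch real-time data over HTTP from a third-party API. The data is stored in a MySQL database. I need to make a lot of queries to a lot of URLs, and I need to do it fast (faster = better). I currently use urllib3 as my HTTP library. What is the best way to go about this? Should I spawn multiple threads (if so, how many?) and have each one query a different URL? I would love to hear your thoughts on this. Thanks!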

python concurrency http httprequest

use*_*786

2017 05-23

10
votes

1
answers

30k
views

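Is Python Requests non-blocking?

Possible duplicate:
Asynchronous requests with Python Requests

Is the Python requests module non-blocking? I don't see anything in the documentation about blocking or non-blocking behavior.

If it is blocking, which module would you suggest?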

python python-requests

Jef*_*eff

2017 05-23

9
votes

1
answers

20k
views

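HTTP requests with a timeout, a maximum size, and connection pooling

I'm looking for a way in Python (2.7) to perform HTTP requests with 3 requirements:

Timeout (for reliability)
Maximum content size (for security)
Connection pooling (for performance)

I have checked nearly all the Python HTTP libraries, but none of them meets my requirements. For example:

urllib2: good, but no pooling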

import urllib2
import json

r = urllib2.urlopen('https://github.com/timeline.json', timeout=5)
content = r.read(100+1)
if len(content) > 100: 
    print 'too large'
    r.close()
else:
    print json.loads(content)

r = urllib2.urlopen('https://github.com/timeline.json', timeout=5)
content = r.read(100000+1)
if len(content) > 100000: 
    print 'too large'
    r.close()
else:
    print json.loads(content)


requests: no maximum size

import requests
r = requests.get('https://github.com/timeline.json', timeout=5, stream=True)
r.headers['content-length'] # does not exist for this request, and not safe
content = r.raw.read(100000+1)
print content # ARF this is gzipped, so not the real …
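
One way the three requirements might be combined is sketched below. This is an illustration under assumptions, not a vetted solution: `read_capped`, `fetch_capped` and `ResponseTooLarge` are hypothetical names, and the caller is assumed to pass in a `requests.Session` so that connections are reused. The size cap is enforced while streaming with `iter_content`, which yields the decoded body rather than the raw gzipped bytes.

```python
import io


class ResponseTooLarge(Exception):
    """Raised when a body exceeds the allowed size (hypothetical helper type)."""


def read_capped(chunks, max_bytes):
    """Join byte chunks, aborting as soon as the total exceeds max_bytes."""
    buf = io.BytesIO()
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise ResponseTooLarge("body exceeds %d bytes" % max_bytes)
        buf.write(chunk)
    return buf.getvalue()


def fetch_capped(session, url, max_bytes=100000, timeout=5):
    """Assumed usage with a requests.Session supplied by the caller.

    The session reuses connections (pooling), timeout bounds the connect
    and each read, and stream=True defers the body so it can be read in
    chunks and abandoned early.
    """
    resp = session.get(url, timeout=timeout, stream=True)
    try:
        return read_capped(resp.iter_content(chunk_size=8192), max_bytes)
    finally:
        resp.close()  # always release the underlying connection
```

A call would look like `fetch_capped(requests.Session(), url)`. Note that the requests timeout limits the wait for the connection and for each read, not the total download time, so a slow trickle of data could still take longer than 5 seconds overall.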


python timeout connection-pooling http max-size

Aur*_*ert

2014 05-07

8
votes

1
answers

10k
views

Problem with hooks when using the Requests Python package

I am using the requests module, and when I started using hooks I got this message.

File "/Library/Python/2.7/site-packages/requests-1.1.0-py2.7.egg/requests/sessions.py", line 321, in request
resp = self.send(prep, **send_kwargs)

File "/Library/Python/2.7/site-packages/requests-1.1.0-py2.7.egg/requests/sessions.py", line 426, in send
r = dispatch_hook('response', hooks, r, **kwargs)

File "/Library/Python/2.7/site-packages/requests-1.1.0-py2.7.egg/requests/hooks.py", line 41, in dispatch_hook
_hook_data = hook(hook_data, **kwargs)
TypeError: hook() got an unexpected keyword argument 'verify'


Here is my code (simplified):

import requests
def hook(r):
     print r.json()

r = requests.get("http://search.twitter.com/search.json?q=blue%20angels&rpp=5", hooks=dict(response=hook))
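
The traceback shows the hook being invoked as `hook(hook_data, **kwargs)`, so a hook that accepts only the response fails as soon as a keyword argument such as `verify` is passed along. The sketch below reproduces that call shape without the library; `dispatch` is a hypothetical stand-in written for this illustration, not the requests implementation, and the two hook functions are likewise made up.

```python
def dispatch(hook, hook_data, **kwargs):
    """Hypothetical stand-in that mirrors the call shape in the traceback."""
    return hook(hook_data, **kwargs)


def strict_hook(r):
    # Accepts only the response, like the hook in the question.
    return r


def tolerant_hook(r, **kwargs):
    # Accepts and ignores any extra keyword arguments.
    return r


try:
    dispatch(strict_hook, "response", verify=True)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)  # mentions the unexpected keyword 'verify'

tolerant_result = dispatch(tolerant_hook, "response", verify=True)
```

Under that assumption, declaring the hook in the question as `def hook(r, **kwargs):` would likely avoid the TypeError.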


python request typeerror python-2.7 python-3.x

mas*_*cat

lucky-day

6
votes

1
answers

4576
views

How do I run asynchronous tasks in Django?

Suppose I need to query several servers in order to build a response

def view_or_viewset(request):

  d1 = request_a_server() # something like requests.get(url, data)
  d2 = request_b_server()
  d3 = request_c_server()

  d4 = do_something_with(d3)

  return Response({"foo1": d1, "foo2": d2, "foo3": d3, "foo4": d4})


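I am making these requests synchronously, one after another, and I think there must be a better way to handle this situation.

(If the tasks were long-running I would use Celery, but they are not. Even so, making several requests synchronously doesn't seem right.)

What is the recommended paradigm (?) for handling this?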
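
One possible shape for this, sketched with the standard-library thread pool, is shown below. It is an assumption-heavy illustration rather than a recommended Django pattern: the three `request_*_server` functions and `do_something_with` are hypothetical stand-ins for the calls in the question (they sleep instead of calling `requests.get`), and `gather_responses` is a made-up helper name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for the three upstream calls in the question.
def request_a_server():
    time.sleep(0.1)
    return "a"


def request_b_server():
    time.sleep(0.1)
    return "b"


def request_c_server():
    time.sleep(0.1)
    return "c"


def do_something_with(d3):
    return d3.upper()


def gather_responses():
    """Run the three independent calls concurrently, then derive d4 from d3."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        fa = pool.submit(request_a_server)
        fb = pool.submit(request_b_server)
        fc = pool.submit(request_c_server)
        d3 = fc.result()            # d4 depends only on d3
        d4 = do_something_with(d3)  # runs while a and b may still be pending
        d1, d2 = fa.result(), fb.result()
    return {"foo1": d1, "foo2": d2, "foo3": d3, "foo4": d4}


payload = gather_responses()
```

Inside the view, the dictionary could then be returned as `Response(gather_responses())`. The total wait is then roughly that of the slowest call rather than the sum of all three.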