How to run requests.get asynchronously in Python 3 using asyncio?

xbo*_*und 5 python-3.x python-requests python-asyncio

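I'm trying to create a simple web monitoring script that periodically and asynchronously sends GET requests to the URLs in a list. Here is my request function: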

def request(url,timeout=10):
    try:
        response = requests.get(url,timeout=timeout)
        response_time = response.elapsed.total_seconds()
        if response.status_code in (404,500):
            response.raise_for_status()
        html_response = response.text
        soup = BeautifulSoup(html_response,'lxml')
        # process page here
        logger.info("OK {}. Response time: {} seconds".format(url,response_time))
    except requests.exceptions.ConnectionError:
        logger.error('Connection error. {} is down. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.Timeout:
        logger.error('Timeout. {} not responding. Response time: {} seconds'.format(url,response_time))
    except requests.exceptions.HTTPError:
        logger.error('HTTP Error. {} returned status code {}. Response time: {} seconds'.format(url,response.status_code, response_time))
    except requests.exceptions.TooManyRedirects:
        logger.error('Too many redirects for {}. Response time: {} seconds'.format(url,response_time))
    except:
        logger.error('Content requirement not found for {}. Response time: {} seconds'.format(url,response_time))

Here I call this function for all URLs:

def async_requests(delay,urls):
    for url in urls:
        async_task = make_async(request,delay,url,10)
        loop.call_soon(delay,async_task)
    try:
        loop.run_forever()
    finally:
        loop.close()

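The delay parameter is the loop interval; it describes how often the function should be executed. To make request run in a loop, I created something like this: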

def make_async(func,delay,*args,**kwargs):

    def wrapper(*args, **kwargs):
        func(*args, **kwargs)
        loop.call_soon(delay, wrapper)

    return wrapper

Every time I run async_requests, I get this error for each URL:

Exception in callback 1.0(<function mak...x7f1d48dd1730>)
handle: <Handle 1.0(<function mak...x7f1d48dd1730>)>
Traceback (most recent call last):
  File "/usr/lib/python3.5/asyncio/events.py", line 125, in _run
    self._callback(*self._args)
TypeError: 'float' object is not callable
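The TypeError above follows from the signature loop.call_soon(callback, *args): the first argument must be the callback, so passing delay first makes the loop try to call a float. call_soon also takes no delay at all; loop.call_later(delay, callback, *args) does. Below is a minimal sketch of the self-rescheduling pattern with that fix. The names make_periodic and run_for are hypothetical, and run_for stops the loop after a fixed duration only so that the example terminates.

```python
import asyncio


def make_periodic(loop, delay, func, *args):
    """Return a callback that runs func(*args), then reschedules itself.

    make_periodic is a hypothetical helper name, not part of asyncio.
    """
    def wrapper():
        func(*args)
        # call_later takes the delay first, then the callback.
        loop.call_later(delay, wrapper)
    return wrapper


def run_for(duration, delay, func, *args):
    """Run func(*args) every `delay` seconds for roughly `duration` seconds.

    run_for is a hypothetical helper; the stop timer exists only so the
    sketch terminates instead of running forever.
    """
    loop = asyncio.new_event_loop()
    try:
        # call_soon takes the callback first and has no delay argument.
        loop.call_soon(make_periodic(loop, delay, func, *args))
        loop.call_later(duration, loop.stop)
        loop.run_forever()
    finally:
        loop.close()
```

Note that func still runs on the event loop's own thread here, so a blocking call such as requests.get would stall every other callback while it runs.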

Also, the request function is not executed periodically for each URL as expected, and the print call after async_requests is never executed:

async_requests(args.delay,urls)
print("Starting...")

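I know I'm doing something wrong in the code, but I can't figure out how to fix it. I'm a beginner in Python and not very experienced with asyncio. To summarize what I want to achieve:

  • Run request asynchronously and periodically for a particular URL without blocking the main thread.
  • Run async_requests asynchronously, so that I can start a simple HTTP server in the same thread.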

Mik*_*mov 8

except:

A bare except like this also catches service exceptions such as KeyboardInterrupt and SystemExit. Never do that. Instead, write:

except Exception:
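To make the difference concrete, here is a small sketch. The function names are hypothetical and exist only for this illustration. KeyboardInterrupt derives from BaseException rather than Exception, so only the bare except swallows it.

```python
def swallow_everything(exc):
    """Hypothetical helper: raise exc under a bare except."""
    try:
        raise exc
    except:  # bare except: also catches BaseException subclasses
        return "caught"


def swallow_errors_only(exc):
    """Hypothetical helper: raise exc under `except Exception`."""
    try:
        raise exc
    except Exception:  # ordinary errors only
        return "caught"
```

With these definitions, swallow_everything(KeyboardInterrupt()) returns "caught", while swallow_errors_only(KeyboardInterrupt()) lets the interrupt propagate to the caller, which is usually what a long-running monitoring script wants when the user presses Ctrl+C.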

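How to run requests.get asynchronously in Python 3 using asyncio?

requests.get is blocking by nature.

You should either find an asynchronous alternative to requests, such as the aiohttp module: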

import aiohttp

async def get(url):
    # a session per call keeps the example short; reuse one session for many requests
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            return await resp.text()

or run requests.get in a separate thread and await the result asynchronously using loop.run_in_executor:

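import asyncio
import requests
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(2)

async def get(url):
    loop = asyncio.get_event_loop()  # the loop running this coroutine
    response = await loop.run_in_executor(executor, requests.get, url)
    return response.text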
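For the periodic part of the question, one possible approach is to combine run_in_executor with asyncio.sleep inside a coroutine per URL. This is only a sketch under assumptions: blocking_check is a hypothetical stand-in for the question's blocking request function (it sleeps instead of touching the network), monitor and monitor_all are made-up names, and the rounds parameter exists only so the example terminates. It assumes Python 3.7+ for asyncio.run and asyncio.get_running_loop.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_check(url):
    """Hypothetical stand-in for the blocking request(url) function."""
    time.sleep(0.01)  # simulates the time spent in requests.get
    return "OK {}".format(url)


async def monitor(url, delay, executor, rounds):
    """Check one URL `rounds` times, waiting `delay` seconds between checks."""
    loop = asyncio.get_running_loop()
    results = []
    for _ in range(rounds):
        # The blocking call runs in a worker thread; the loop stays free.
        results.append(await loop.run_in_executor(executor, blocking_check, url))
        # asyncio.sleep yields to other tasks instead of blocking the thread.
        await asyncio.sleep(delay)
    return results


async def monitor_all(urls, delay, rounds=2):
    """Monitor all URLs concurrently; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=4) as executor:
        tasks = [monitor(url, delay, executor, rounds) for url in urls]
        return await asyncio.gather(*tasks)
```

A call such as asyncio.run(monitor_all(["a", "b"], delay=0.01)) drives both monitors at once. In a real script the for loop would typically become while True, and other coroutines (for example an asynchronous HTTP server) could run on the same loop alongside the monitors.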