Ste*_*org 5 python asynchronous python-asyncio aiohttp
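Not deleting this, because it contains example code for httpx.
I am trying to use asyncio to parallelize several long-running web requests. Because I am migrating from the requests library, I would like to use the httpx library, since it has a similar API. My environment is the Python 3.7.7 Anaconda distribution on Windows 10, with all required packages installed.
However, although I can use httpx for synchronous web requests (or for async requests that run serially, one after another), I have not managed to run multiple async requests at once, even though this is easy to do with the aiohttp library.
Here is the code that works with aiohttp. (Note that I am running in Jupyter, so I already have an event loop, hence the absence of asyncio.run().)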
import aiohttp
import asyncio
import time
import httpx

async def call_url(session):
    url = "https://services.cancerimagingarchive.net/services/v3/TCIA/query/getCollectionValues"
    response = await session.request(method='GET', url=url)
    #response.raise_for_status()
    return response

for i in range(1,5):
    start = time.time() # start time for timing event
    async with aiohttp.ClientSession() as session: #use aiohttp
    #async with httpx.AsyncClient as session: #use httpx
        await asyncio.gather(*[call_url(session) for x in range(i)])
    print(f'{i} call(s) in {time.time() - start} seconds')
This produces the expected response-time profile:
1 call(s) in 7.9129478931427 seconds
2 call(s) in 8.876991510391235 seconds
3 call(s) in 9.730034589767456 seconds
4 call(s) in 10.630006313323975 seconds
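The nearly flat profile above (four calls take only a little longer than one) is what running requests concurrently with asyncio.gather should look like. Here is a minimal sketch of the same effect, assuming asyncio.sleep as a stand-in for the slow request so that no network access or HTTP library is needed. The names fake_request and timed_batch are hypothetical and do not come from the original post.

```python
import asyncio
import time


async def fake_request(delay: float) -> float:
    # Hypothetical stand-in for a slow web request: waits without blocking the loop.
    await asyncio.sleep(delay)
    return delay


async def timed_batch(n: int, delay: float = 0.2) -> float:
    # Run n fake requests concurrently and return the elapsed wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(fake_request(delay) for _ in range(n)))
    return time.perf_counter() - start


if __name__ == "__main__":
    # In a plain script asyncio.run() starts the event loop.
    for n in range(1, 5):
        elapsed = asyncio.run(timed_batch(n))
        print(f"{n} call(s) in {elapsed:.2f} seconds")
```

Inside Jupyter, where an event loop is already running, the asyncio.run(...) call would be replaced by a plain await timed_batch(n). If the requests ran serially, the elapsed time would grow roughly as n times the delay instead of staying close to a single delay.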
However, if I uncomment async with httpx.AsyncClient as session: #use httpx and comment out async with aiohttp.ClientSession() as session: #use aiohttp (that is, swap in httpx for aiohttp), I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-108-25244245165a> in async-def-wrapper()
17 await asyncio.gather(*[call_url(session) for x in range(i)])
18 print(f'{i} call(s) in {time.time() - start} seconds')
AttributeError: __aexit__
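In my online research, I could find only one Medium article, by Simon Hawe, showing how to use httpx for parallel requests. See https://medium.com/swlh/how-to-boost-your-python-apps-using-httpx-and-asynchronous-calls-9cfe6f63d6ad
However, the example async code does not even use an async session object, so I was somewhat suspicious from the start. The code does not run in either a Python 3.7.7 environment or Jupyter. (The code is here: https://gist.githubusercontent.com/Shawe82/a218066975f4b325e026337806f8c781/raw/3cb492e971c13e76a07d1a1e77b48de94aa7229c/con)
It results in this error: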
Traceback (most recent call last):
  File ".\async_http_test.py", line 24, in <module>
    asyncio.run(download_all_photos('100_photos'))
  File "C:\Users\stborg\AppData\Local\Continuum\anaconda3\envs\fastai2\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Users\stborg\AppData\Local\Continuum\anaconda3\envs\fastai2\lib\asyncio\base_events.py", line 587, in run_until_complete
    return future.result()
  File ".\async_http_test.py", line 16, in download_all_photos
    resp = await httpx.get("https://jsonplaceholder.typicode.com/photos")
TypeError: object Response can't be used in 'await' expression
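This TypeError is what Python raises whenever await is applied to something that is not awaitable. A likely reading, assuming the installed httpx release exposes a synchronous top-level httpx.get (as current releases do), is that the call already returned a finished Response and there was nothing left to await. The sketch below reproduces the mechanism with the standard library only. Response, sync_get and async_get are hypothetical stand-ins, not httpx APIs.

```python
import asyncio


class Response:
    # Hypothetical stand-in for an HTTP response object.
    status_code = 200


def sync_get(url: str) -> Response:
    # A plain (synchronous) function: it returns the Response directly.
    return Response()


async def async_get(url: str) -> Response:
    # A coroutine function: calling it returns an awaitable.
    await asyncio.sleep(0)
    return Response()


async def demo() -> tuple:
    try:
        await sync_get("https://example.invalid")  # the result is not awaitable
        sync_outcome = "ok"
    except TypeError as exc:
        sync_outcome = f"TypeError: {exc}"
    response = await async_get("https://example.invalid")  # works
    return sync_outcome, response.status_code


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With httpx, the awaitable variant of a request comes from a client instance (for example await client.get(url) on an httpx.AsyncClient()), not from the module-level helper.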
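I am clearly doing something wrong, since httpx is built for async. I am just not sure what it is!
OK. Frankly, this is embarrassing. No workaround is needed. In the problem statement, I completely neglected to call the AsyncClient constructor... I cannot believe I missed it for so long. Oh my...
To fix it, just add the missing parentheses to the AsyncClient constructor call: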
async with httpx.AsyncClient() as session: #use httpx
    await asyncio.gather(*[call_url(session) for x in range(i)])
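The parentheses matter because async with looks up __aenter__ and __aexit__ on the type of the object it is given. Without the call, that object is the class itself, whose type does not define those methods. A minimal sketch of this, assuming a hypothetical DummyClient in place of httpx.AsyncClient so that it runs with the standard library alone:

```python
import asyncio


class DummyClient:
    # Hypothetical stand-in for an async client such as httpx.AsyncClient.
    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


async def use_class_object() -> str:
    try:
        async with DummyClient as session:  # missing parentheses: the class itself
            return "entered"
    except (AttributeError, TypeError) as exc:
        # Python 3.7 raises AttributeError: __aexit__; newer versions raise TypeError.
        return type(exc).__name__


async def use_instance() -> str:
    async with DummyClient() as session:  # an instance: the protocol is found
        return "entered"


if __name__ == "__main__":
    print(asyncio.run(use_class_object()))
    print(asyncio.run(use_instance()))
```

The exception type depends on the Python version, which is why the sketch catches both AttributeError and TypeError.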
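While working further on this question, I found what looked like a subtle difference between how httpx and aiohttp treat context managers.
In the code that introduces the question, the following works with aiohttp: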
async with aiohttp.ClientSession() as session: #use aiohttp
    await asyncio.gather(*[call_url(session) for x in range(i)])
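This code passes the ClientSession context as an argument to the call_url method. I assume that once asyncio.gather() completes, the resources are cleaned up as with a normal with statement.
With httpx, however, the same approach fails, as described above. It can be made to work by avoiding the with statement entirely and closing the AsyncClient manually.
In other words, replacing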
async with aiohttp.ClientSession() as session: #use aiohttp
    await asyncio.gather(*[call_url(session) for x in range(i)])
with
session = httpx.AsyncClient() #use httpx
await asyncio.gather(*[call_url(session) for x in range(i)])
await session.aclose()
solves the problem.
Here is the complete working code:
import aiohttp
import asyncio
import time
import httpx

async def call_url(session):
    url = "https://services.cancerimagingarchive.net/services/v3/TCIA/query/getCollectionValues"
    response = await session.request(method='GET', url=url)
    return response

for i in range(1,5):
    start = time.time() # start time for timing event
    #async with aiohttp.ClientSession() as session: #use aiohttp
    session = httpx.AsyncClient() #use httpx
    await asyncio.gather(*[call_url(session) for x in range(i)])
    await session.aclose()
    print(f'{i} call(s) in {time.time() - start} seconds')
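One caveat with the manual pattern: if any request raises, the line that closes the client is skipped. Below is a minimal sketch of two ways to guard against that. It assumes a hypothetical DummyClient that mimics only the request and aclose methods used above, so it runs without httpx or a network connection.

```python
import asyncio


class DummyClient:
    # Hypothetical stand-in for httpx.AsyncClient; records whether it was closed.
    def __init__(self):
        self.closed = False

    async def request(self, method: str, url: str) -> str:
        await asyncio.sleep(0)
        if "fail" in url:
            raise RuntimeError(f"{method} {url} failed")
        return f"{method} {url}"

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.aclose()


async def manual_close(client: DummyClient, urls):
    # Manual pattern, with try/finally so the client is closed even if gather raises.
    try:
        return await asyncio.gather(*(client.request("GET", u) for u in urls))
    finally:
        await client.aclose()


async def with_block(client: DummyClient, urls):
    # Equivalent pattern: __aexit__ closes the client on both success and error.
    async with client:
        return await asyncio.gather(*(client.request("GET", u) for u in urls))


if __name__ == "__main__":
    client = DummyClient()
    print(asyncio.run(manual_close(client, ["a", "b"])), client.closed)
```

Since the parenthesized async with httpx.AsyncClient() form works, the with block is probably the simpler choice. The try/finally form is only needed when the client has to outlive a single block.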