Python asyncio and httpx

ukh*_*han 3 python asynchronous python-asyncio httpx

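I'm very new to asynchronous programming and I'm trying out httpx. I have the following code, and I'm sure I'm doing something wrong; I just don't know what. There are two methods, one synchronous and one asynchronous. Both pull from Google Finance. On my system I see the following timings (in seconds):

Async: 5.015218734741211
Sync: 5.173618316650391

Here is the code: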


import httpx
import asyncio
import time



#
#--------------------------------------------------------------------
#
#--------------------------------------------------------------------
#
def sync_pull(url):
  r = httpx.get(url)
  print(r.status_code)


#
#--------------------------------------------------------------------
#
#--------------------------------------------------------------------
#
async def async_pull(url):
  async with httpx.AsyncClient() as client:
    r = await client.get(url)
    print(r.status_code)


#
#--------------------------------------------------------------------
#
#--------------------------------------------------------------------
#
if __name__ == "__main__":

  goog_fin_nyse_url = 'https://www.google.com/finance/quote/'
  tickers = ['F', 'TWTR', 'CVX', 'VZ', 'GME', 'GM', 'PG', 'AAL', 
             'MARK', 'AAP', 'THO', 'NGD', 'ZSAN', 'SEAC',
             ]  

  print("Running asynchronously...")
  async_start = time.time()
  for ticker in tickers:
    url = goog_fin_nyse_url + ticker + ':NYSE'
    asyncio.run(async_pull(url))
  async_end = time.time()
  print(f"Time lapsed is: {async_end - async_start}")


  print("Running synchronously...")
  sync_start = time.time()
  for ticker in tickers:
    url = goog_fin_nyse_url + ticker + ':NYSE'
    sync_pull(url)
  sync_end = time.time()
  print(f"Time lapsed is: {sync_end - sync_start}")

I had hoped the asynchronous approach would take only a fraction of the time the synchronous one takes. What am I doing wrong?

Mat*_*ler 6

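When you call asyncio.run(async_pull(url)), you are saying "run async_pull and wait for the result to come back". Because you do this once per ticker inside your loop, you are essentially using asyncio to run things synchronously, so you won't see any performance benefit.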
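To see this effect without touching the network, here is a minimal sketch that swaps the HTTP request for asyncio.sleep. The fake_pull and run_one_at_a_time helpers and the 0.1-second delay are made up for illustration; they are not part of the original code.

```python
import asyncio
import time


async def fake_pull(delay):
    # Hypothetical stand-in for an HTTP request: just waits `delay` seconds.
    await asyncio.sleep(delay)
    return delay


def run_one_at_a_time(delays):
    # Each asyncio.run call starts an event loop, runs ONE coroutine to
    # completion, and closes the loop before the next iteration begins.
    start = time.perf_counter()
    for delay in delays:
        asyncio.run(fake_pull(delay))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_one_at_a_time([0.1] * 5)
    # The waits add up, so the total is roughly the sum of all delays.
    print(f"Sequential asyncio.run calls took {elapsed:.2f}s")
```

Nothing ever overlaps here: while one coroutine is waiting, no other coroutine has been scheduled on the loop, so the total time is about the same as a plain synchronous loop.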

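What you need to do is create several async calls and run them concurrently. There are several ways to do this; the easiest is asyncio.gather (see https://docs.python.org/3/library/asyncio-task.html#asyncio.gather), which takes a number of coroutines and runs them concurrently. Adapting your code is fairly straightforward: you write an async function that takes a list of URLs and calls async_pull on each of them, then passes the resulting coroutines to asyncio.gather and awaits the results. The adapted code looks like this: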

import httpx
import asyncio
import time

def sync_pull(url):
    r = httpx.get(url)
    print(r.status_code)

async def async_pull(url):
    async with httpx.AsyncClient() as client:
        r = await client.get(url)
        print(r.status_code)


async def async_pull_all(urls):
    return await asyncio.gather(*[async_pull(url) for url in urls])

if __name__ == "__main__":

    goog_fin_nyse_url = 'https://www.google.com/finance/quote/'
    tickers = ['F', 'TWTR', 'CVX', 'VZ', 'GME', 'GM', 'PG', 'AAL',
               'MARK', 'AAP', 'THO', 'NGD', 'ZSAN', 'SEAC',
               ]

    print("Running asynchronously...")
    async_start = time.time()
    results = asyncio.run(async_pull_all([goog_fin_nyse_url + ticker + ':NYSE' for ticker in tickers]))
    async_end = time.time()
    print(f"Time lapsed is: {async_end - async_start}")


    print("Running synchronously...")
    sync_start = time.time()
    for ticker in tickers:
        url = goog_fin_nyse_url + ticker + ':NYSE'
        sync_pull(url)
    sync_end = time.time()
    print(f"Time lapsed is: {sync_end - sync_start}")

Run this way, the asynchronous version takes about one second for me, while the synchronous one takes about seven.
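The same overlap can be reproduced offline with a small sketch, again using asyncio.sleep in place of the request. The names fake_pull, fake_pull_all, and timed_gather, along with the job labels and delays, are hypothetical. It also shows that asyncio.gather hands back the return values in the order the coroutines were passed in, which could be used to return r.status_code from async_pull instead of printing it.

```python
import asyncio
import time


async def fake_pull(name, delay):
    # Hypothetical stand-in for an HTTP request: waits, then returns a label.
    await asyncio.sleep(delay)
    return name


async def fake_pull_all(jobs):
    # gather schedules every coroutine on the same event loop, so the waits
    # overlap. Results are returned in the order the coroutines were passed in,
    # not in the order they finish.
    return await asyncio.gather(*(fake_pull(name, delay) for name, delay in jobs))


def timed_gather(jobs):
    start = time.perf_counter()
    results = asyncio.run(fake_pull_all(jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [("slow", 0.3), ("fast", 0.1), ("medium", 0.2)]
    results, elapsed = timed_gather(jobs)
    print(results)
    # Total time is close to the longest single delay, not the sum of delays.
    print(f"Concurrent gather took {elapsed:.2f}s")
```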