I want to download/scrape 50 million log records from a website. Rather than fetching all 50 million at once, I tried the code below to download them in chunks of 10 million, but it can only handle about 20,000 at a time (anything more raises an error), so downloading that much data is extremely slow. Right now 20,000 records take 3-4 minutes:

100%|██████████| 20000/20000 [03:48<00:00, 87.41it/s]

How can I speed this up?
import asyncio
import aiohttp
import time
import tqdm
import nest_asyncio

nest_asyncio.apply()


async def make_numbers(numbers, _numbers):
    for i in range(numbers, _numbers):
        yield i


n = 0
q = 10000000


async def do_get(session, url, x):
    # fetch a single record by id (definition assumed; not shown in the original snippet)
    async with session.get(url + str(x)) as response:
        return await response.text()


async def fetch():
    # example
    url = "https://httpbin.org/anything/log?id="

    async with aiohttp.ClientSession() as session:
        post_tasks = []
        # prepare the coroutines that post
        async for x in make_numbers(n, q):
            post_tasks.append(do_get(session, url, x))
        # now execute them all at once
        responses = [await f for f in tqdm.tqdm(asyncio.as_completed(post_tasks),
                                                total=len(post_tasks))]


asyncio.run(fetch())
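For reference, the usual way to avoid the error above 20,000 (likely connection or file-descriptor exhaustion) is to cap how many requests are in flight at once rather than capping the total batch size. Below is a minimal sketch of that idea using `asyncio.Semaphore`; `CONCURRENCY`, `fetch_one`, and `fetch_range` are illustrative names, and the `asyncio.sleep(0)` stands in for the real `session.get` call so the sketch is self-contained:

```python
import asyncio

CONCURRENCY = 1000  # tune to what the server and your OS tolerate


async def fetch_one(sem: asyncio.Semaphore, i: int) -> int:
    # the semaphore ensures at most CONCURRENCY coroutines pass this
    # point at the same time; the rest wait here
    async with sem:
        # real code would do: async with session.get(url + str(i)) ...
        await asyncio.sleep(0)  # placeholder for the network round-trip
        return i


async def fetch_range(start: int, stop: int) -> list:
    sem = asyncio.Semaphore(CONCURRENCY)
    tasks = [asyncio.create_task(fetch_one(sem, i)) for i in range(start, stop)]
    # gather preserves the input order of the tasks
    return await asyncio.gather(*tasks)


results = asyncio.run(fetch_range(0, 5000))
print(len(results))
```

With aiohttp specifically, the same cap can be set natively via the connector, e.g. `aiohttp.ClientSession(connector=aiohttp.TCPConnector(limit=CONCURRENCY))`, which limits the number of simultaneous TCP connections without a manual semaphore.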