Objective:
I am trying to scrape multiple URLs concurrently. I don't want to make too many simultaneous requests, so I am using this solution to limit them.
Problem:
Requests are fired for all of the tasks at once, instead of for only a limited number of tasks at a time.
Simplified code:
import asyncio

async def download_all_product_information():
    # Limit the number of concurrent requests
    async def gather_with_concurrency(n, *tasks):
        semaphore = asyncio.Semaphore(n)

        async def sem_task(task):
            async with semaphore:
                return await task

        return await asyncio.gather(*(sem_task(task) for task in tasks))

    # Coroutine that actually downloads the info for one product
    async def get_product_information(url_to_append):
        url = 'https://www.amazon.com.br' + url_to_append
        # current_page_number, category_index and all_categories come from
        # the surrounding code, which is omitted here
        print('Product Information - Page ' + str(current_page_number) + ' for category '
              + str(category_index) + '/' + str(len(all_categories)) + ' in …')
        # … rest of the function omitted
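For reference, here is a minimal, self-contained sketch of the same semaphore pattern that runs as-is; fake_fetch, main and the URL list are hypothetical stand-ins for the real download code. The key property of the pattern is that the items passed in must be plain coroutine objects, which do not start running until awaited; anything already scheduled (e.g. via asyncio.create_task) starts immediately, regardless of the semaphore.

import asyncio

async def gather_with_concurrency(n, *coros):
    # Allow at most n of the wrapped coroutines to run at once
    semaphore = asyncio.Semaphore(n)

    async def sem_coro(coro):
        async with semaphore:
            return await coro

    return await asyncio.gather(*(sem_coro(c) for c in coros))

async def fake_fetch(url):
    # Hypothetical stand-in for one real request
    print('start', url)
    await asyncio.sleep(1)
    print('done ', url)
    return url

async def main():
    urls = ['/product/%d' % i for i in range(10)]
    # The coroutine objects are created here but only begin executing
    # once sem_coro acquires the semaphore, so at most 3 run at a time
    results = await gather_with_concurrency(3, *(fake_fetch(u) for u in urls))
    print(results)

asyncio.run(main())

Running this prints the start/done messages in batches of three, one second apart, which is the behavior the question is aiming for.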