Using the Python library Beautiful Soup with aiohttp

APi*_*ist 3 python aiohttp

Does anyone know how to do this:

import html5lib
import urllib.request
from bs4 import BeautifulSoup

soup = BeautifulSoup(urllib.request.urlopen('http://someWebSite.com').read().decode('utf-8'), 'html5lib')

using aiohttp instead of urllib?

Thanks ^^

Yuv*_*uss 10

You can do something like this:

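import asyncio
import aiohttp
import html5lib  # not referenced directly; the import just confirms the parser is installed
from bs4 import BeautifulSoup

SELECTED_URL = 'http://someWebSite.com'

async def get_site_content():
    # Open a session and fetch the page without blocking the event loop
    async with aiohttp.ClientSession() as session:
        async with session.get(SELECTED_URL) as resp:
            body = await resp.read()  # raw response bytes

    # Decode and parse once the connection has been released
    return BeautifulSoup(body.decode('utf-8'), 'html5lib')

# asyncio.run creates the event loop, runs the coroutine, and closes the loop
sites_soup = asyncio.run(get_site_content())
print(sites_soup)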


Dea*_*hik 8

Just for anyone looking for more answers: there is another way to run synchronous code from within the event loop, loop.run_in_executor

More documentation: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor

Example code:

import asyncio
import time

def blocking_func():
    # Stands in for any synchronous, blocking call
    time.sleep(5)
    return 42

async def main():
    loop = asyncio.get_running_loop()
    # Passing None as the executor selects the default thread pool
    result = await loop.run_in_executor(None, blocking_func)
    return result

loop_result = asyncio.run(main())
print(loop_result) # => 42

So you can await your blocking task the same way you would await a coroutine.
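
Because run_in_executor returns an awaitable, several blocking calls can also be awaited together. The following is only a rough sketch of that idea, not part of the original answer: the names run_all and timed_run are made up for illustration, the short sleep stands in for a real blocking call such as urllib.request.urlopen, and it assumes the default thread pool has at least three free workers.

```python
import asyncio
import time

def blocking_func(value, delay=0.2):
    # Hypothetical stand-in for a synchronous call (e.g. a urllib request)
    time.sleep(delay)
    return value

async def run_all(values):
    loop = asyncio.get_running_loop()
    # Submit every call to the default thread pool (executor=None)
    futures = [loop.run_in_executor(None, blocking_func, v) for v in values]
    # gather returns results in the same order as the awaitables passed in
    return await asyncio.gather(*futures)

def timed_run(values):
    # Helper for illustration: run the batch and measure wall-clock time
    start = time.perf_counter()
    results = asyncio.run(run_all(values))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = timed_run([1, 2, 3])
    print(results, f"{elapsed:.2f}s")
```

Since the calls run in separate threads, the total time should be close to one delay rather than the sum of all three. This helps with blocking I/O; CPU-heavy work is still limited by the GIL in a thread pool.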