How can I speed up a slow headless browser that renders HTML and JavaScript to a screenshot image in Python?

Far*_*zan 1 javascript python selenium python-3.x google-chrome-headless

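I have built an API with Flask that uses headless Chrome, driven by Selenium in Python, to render a given address (a simple HTML page with some JavaScript) and take a snapshot of the rendered page. After deploying to a server, requests take too long, because the headless browser has to be launched for every request, which is slow.

Is there a faster way to use a headless browser, or an alternative that can take the requested HTML and JavaScript and render it like a browser from Python, in order to get a screenshot?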

from io import BytesIO

from PIL import Image
from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def create_screenshot(id):
    options = Options()
    options.add_argument('--headless')
    options.add_argument('--disable-gpu')  # Last I checked this was necessary.
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--no-sandbox')
    options.add_argument("--window-size=1920,1920")

    # A new browser process is started on every call; this is the slow part.
    driver = webdriver.Chrome('./chromedriver', chrome_options=options,
                              service_args=['--verbose', '--log-path=/tmp/chromedriver.log'])
    driver.get("http://127.0.0.1:1234/snippet/{0}".format(id))

    driver.maximize_window()

    # Bounding box of the element to capture, in page pixels.
    element = driver.find_element_by_id("snapArea")
    location = element.location
    size = element.size
    x = location['x']
    y = location['y']
    width = location['x'] + size['width']    # right edge of the crop box
    height = location['y'] + size['height']  # bottom edge of the crop box
    png = driver.get_screenshot_as_png()

    # Crop the full-page screenshot down to the element.
    im = Image.open(BytesIO(png))
    im = im.crop((int(x), int(y), int(width), int(height)))

    # create_folder is the asker's own helper (not shown in the post).
    path = create_folder(id)
    im.save(path)
    return path

小智 5

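Instead of initializing the driver every time, why not create it once up front in a class (i.e., as an attribute named "self.driver") and then call it whenever you need it? Something like this: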

class DriverContainer:
    def __init__(self):
        options = Options()

        options.add_argument('--headless')
        options.add_argument('--disable-gpu')
        options.add_argument('--ignore-certificate-errors')
        options.add_argument('--no-sandbox')
        options.add_argument("--window-size=1920,1920")

        self.driver = webdriver.Chrome('./chromedriver', chrome_options=options,
                        service_args=['--verbose', '--log-path=/tmp/chromedriver.log'])

    def take_screenshot(self, id):
        self.driver.get("http://127.0.0.1:1234/snippet/{0}".format(id))

        self.driver.maximize_window()
        element = self.driver.find_element_by_id("snapArea")
        location = element.location
        size = element.size
        x = location['x']
        y = location['y']
        width = location['x']+size['width']
        height = location['y']+size['height']
        png = self.driver.get_screenshot_as_png()

        im = Image.open(BytesIO(png))
        im = im.crop((int(x), int(y), int(width), int(height)))

        path = create_folder(id)
        im.save(path)
        return path

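Then you instantiate it once with driver_container = DriverContainer() and call driver_container.take_screenshot(id) whenever you need a screenshot. This way the slow browser startup is skipped on every request after the first, and in my experience startup is the slowest part of Selenium.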
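
One thing to watch when reusing a single driver inside a Flask app: the server may handle requests on several threads, and a single WebDriver session is generally not meant to be used from two threads at once. Below is a minimal sketch of one way to guard the shared driver with a lock and start it lazily. The names SharedDriver, factory, run and FakeDriver are hypothetical and not part of the original answer; FakeDriver is only a stand-in so the sketch runs without a browser.

```python
import threading


class SharedDriver:
    """Create one driver lazily and let only one caller use it at a time."""

    def __init__(self, factory):
        # factory: hypothetical zero-argument callable that builds the driver,
        # e.g. a function wrapping the webdriver.Chrome(...) call from the answer.
        self._factory = factory
        self._driver = None
        self._lock = threading.Lock()

    def run(self, action):
        """Call action(driver) while holding the lock; start the driver on first use."""
        with self._lock:
            if self._driver is None:
                self._driver = self._factory()
            return action(self._driver)

    def close(self):
        """Shut the browser down; the next run() call starts a fresh one."""
        with self._lock:
            if self._driver is not None:
                self._driver.quit()
                self._driver = None


class FakeDriver:
    """Stand-in for a real browser driver (assumption, for demonstration only)."""

    started = 0  # counts how many times a "browser" was launched

    def __init__(self):
        FakeDriver.started += 1

    def get_screenshot_as_png(self):
        return b"png-bytes"

    def quit(self):
        pass


shared = SharedDriver(FakeDriver)
first = shared.run(lambda d: d.get_screenshot_as_png())
second = shared.run(lambda d: d.get_screenshot_as_png())
print(FakeDriver.started)
```

In the real application, the factory would build the Chrome driver with the options shown above, and the action would contain the body of take_screenshot. The lock serializes screenshots, so throughput is limited to one page at a time; if that becomes the bottleneck, a small pool of such objects is a possible next step.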