Headless Chrome WebDriver problem when printing a web page

Question

Tki*_*ver 1 python selenium python-3.x selenium-webdriver

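I have a program that goes to Google and saves the page as a PDF (in the same directory as the Python file). That works great, but I don't want a Chrome window to open. From searching on Google, I found that I can use options.headless = True. But once I put that in my code, the page is no longer printed. How can I fix this? Here is the code: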

from selenium import webdriver
import json
import os

options = webdriver.ChromeOptions()
options.headless = False  # Setting this to True won't make the page printing work

options.add_argument("--kiosk-printing")

settings = {
    "recentDestinations": [{
        "id": "Save as PDF",
        "origin": "local",
        "account": ""
    }],
    "selectedDestinationId": "Save as PDF",
    "version": 2,

}

prefs = {'printing.print_preview_sticky_settings.appState': json.dumps(settings),
         "savefile.default_directory": str(os.path.realpath('.')),
         }

options.add_experimental_option('prefs', prefs)

driver = webdriver.Chrome(executable_path="chromedriver.exe", options=options)

driver.get("https://google.com")
driver.execute_script("window.print();")

Thanks for your help!

~ Hello World

Answer 1

PDH*_*ide 6

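Note: headless browsers do not support preferences

As of March 2021:

https://bugs.chromium.org/p/chromedriver/issues/detail?id=1925

Headless Chrome does not support preference settings, so the "prefs" experimental option (including the print preview settings) has no effect there.

What you can do instead is: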

import subprocess

mycmd = r'"C:\Program Files\Google\Chrome\Application\chrome.exe" --headless "https://www.google.com" --print-to-pdf="C:\Users\prave\Downloads\travelBA\test\delete\a.pdf"'
subprocess.run(mycmd)

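This saves a PDF of the Google page as a.pdf at the specified path. But it is a one-off operation: each run starts a new Chrome process, loads a single URL, and prints it, with no chance to interact with the page first.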
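
If several pages need to be printed this way, the command can be built as an argument list and run in a loop. The following is only a minimal sketch: the helper names build_print_command and print_pages are hypothetical, and the Chrome path is an assumption that has to match the local installation. Only the --headless and --print-to-pdf flags from the command above are used.

```python
import subprocess


def build_print_command(chrome_path, url, pdf_path):
    """Return the argument list for one headless print-to-PDF run.

    Passing a list to subprocess.run avoids manual quoting of paths
    that contain spaces.
    """
    return [
        str(chrome_path),
        "--headless",
        f"--print-to-pdf={pdf_path}",
        url,
    ]


def print_pages(chrome_path, jobs):
    """Start one headless Chrome process per (url, pdf_path) pair."""
    for url, pdf_path in jobs:
        # check=True raises CalledProcessError if Chrome exits with an error
        subprocess.run(build_print_command(chrome_path, url, pdf_path), check=True)


# Example call (assumed install path, adjust as needed):
# print_pages(
#     r"C:\Program Files\Google\Chrome\Application\chrome.exe",
#     [("https://www.google.com", "google.pdf")],
# )
```

Each job still launches a separate browser process, so this stays a convenience wrapper around the one-off command rather than a replacement for a WebDriver session.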

Recommended approach:

Use the Chrome DevTools Protocol (PDF creation only works in headless mode):

from base64 import b64decode

from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)
driver.get("https://emicalculator.net/")

# Interact with the page before printing (here: click the loan amount slider)
slider = driver.find_element(By.CSS_SELECTOR, "#loanamountslider")
webdriver.ActionChains(driver).click(
    slider).click_and_hold().move_by_offset(0, 0).perform()

# Page.printToPDF returns the PDF as a Base64 string under the "data" key.
# The protocol documents paperWidth/paperHeight in inches (A4 here),
# not the "path"/"format" options, so the file is written manually below.
result = driver.execute_cdp_cmd(
    "Page.printToPDF", {"paperWidth": 8.27, "paperHeight": 11.69})
driver.quit()

# Decode the Base64 string, making sure that it contains only valid characters
pdf_bytes = b64decode(result['data'], validate=True)

# Perform a basic validation to make sure that the result is a valid PDF file
# Be aware! The magic number (file signature) is not 100% reliable solution to validate PDF files
# Moreover, if you get Base64 from an untrusted source, you must sanitize the PDF contents
if pdf_bytes[0:4] != b'%PDF':
    raise ValueError('Missing the PDF file signature')

# Write the PDF contents to a local file
with open('file.pdf', 'wb') as f:
    f.write(pdf_bytes)

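
The decode, validate, and write steps can be pulled into small helpers so they are reusable and can be checked without starting a browser. This is a sketch only: decode_pdf and save_pdf are hypothetical names, and the sample Base64 string at the end is made-up data that merely starts with the PDF signature.

```python
from base64 import b64decode, b64encode


def decode_pdf(b64_data):
    """Decode a Base64 string and check for the PDF file signature."""
    pdf_bytes = b64decode(b64_data, validate=True)
    # The signature check is a basic sanity test, not full validation
    if not pdf_bytes.startswith(b"%PDF"):
        raise ValueError("Missing the PDF file signature")
    return pdf_bytes


def save_pdf(b64_data, path):
    """Decode the Base64 PDF data and write it to path; return the byte count."""
    pdf_bytes = decode_pdf(b64_data)
    with open(path, "wb") as f:
        f.write(pdf_bytes)
    return len(pdf_bytes)


# Stand-in for the "data" value returned by Page.printToPDF (made-up sample)
sample = b64encode(b"%PDF-1.4 demo").decode("ascii")
print(len(decode_pdf(sample)))
```

With a live session, the call would presumably look like save_pdf(driver.execute_cdp_cmd("Page.printToPDF", {})["data"], "file.pdf"). Selenium 4 also offers driver.print_page(), which returns a Base64 string that the same helper should be able to decode.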

Archived:	4 years, 11 months ago
Viewed:	2020 times
Last active:	4 years, 6 months ago