no *_*ein 11 python selenium webdriver selenium-webdriver pageloadstrategy
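How can I make Selenium click on elements and scrape data before the page has fully loaded? My internet connection is terrible, so a page sometimes takes forever to load. Is there any way around this?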
Deb*_*anB 15
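Since you asked about clicking on elements and scraping data before the page has fully loaded, the pageLoadStrategy capability can help here. When Selenium loads a page/URL, it follows a default configuration with pageLoadStrategy set to normal. Selenium can start executing the next line of code at different document readiness states. Currently Selenium supports 3 document readiness states, which can be configured through pageLoadStrategy as follows:
none (undefined)
eager (page becomes interactive)
normal (complete page load)
Here is the code block to configure pageLoadStrategy: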
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
binary = r'C:\Program Files\Mozilla Firefox\firefox.exe'
caps = DesiredCapabilities().FIREFOX
# caps["pageLoadStrategy"] = "normal" # complete
caps["pageLoadStrategy"] = "eager" # interactive
# caps["pageLoadStrategy"] = "none" # undefined
driver = webdriver.Firefox(capabilities=caps, firefox_binary=binary, executable_path="C:\\Utility\\BrowserDrivers\\geckodriver.exe")
driver.get("https://google.com")
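The mapping between the three strategies and their document readiness states can also be written down as a small lookup, which makes it easy to reject a mistyped strategy before a browser is started. This is only a sketch: the names READY_STATE_FOR_STRATEGY, page_load_caps and make_firefox_driver are hypothetical helpers, not part of Selenium. The second function assumes Selenium 4 with geckodriver available, where the strategy is set on an options object instead of a capabilities dictionary.

```python
# Hypothetical helper names; only the "pageLoadStrategy" key and its three
# values (normal, eager, none) come from the answer above.
READY_STATE_FOR_STRATEGY = {
    "normal": "complete",     # wait for the full page load
    "eager": "interactive",   # wait until the page becomes interactive
    "none": None,             # do not wait at all (undefined state)
}


def page_load_caps(strategy="normal"):
    """Return a capabilities fragment for the given page load strategy."""
    if strategy not in READY_STATE_FOR_STRATEGY:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}")
    return {"pageLoadStrategy": strategy}


def make_firefox_driver(strategy="eager"):
    """Start Firefox with the given strategy (assumes Selenium 4 is installed)."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    page_load_caps(strategy)  # validate the name first
    options = Options()
    options.page_load_strategy = strategy
    return webdriver.Firefox(options=options)
```

With this in place, driver = make_firefox_driver("eager") followed by driver.get(...) should return as soon as the page becomes interactive.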
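For ChromeDriver it works the same way as in @DebanjanB's answer, except that the "eager" page load strategy was not yet supported at the time of writing (later ChromeDriver releases added it).
So for ChromeDriver you get: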
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities().CHROME
# caps["pageLoadStrategy"] = "normal" # Waits for full page load
caps["pageLoadStrategy"] = "none" # Do not wait for full page load
driver = webdriver.Chrome(desired_capabilities=caps, executable_path="path/to/chromedriver.exe")
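Note that when using the "none" strategy, you will most likely have to implement your own wait to check whether the element you need has loaded.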
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
WebDriverWait(driver, timeout=10).until(
    ec.visibility_of_element_located((By.ID, "your_element_id"))
)
Now you can start interacting with your element before the page has fully loaded!
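If you would rather gate on the document state itself than on a single element, one possible approach with the "none" strategy is to poll document.readyState until it reaches the state that "eager" would have waited for. The sketch below is an assumption about how such a wait could look: wait_for_ready_state is a hypothetical helper, and it relies only on the driver's execute_script method.

```python
import time


def wait_for_ready_state(driver, accepted=("interactive", "complete"),
                         timeout=10.0, poll=0.25):
    """Poll document.readyState until it is in `accepted` or `timeout` elapses.

    Hypothetical helper: `driver` is any object with an execute_script method,
    such as a Selenium WebDriver instance.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state in accepted:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"document.readyState still {state!r} after {timeout}s"
            )
        time.sleep(poll)
```

Call it right after driver.get(...), for example wait_for_ready_state(driver, timeout=30) on a slow connection, and then locate your elements as usual.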