Hau*_*ter 4 python selenium screen-scraping python-3.x webdriverwait
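I am trying to build a proxy scraper for a specific site, but I can't get to the next page.
This is the code I'm using.
If you answer my question, please explain what you used, and if you can, point me to some good tutorials on this kind of code: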
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
options = Options()
#options.headless = True #for headless
#options.add_argument('--disable-gpu') #for headless and os win
driver = webdriver.Chrome(options=options)
driver.get("https://hidemyna.me/en/proxy-list/")
time.sleep(10) #bypass cloudflare
tbody = driver.find_element_by_tag_name("tbody")
cell = tbody.find_elements_by_tag_name("tr")
for column in cell:
    column = column.text.split(" ")
    print(column[0] + ":" + column[1])  # ip and port
nxt = driver.find_element_by_class_name('arrow_right')
nxt.click()
To move to the next page, you can try the following solution:
Code block:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, WebDriverException

options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
# Selenium 4 style: the driver path is passed through Service, and options= replaces chrome_options=
driver = webdriver.Chrome(service=Service(r'C:\Utility\BrowserDrivers\chromedriver.exe'), options=options)
driver.get('https://hidemyna.me/en/proxy-list/')
while True:
    try:
        # Wait up to 20 seconds for the "next" arrow to become clickable, then scroll it into view
        nxt = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//li[@class='arrow__right']/a")))
        driver.execute_script("arguments[0].scrollIntoView(true);", nxt)
        nxt.click()
        print("Navigating to Next Page")
    except (TimeoutException, WebDriverException):
        # No clickable arrow within the timeout: treat this as the last page
        print("Last page reached")
        break
driver.quit()
Console output:
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
.
.
.
Navigating to Next Page
Last page reached
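The loop above only navigates; the question also prints `ip:port` for each row. As a rough sketch of how the two could be combined, the block below reads the rows on each page before clicking the arrow. The helper names `format_proxy` and `scrape_all_pages` are hypothetical, the `tbody tr` selector is an assumption based on the question's code, and the `arrow__right` locator is taken from the answer; the site's markup may have changed since. `WebDriverWait` polls the page until the given condition holds or the timeout expires, which is why it can stand in for a fixed `time.sleep`.

```python
def format_proxy(row_text):
    """Turn the visible text of one table row into 'ip:port', or None if the row is too short."""
    fields = row_text.split()
    if len(fields) < 2:
        return None
    return f"{fields[0]}:{fields[1]}"


def scrape_all_pages(driver, timeout=20):
    """Collect 'ip:port' strings page by page until the next arrow stops being clickable."""
    # Imported here so format_proxy can be used without Selenium installed.
    from selenium.common.exceptions import TimeoutException, WebDriverException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    next_xpath = "//li[@class='arrow__right']/a"  # locator from the answer (assumed still valid)
    proxies = []
    while True:
        # Wait until at least one table row is present (selector is an assumption)
        rows = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, "tbody tr")))
        proxies.extend(p for p in (format_proxy(r.text) for r in rows) if p)
        try:
            nxt = WebDriverWait(driver, timeout).until(
                EC.element_to_be_clickable((By.XPATH, next_xpath)))
            driver.execute_script("arguments[0].scrollIntoView(true);", nxt)
            nxt.click()
            # Wait for the old rows to be replaced before reading the next page
            WebDriverWait(driver, timeout).until(EC.staleness_of(rows[0]))
        except (TimeoutException, WebDriverException):
            break  # no next arrow, or the page did not change: stop here
    return proxies
```

With a driver created as in the answer, `scrape_all_pages(driver)` would return the collected list, which can then be printed or written to a file before calling `driver.quit()`.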