Posts by Abd*_*mac

How do I open each option of a select tag (drop-down list) in a separate tab/window?

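I'm trying to scrape this website with Python and Selenium. It requires you to pick a date from a drop-down box and then click Search to see the planning applications.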

URL: https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList

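My code already selects an option in the drop-down box by index and then presses Search. How can I open a separate window for every date option in the drop-down, or step through the options one at a time, so that I can scrape each one?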

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options


options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('/Users/weaabduljamac/Downloads/chromedriver',
                          chrome_options=options)

url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
driver.get(url)

select = Select(driver.find_element_by_xpath('//*[@id="selWeek"]'))
select.select_by_index(1)

button = driver.find_element_by_id('csbtnSearch')
button.click()

app_numbers = driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text
print(app_numbers)

Drop-down HTML:

<select class="formitem" id="selWeek" name="selWeek">
   <option selected="selected" value="2018,31">Week commencing Monday 30 July 2018</option>
   <option value="2018,30">Week commencing Monday 23 July 2018</option>
   <option value="2018,29">Week commencing Monday 16 July 2018</option>
   <option value="2018,28">Week commencing Monday 9 July 2018</option>
   <option value="2018,27">Week …

python selenium web-scraping selenium-webdriver drop-down-menu

Votes: 7
Answers: 1
Views: 175
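
One way to step through the options one at a time, rather than opening a window per date, is to read every option value first and then repeat the select-and-search step for each value. The sketch below is a minimal illustration, not a tested solution for this site: `week_values` and `scrape_all_weeks` are hypothetical helper names, the `tr/td[1]/a` XPath is an assumed generalization of the single-row locator in the question, and it assumes the results have loaded by the time `click()` returns.

```python
from html.parser import HTMLParser


class _OptionCollector(HTMLParser):
    """Collect the value attribute of every <option> inside one <select>."""

    def __init__(self, select_id):
        super().__init__()
        self.select_id = select_id
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("id") == self.select_id:
            self.inside = True
        elif tag == "option" and self.inside and "value" in attrs:
            self.values.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def week_values(page_source, select_id="selWeek"):
    """Return the option values of the drop-down, in document order."""
    parser = _OptionCollector(select_id)
    parser.feed(page_source)
    return parser.values


def scrape_all_weeks(driver, url):
    """Run the search once per drop-down option (hypothetical helper)."""
    # Imported here so week_values() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import Select

    driver.get(url)
    results = {}
    for value in week_values(driver.page_source):
        driver.get(url)  # reload so the form is fresh for every week
        Select(driver.find_element(By.ID, "selWeek")).select_by_value(value)
        driver.find_element(By.ID, "csbtnSearch").click()
        # Assumption: every result row has its link in the first cell.
        links = driver.find_elements(
            By.XPATH, '//*[@id="form1"]/table/tbody/tr/td[1]/a'
        )
        results[value] = [link.text for link in links]
    return results
```

Reloading the page for every value means the `selWeek` element is looked up again each time, which avoids stale-element errors after a search. If the results load slowly, an explicit `WebDriverWait` before reading the links may be needed.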

Navigating pagination with Selenium in Python

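I'm scraping this website with Python and Selenium. My code works, but it currently scrapes only the first page. I want to iterate over all the pages and scrape each of them, but the site handles pagination in an odd way. How can I move through the pages and scrape them one by one?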

Pagination HTML:

<div class="pagination">
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,1" title="Go to first page">First</a>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,1" title="Go to previous page">Prev</a>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,1" title="Go to page 1">1</a>
    <span class="current">2</span>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,3" title="Go to page 3">3</a>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,4" title="Go to page 4">4</a>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,3" title="Go to next page">Next</a>
    <a href="/PlanningGIS/LLPG/WeeklyList/41826123,4" title="Go to last page">Last</a>
</div>

Scraper:

import re
import json
import requests
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options

options = Options()
# options.add_argument('--headless')
options.add_argument("start-maximized")
options.add_argument('disable-infobars') …

python selenium web-scraping selenium-webdriver

Votes: 2
Answers: 1
Views: 10k
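
The pagination markup suggests that the number after the comma in each href is the page index and that the "Go to last page" link carries the highest one. Assuming that holds, and that these URLs can be loaded directly within the same browser session, every page URL can be generated up front and visited in turn. This is only a sketch: `page_urls`, `scrape_all_pages`, and the `scrape_page` callback are hypothetical names used for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LastPageFinder(HTMLParser):
    """Remember the href of the link titled 'Go to last page'."""

    def __init__(self):
        super().__init__()
        self.last_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("title") == "Go to last page":
            self.last_href = attrs.get("href")


def page_urls(page_source, current_url):
    """Build absolute URLs for pages 1..N from the 'Last' link.

    Assumption: hrefs look like '<prefix>,<page number>'. If no such link
    is found, only the current URL is returned.
    """
    finder = _LastPageFinder()
    finder.feed(page_source)
    if finder.last_href is None:
        return [current_url]
    match = re.fullmatch(r"(.*),(\d+)", finder.last_href)
    if match is None:
        return [current_url]
    prefix, last_page = match.group(1), int(match.group(2))
    return [urljoin(current_url, f"{prefix},{n}") for n in range(1, last_page + 1)]


def scrape_all_pages(driver, scrape_page):
    """Visit every results page and collect what scrape_page(driver) returns.

    scrape_page is a caller-supplied function that extracts a list of rows
    from the page currently loaded in the driver.
    """
    rows = []
    for url in page_urls(driver.page_source, driver.current_url):
        driver.get(url)
        rows.extend(scrape_page(driver))
    return rows
```

Calling `scrape_all_pages(driver, scrape_page)` once the first results page is loaded would then walk all pages. Generating the URLs avoids clicking "Next" repeatedly, but it depends on the href pattern staying stable.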