How do I open the options of a select tag (dropdown list) in different tabs/windows?

Abd*_*mac 7 python selenium web-scraping selenium-webdriver drop-down-menu

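I'm trying to scrape this website using Python and Selenium. It requires you to select a date from a dropdown box and then click Search to view the planning applications.

URL: https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList.

I have code that selects the first index of the dropdown box and then presses Search. How can I open multiple windows for all the date options in the dropdown, or go through them one by one so that I can scrape them?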

from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.chrome.options import Options


options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome('/Users/weaabduljamac/Downloads/chromedriver', 
chrome_options=options)

url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
driver.get(url)

select = Select(driver.find_element_by_xpath('//*[@id="selWeek"]'))
select.select_by_index(1)

button = driver.find_element_by_id('csbtnSearch')
button.click()

app_numbers = driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text
print(app_numbers)

Dropdown HTML:

<select class="formitem" id="selWeek" name="selWeek">
   <option selected="selected" value="2018,31">Week commencing Monday 30 July 2018</option>
   <option value="2018,30">Week commencing Monday 23 July 2018</option>
   <option value="2018,29">Week commencing Monday 16 July 2018</option>
   <option value="2018,28">Week commencing Monday 9 July 2018</option>
   <option value="2018,27">Week commencing Monday 2 July 2018</option>
   <option value="2018,26">Week commencing Monday 25 June 2018</option>
   <option value="2018,25">Week commencing Monday 18 June 2018</option>
   <option value="2018,24">Week commencing Monday 11 June 2018</option>
   <option value="2018,23">Week commencing Monday 4 June 2018</option>
   <option value="2018,22">Week commencing Monday 28 May 2018</option>
</select>

Deb*_*anB 3

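As per your question, you won't be able to open multiple windows for the different dropdown options, because the <option> tags don't carry any href attribute. They will always render the new page in the same browser window.

However, to select a date from the dropdown and then click() Search to view the planning applications, you can use the following solution: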

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.support.ui import Select
    from selenium.webdriver.chrome.options import Options
    
    options = Options()
    options.add_argument('--headless')
    options.add_argument("start-maximized")
    options.add_argument('disable-infobars')
    driver=webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    url = 'https://services.wiltshire.gov.uk/PlanningGIS/LLPG/WeeklyList'
    driver.get(url)
    
    # Count the options once up front
    select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
    list_options = select.options
    for item in range(len(list_options)):
        # Re-locate the dropdown on every pass because the page has been reloaded
        select = Select(driver.find_element_by_xpath("//select[@class='formitem' and @id='selWeek']"))
        select.select_by_index(item)  # pass the index as an int
        driver.find_element_by_css_selector("input.formbutton#csbtnSearch").click()
        print(driver.find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a').text)
        driver.get(url)  # go back to the search form for the next option
    driver.quit()
    
  • Console output:

    18/06760/FUL
    18/07187/LBC
    18/06843/FUL
    18/06705/FUL
    18/06449/FUL
    18/05534/FUL
    18/06030/DEM
    18/05784/FUL
    18/05914/LBC
    18/05241/FUL
    
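The code block above is written against the Selenium 3 API. Recent Selenium 4 releases no longer provide the find_element_by_* helpers or the chrome_options keyword, so the same loop has to go through the generic find_element / find_elements calls. The following is a minimal sketch under that assumption, not a tested scraper for this site. The function name scrape_weekly_lists is hypothetical, the row XPath drops the [1] after tr on the assumption that each result row keeps its link in the first cell, and the option is clicked directly instead of going through Select so that the function depends only on the driver object it is given.

```python
def scrape_weekly_lists(driver, url):
    """Return {week label: [application numbers]} for every dropdown option.

    `driver` is assumed to be a Selenium 4 WebDriver. Locator strategies are
    passed as plain strings; "css selector", "id" and "xpath" are the values
    of By.CSS_SELECTOR, By.ID and By.XPATH.
    """
    option_css = "#selWeek option"
    # Assumption: every result row has its application link in the first cell.
    link_xpath = '//*[@id="form1"]/table/tbody/tr/td[1]/a'

    driver.get(url)
    option_count = len(driver.find_elements("css selector", option_css))

    results = {}
    for index in range(option_count):
        # Reload and re-locate on every pass: elements found before the
        # search was submitted are stale once the results page has loaded.
        driver.get(url)
        option = driver.find_elements("css selector", option_css)[index]
        label = option.text
        if not option.is_selected():
            option.click()  # clicking an <option> selects it
        driver.find_element("id", "csbtnSearch").click()
        links = driver.find_elements("xpath", link_xpath)
        results[label] = [link.text for link in links]
    return results
```

With a real browser, this could be called as scrape_weekly_lists(driver, url) where driver comes from webdriver.Chrome(options=options), followed by driver.quit() as in the original code. Because find_elements returns an empty list when nothing matches, a week without results would yield an empty list rather than raise.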

Trivia

To scrape all the links, you need to replace:

find_element_by_xpath('//*[@id="form1"]/table/tbody/tr[1]/td[1]/a')

with:

find_elements_by_xpath('//*[@id="form1"]/table/tbody/tr/td[1]/a')

find_elements returns a list, and the [1] after tr has to go as well, otherwise the XPath still matches only the first row.
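The effect of the row index in that XPath can be checked offline. The sketch below uses the standard library's xml.etree.ElementTree, which supports a subset of XPath, on a made-up table. The markup and the APP-00x values are placeholders, not data from the site.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified results table; the real page's markup may differ.
SAMPLE = """
<form id="form1">
  <table>
    <tbody>
      <tr><td><a>APP-001</a></td><td>first row</td></tr>
      <tr><td><a>APP-002</a></td><td>second row</td></tr>
      <tr><td><a>APP-003</a></td><td>third row</td></tr>
    </tbody>
  </table>
</form>
""".strip()

form = ET.fromstring(SAMPLE)

# tr[1] restricts the match to the first row, however many rows exist.
first_row_only = [a.text for a in form.findall("./table/tbody/tr[1]/td[1]/a")]

# Without the index on tr, the first cell of every row is matched.
every_row = [a.text for a in form.findall("./table/tbody/tr/td[1]/a")]

print(first_row_only)  # ['APP-001']
print(every_row)       # ['APP-001', 'APP-002', 'APP-003']
```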