Selenium: wait until the text in a WebElement changes

Win*_*ags 7 python selenium selenium-webdriver

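I am using selenium with Python 2.7 to retrieve content from a search box on a web page. The search box fetches its results dynamically and displays them in the box.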

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd
import re
from time import sleep

driver = webdriver.Firefox()
driver.get(url)

df = pd.read_csv("read.csv")

def crawl(isin):
    searchkey = driver.find_element_by_name("searchkey")
    searchkey.clear()
    searchkey.send_keys(isin)
    sleep(11)

    search_result = driver.find_element_by_class_name("ac_results")
    names = re.match(r"^.*(?=(\())", search_result.text).group().encode("utf-8")
    product_id = re.findall(r"((?<=\()[0-9]*)", search_result.text)
    return pd.Series([product_id, names])

df[["insref", "name"]] = df["ISIN"].apply(crawl)

print df

The relevant part of the code is in def crawl(isin):

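  • The program types the term to search for into the search box (the box is named searchkey).
  • It then calls sleep() to wait for the content to show up in the search box's dropdown field, ac_results.
  • It then extracts two variables, insrefs and names, using regex.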
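
To make the regex step concrete, here is a minimal sketch that applies the same two patterns to an invented result string. The sample text is an assumption about what ac_results might contain; the real page may format its entries differently.

```python
import re

# Invented sample of the dropdown text; the real ac_results content may differ.
sample_text = "Example Fund A (12345)"

# Everything on the first line up to (not including) the last "(".
name = re.match(r"^.*(?=(\())", sample_text).group()

# The digits that immediately follow each "(".
insrefs = re.findall(r"((?<=\()[0-9]*)", sample_text)

print(repr(name))
print(insrefs)
```

Note that the name keeps the space before the parenthesis, so a .strip() may be wanted before storing it.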

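Instead of calling sleep(), I would like it to wait until the content of the ac_results WebElement has loaded.

Since the search box is used repeatedly, with new search terms from a list being entered to fetch new data, perhaps a regex could be used to detect when ac_results holds new content that differs from the previous content.

Is there a way to do this? Note that the content in the search box is loaded dynamically, so the function has to recognize that something in the WebElement has changed.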

ale*_*cxe 14

You need to apply the explicit wait concept. For example, wait for an element to become visible:

wait = WebDriverWait(driver, 10)
wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'searchbox')))

Here it waits up to 10 seconds, checking the element's visibility every 500 ms.
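
The timeout-plus-polling behaviour can be pictured with a small stand-alone loop. This is only a simplified sketch of the idea, not Selenium's implementation; poll_until and ready_on_third_check are hypothetical names, and RuntimeError stands in for Selenium's TimeoutException.

```python
import time


def poll_until(condition, timeout=10.0, poll_frequency=0.5):
    """Call condition() repeatedly until it is truthy or the timeout expires."""
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            # Stand-in for the TimeoutException an explicit wait would raise.
            raise RuntimeError("condition not met within %.1f s" % timeout)
        time.sleep(poll_frequency)


# Hypothetical condition that becomes true on the third check.
calls = {"count": 0}


def ready_on_third_check():
    calls["count"] += 1
    return calls["count"] >= 3


poll_until(ready_on_third_check, timeout=2.0, poll_frequency=0.1)
```

With the real class, the polling interval can be set through the poll_frequency argument, as in WebDriverWait(driver, 10, poll_frequency=0.5).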

There is a set of built-in expected conditions to wait for, and it is also easy to write a custom expected condition.


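FYI, here is how we approached it after brainstorming in chat. We introduced a custom expected condition that waits for an element's text to change. It helped us determine when new search results appear: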

import re

import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# Custom expected condition: true once the element's text differs from `text`.
# Uses the public driver.find_element() rather than the private _find_element helper.
class text_to_change(object):
    def __init__(self, locator, text):
        self.locator = locator
        self.text = text

    def __call__(self, driver):
        actual_text = driver.find_element(*self.locator).text
        return actual_text != self.text

#Load URL
driver = webdriver.Firefox()
driver.get(url)

#Load DataFrame of terms to search for
df = pd.read_csv("searchkey.csv")

#Crawling function    
def crawl(searchkey):
    try: 
        text_before = driver.find_element_by_class_name("ac_results").text 
    except NoSuchElementException: 
        text_before = ""

    searchbox = driver.find_element_by_name("searchbox")
    searchbox.clear()
    searchbox.send_keys(searchkey)
    print "\nSearching for %s ..." % searchkey

    WebDriverWait(driver, 10).until(
        text_to_change((By.CLASS_NAME, "ac_results"), text_before)
    )

    search_result = driver.find_element_by_class_name("ac_results")
    if search_result.text != "none":
        # Name: everything before the last "(" on the first line
        names = re.match(r"^.*(?=(\())", search_result.text).group().encode("utf-8")
    else:
        # "none" contains no "(", so there is no name to extract
        names = None
    # Reference numbers: the digits following each "(" (empty list for "none")
    insrefs = re.findall(r"((?<=\()[0-9]*)", search_result.text)
    return pd.Series([insrefs, names])

#Run crawl    
df[["Insref", "Name"]] = df["ISIN"].apply(crawl)

#Print DataFrame    
print df
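
To see how a text-change condition behaves without starting a browser, it can be exercised against stand-in objects. This is a sketch under assumptions: FakeDriver and FakeElement are hypothetical test doubles, not Selenium classes, the result strings are invented, and the condition here looks the element up with the public find_element(by, value) method.

```python
class text_to_change(object):
    """Callable condition: true once the located element's text differs from `text`."""

    def __init__(self, locator, text):
        self.locator = locator
        self.text = text

    def __call__(self, driver):
        actual_text = driver.find_element(*self.locator).text
        return actual_text != self.text


class FakeElement(object):
    """Hypothetical stand-in for a WebElement; only carries a text attribute."""

    def __init__(self, text):
        self.text = text


class FakeDriver(object):
    """Hypothetical stand-in for a WebDriver; always returns the same element."""

    def __init__(self, text):
        self.element = FakeElement(text)

    def find_element(self, by, value):
        return self.element


# ("class name", "ac_results") mirrors the (By.CLASS_NAME, "ac_results") locator.
driver = FakeDriver("old result (111)")
condition = text_to_change(("class name", "ac_results"), "old result (111)")

unchanged = condition(driver)             # text still equals the previous text
driver.element.text = "new result (222)"  # simulate the dropdown refreshing
changed = condition(driver)               # text now differs from the previous text
```

One limitation of comparing against the previous text: if two consecutive searches happen to produce identical results, the condition never becomes true and the wait runs into its timeout.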