nis*_*mar 4 python selenium selenium-webdriver
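I am trying to scrape all the links from a web page. I am using Selenium WebDriver to scroll the page and click its "Load more" button. The code I am trying is shown below: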
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import ElementNotVisibleException
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup

def fetch_links(url):
    chrome_path = r"D:\nishant_pc_d_drive\nishant_pc\d_drive\update_engine\myntra_update\chromedriver.exe"
    driver = webdriver.Chrome(chrome_path)
    driver.get(url)

    while True:
        try:
            scrollcount=1
            while scrollcount<5:
                driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
                WebDriverWait(driver, 5)
                scrollcount+=1

            WebDriverWait(driver, 10).until(EC.presence_of_element_located(driver.find_elements_by_css_selector('.load_more .sbt-button, .load_more_order .sbt-button')))
            driver.find_element_by_id("loadmore").click()
        except (ElementNotVisibleException,NoSuchElementException) as e:
            print "done"

    x = driver.page_source
    soup2 = BeautifulSoup(x, 'html.parser')
    linkcount=0
    for each in soup2.find_all('a',attrs={"class":"thumb searchUrlClass"}):
        print "https://www.shoppersstop.com/"+each.get('href')
        linkcount+=1
    print linkcount
    # thumb searchUrlClass

fetch_links("https://www.shoppersstop.com/women-westernwear-tops-tees/c-A206020")
Unfortunately, it fails with the following error:
Traceback (most recent call last):
  File "D:/INVENTORY/shopperstop/fetch_link.py", line 36, in <module>
    fetch_links("https://www.shoppersstop.com/women-westernwear-tops-tees/c-A206020")
  File "D:/INVENTORY/shopperstop/fetch_link.py", line 21, in fetch_links
    WebDriverWait(driver, 10).until(EC.presence_of_element_located(driver.find_element_by_class_name('sbt-button')))
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\wait.py", line 71, in until
    value = method(self._driver)
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\expected_conditions.py", line 63, in __call__
    return _find_element(driver, self.locator)
  File "C:\Python27\lib\site-packages\selenium\webdriver\support\expected_conditions.py", line 328, in _find_element
    return driver.find_element(*by)
TypeError: find_element() argument after * must be an iterable, not WebElement
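How can I fix this error? Thanks!
The error text is indeed confusing.
Basically, some EC (expected_conditions) methods take locators, while others take elements. The one you are using accepts only a locator, but you passed it an element.
The difference is not explained in the Selenium tutorial, but the Selenium API documentation does more or less explain it: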
element is WebElement object.
locator is a tuple of (by, path).
Here is a practical example of a locator:
(By.ID, 'someid')
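To see why the message mentions `*`, it may help to reproduce the failure without Selenium at all. The sketch below assumes Python 3 and uses two hypothetical stand-in classes, `FakeDriver` and `FakeElement` (neither is part of Selenium), plus a helper `find_with` that imitates the `driver.find_element(*by)` line from the traceback.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement: a plain, non-iterable object."""


class FakeDriver:
    """Hypothetical stand-in for a WebDriver with only find_element(by, value)."""

    def find_element(self, by, value):
        return "element found by {} = {!r}".format(by, value)


def find_with(driver, locator):
    # Imitates the last frame of the traceback: driver.find_element(*by)
    return driver.find_element(*locator)


driver = FakeDriver()

# A locator is a (by, path) tuple, so star-unpacking yields two arguments.
ok = find_with(driver, ("id", "someid"))

# An element is a single object, so star-unpacking it raises TypeError.
try:
    find_with(driver, FakeElement())
    error = None
except TypeError as exc:
    error = str(exc)
```

After this runs, `ok` holds the string returned for the tuple locator, and `error` holds a TypeError message of the same shape as the one in the question.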
So, given the original (incorrect) code:
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located(driver.find_element_by_class_name('sbt-button'))
)
...it should be updated to:
from selenium.webdriver.common.by import By

WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'sbt-button'))
)
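Note the double parentheses: a tuple is being passed to the EC method.
Note: in your case it looks like you also want multiple elements, so you would also need EC.presence_of_all_elements_located() rather than EC.presence_of_element_located().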
from selenium.webdriver.common.by import By

element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "myDynamicElement"))
)