Alv*_*zah 4 python web-scraping selenium-chromedriver selenium-webdriver
我已经使用 Selenium 和 ChromeDriver 编写了一个 Python 脚本来抓取数据。该脚本浏览多个页面并单击各种按钮来检索数据。但是,我遇到了以下错误:
WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"No node with given id found"}
Run Code Online (Sandbox Code Playgroud)
该错误似乎发生在迭代中的特定点,而不是随机的。我已尝试解决该问题,但不确定导致该问题的原因或如何解决。
我在 Windows 10 计算机上使用Python3.10.5 和版本 113.0.5672.63Selenium的库。ChromeDriver任何解决此问题的帮助将不胜感激。
我还是一个初学者,这是我第一次尝试硒。我尝试添加time.sleep(1)以确保网页已加载,检查元素的可见性,并且元素可单击,但问题仍然出现。
这是我当前编写的脚本
url = '.../'
path = Service(r'...\chromedriver_win32')
options = Options()
options.add_experimental_option("debuggerAddress", "localhost:9222")
driver = webdriver.Chrome(service=path, options=options)
driver.get(url)
wait = WebDriverWait(driver, 10)
def scrape_left_table(prob, kab, kec):
data = []
rows = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(1) > table > tbody > tr')
for row in rows:
wilayah = row.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > button').text
persentasi = row.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > span').text
class_1= row.find_element(By.CSS_SELECTOR, 'td:nth-child(2)').text
class_2= row.find_element(By.CSS_SELECTOR, 'td:nth-child(3)').text
data.append([prob, kab, kec, wilayah, persentasi, class_1, class_2])
return data
def scrape_right_table(prob, kab, kec):
data = []
rows = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(2) > table > tbody > tr')
for row in rows:
wilayah = row.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > button').text
persentasi = row.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > span').text
class_1= row.find_element(By.CSS_SELECTOR, 'td:nth-child(2)').text
class_2= row.find_element(By.CSS_SELECTOR, 'td:nth-child(3)').text
data.append([prob, kab, kec, wilayah, persentasi, class_1, class_2])
return data
data = []
provinsi = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(1) > table > tbody > tr')
button = provinsi[1].find_element(By.TAG_NAME, 'button')
pro = button.text
wait.until(EC.element_to_be_clickable(button)).click()
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div:nth-child(1) > table > tbody > tr')))
for i in [1,2]:
time.sleep(1)
kabupaten = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(' + str(i) + ') > table > tbody > tr')
for kab in kabupaten:
time.sleep(1)
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div:nth-child(' + str(i) + ') > table > tbody > tr')))
kab_button = kab.find_element(By.TAG_NAME, 'button')
kab_name = kab_button.text
driver.execute_script("arguments[0].scrollIntoView();", kab_button)
driver.execute_script("arguments[0].click();", kab_button)
for i in [1,2]:
time.sleep(1)
kecamatan = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(' + str(i) + ') > table > tbody > tr')
for kec in kecamatan:
wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div:nth-child(' + str(i) + ') > table > tbody > tr')))
kec_button = kec.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > button')
kec_name = kec_button.text
driver.execute_script("arguments[0].scrollIntoView();", kec_button)
driver.execute_script("arguments[0].click();", kec_button)
kelurahan = driver.find_elements(By.CSS_SELECTOR, 'div:nth-child(1) > table > tbody > tr')
time.sleep(1)
left_table = scrape_left_table(pro, kab_name, kec_name)
right_table = scrape_right_table(pro, kab_name, kec_name)
data += left_table + right_table
back = driver.find_element(By.CSS_SELECTOR, '#app > div.sticky-top.bg-white > div > div:nth-child(2) > div > div > div > div:nth-child(5) > div > div > div.vs__actions > button')
driver.execute_script("arguments[0].scrollIntoView();", back)
driver.execute_script("arguments[0].click();", back)
back = driver.find_element(By.CSS_SELECTOR, '#app > div.sticky-top.bg-white > div > div:nth-child(2) > div > div > div > div:nth-child(4) > div > div > div.vs__actions > button')
driver.execute_script("arguments[0].scrollIntoView();", back)
driver.execute_script("arguments[0].click();", back)
Run Code Online (Sandbox Code Playgroud)
在某个迭代之后,即provinsi[0]在 689 次迭代之后发生错误,provinsi[1]在 35 次迭代之后发生错误。
WebDriverException Traceback (most recent call last)
c:\...\web_scraping.ipynb Cell 4 in ()
23 for kec in kecamatan:
24 wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, 'div:nth-child(' + str(i) + ') > table > tbody > tr')))
---> 26 kec_button = kec.find_element(By.CSS_SELECTOR, 'td.text-xs-left.wilayah-name > button')
27 kec_name = kec_button.text
28 driver.execute_script("arguments[0].scrollIntoView();", kec_button)
WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"No node with given id found"}
Run Code Online (Sandbox Code Playgroud)
Ory*_*orm 10
这似乎是最近 ChromeDriver v113 的一个缺陷: https://bugs.chromium.org/p/chromedriver/issues/detail ?id=4440
目前看来,这是最有可能的嫌疑:
当 Chromedriver 确定与之交互的元素已过时时,就会发生这种情况
看起来由于缺陷,WebDriver在这种情况下抛出WebDriverException而不是StaleElementReferenceException (这是在 C# 中)。