I'm trying to scrape a table that is generated by JavaScript, but I'm struggling. My code so far is:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://af.ktnlandscapes.com/")
# get table -- first wait for the rows to be present
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//*[@id='list-view']/tbody/tr")))
table = driver.find_element(By.XPATH, "//*[@id='list-view']")
# get rows
rows = table.find_elements(By.XPATH, "tbody/tr")
# iterate over the rows and print each row's "listing" attribute
for row in rows:
    print(row.get_attribute("listing"))
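I want to scrape the "listing=" numbers from the table. I'm not sure how to extract the listing numbers, and I'm also having trouble working out how to force the page to load the remaining rows, since they only load as you scroll down the table.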
Answer by 小智 (score: 5)
Try the code below:
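import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://af.ktnlandscapes.com/")
# get table -- first wait for the rows to be present
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//*[@id='list-view']/tbody/tr")))
table = driver.find_element(By.XPATH, "//*[@id='list-view']")
get_number = 0
while True:
    count = get_number  # row count from the previous pass
    rows = table.find_elements(By.XPATH, "tbody/tr[@class='list-view-listing']")
    driver.execute_script("arguments[0].scrollIntoView();", rows[-1])  # scroll to last row
    get_number = len(rows)
    print(get_number)
    time.sleep(1)  # give the page time to load the next batch
    if get_number == count:
        break  # no new rows appeared, so the whole table is loaded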
Output:
20
40
60
80
100
120
140
160
180
200
220
240
260
280
300
320
339
339
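The loop above only counts rows, so here is a rough sketch of how the same scroll-until-the-count-stops-growing idea might be combined with reading the "listing" attribute the question asks about. The helper names (scroll_until_stable, listing_ids, scrape_listing_ids) and the max_rounds safety limit are made up for illustration, and the sketch assumes each tr with class list-view-listing really carries a listing attribute, as the question implies. It has not been run against the live site.

```python
import time


def scroll_until_stable(fetch_rows, scroll_to, pause=1.0, max_rounds=200):
    """Scroll to the last row repeatedly until the row count stops growing.

    fetch_rows and scroll_to are hypothetical callbacks, so the loop itself
    does not depend on Selenium. max_rounds is an assumed safety limit.
    """
    previous = -1
    rows = fetch_rows()
    for _ in range(max_rounds):
        if not rows or len(rows) == previous:
            break  # nothing to scroll to, or no new rows since the last pass
        previous = len(rows)
        scroll_to(rows[-1])
        time.sleep(pause)  # crude wait for the next batch to load
        rows = fetch_rows()
    return rows


def listing_ids(rows):
    """Return the 'listing' attribute of every row that has one."""
    ids = []
    for row in rows:
        value = row.get_attribute("listing")
        if value:
            ids.append(value)
    return ids


def scrape_listing_ids(url="https://af.ktnlandscapes.com/"):
    """Load the page, scroll until all rows are present, return listing ids."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    row_locator = (By.XPATH, "//*[@id='list-view']/tbody/tr[@class='list-view-listing']")
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located(row_locator))
        rows = scroll_until_stable(
            fetch_rows=lambda: driver.find_elements(*row_locator),
            scroll_to=lambda row: driver.execute_script("arguments[0].scrollIntoView();", row),
        )
        return listing_ids(rows)
    finally:
        driver.quit()
```

Calling scrape_listing_ids() should then return one string per loaded row. Rows are re-fetched on every pass rather than cached, which avoids holding on to elements the page may have re-rendered while scrolling.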