Scraping a JavaScript-generated table with Selenium, including scrolling

Jac*_*ack 1 python selenium

I am trying to scrape a table that is generated by JavaScript, but I am struggling. My code so far is:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()

driver.get("https://af.ktnlandscapes.com/")

# get table -- first wait for table to fully load
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//*[@id='list-view']/tbody/tr")))
table = driver.find_element(By.XPATH, "//*[@id='list-view']")

# get rows
rows = table.find_elements(By.XPATH, "tbody/tr")

# iterate over the rows and print each row's "listing" attribute
for row in rows:
    print(row.get_attribute("listing"))


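I want to scrape the "listing=" numbers from the table. I am not sure how to extract the listing numbers, and I am having trouble working out how to force the page to load the remaining rows, because they only load as you scroll down the table.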

These listing numbers are what I am interested in.

小智 5

Try the code below:

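import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://af.ktnlandscapes.com/")

# get table -- first wait for table to fully load
WebDriverWait(driver, 10).until(EC.presence_of_all_elements_located((By.XPATH, "//*[@id='list-view']/tbody/tr")))
table = driver.find_element(By.XPATH, "//*[@id='list-view']")

get_number = 0
while True:
    count = get_number  # row count from the previous pass
    rows = table.find_elements(By.XPATH, "tbody/tr[@class='list-view-listing']")
    driver.execute_script("arguments[0].scrollIntoView();", rows[-1])  # scroll to last row
    get_number = len(rows)
    print(get_number)
    time.sleep(1)  # give the page time to load the next batch of rows
    if get_number == count:  # no new rows appeared, so stop scrolling
        break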

Output:

20
40
60
80
100
120
140
160
180
200
220
240
260
280
300
320
339
339

Querying the page in the web console confirms that there are 339 rows.
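
The loop above only counts rows; it does not print the listing numbers the question asks for. A minimal sketch of one way to get them follows, assuming each `tr` really carries a `listing` attribute (as the question's `get_attribute("listing")` suggests) and that rows keep loading when the last one is scrolled into view. The names `scroll_until_stable`, `collect_listing_ids` and `main` are hypothetical helpers introduced here for illustration, and the `max_rounds` cap is an added safeguard, not something from the page.

```python
import time
from typing import Callable, List, Sequence


def scroll_until_stable(
    get_rows: Callable[[], Sequence],
    scroll_to: Callable[[object], None],
    pause: float = 1.0,
    max_rounds: int = 100,
) -> List:
    """Scroll to the last row repeatedly until the row count stops growing."""
    previous = -1
    rows = list(get_rows())
    rounds = 0
    while rows and len(rows) != previous and rounds < max_rounds:
        previous = len(rows)
        scroll_to(rows[-1])      # bring the last row into view to trigger loading
        time.sleep(pause)        # give the page time to append more rows
        rows = list(get_rows())  # re-query, since new rows may have appeared
        rounds += 1
    return rows


def collect_listing_ids(rows) -> List[str]:
    """Read the (assumed) 'listing' attribute of each row, skipping rows without one."""
    values = (row.get_attribute("listing") for row in rows)
    return [value for value in values if value]


def main() -> None:
    # Selenium is imported here so the helpers above can be used without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    row_xpath = "//*[@id='list-view']/tbody/tr[@class='list-view-listing']"
    driver = webdriver.Chrome()
    try:
        driver.get("https://af.ktnlandscapes.com/")
        WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located((By.XPATH, row_xpath))
        )
        rows = scroll_until_stable(
            get_rows=lambda: driver.find_elements(By.XPATH, row_xpath),
            scroll_to=lambda row: driver.execute_script(
                "arguments[0].scrollIntoView();", row
            ),
        )
        print(collect_listing_ids(rows))
    finally:
        driver.quit()
```

Call `main()` to run it against the site. The scrolling logic takes plain callables rather than a driver, so it can be tried with stand-in objects before a browser is involved; a fixed `pause` is still a guess about load time, and a slow connection may need a larger value.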