Posts by xjm*_*fel

Splinter or Selenium: can we get the current HTML page after clicking a button?

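I am trying to crawl the website "http://everydayhealth.com". The page is rendered dynamically: when I click the "More" button, additional news items are displayed. However, clicking the button with splinter does not cause "browser.html" to update to the current HTML content. Is there a way to get the latest HTML source, using either splinter or selenium? My splinter code is as follows: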

import requests
from bs4 import BeautifulSoup
from splinter import Browser

browser = Browser()
browser.visit('http://everydayhealth.com')
browser.click_link_by_text("More")

print(browser.html)

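The symptom described above (the HTML read immediately after the click does not yet contain the new items) is typical of content that arrives asynchronously. A minimal, standard-library sketch of the usual remedy, polling until the expected content appears, is shown below. `FakeBrowser`, `click_more`, and `wait_for_html` are hypothetical names used only for illustration; they are not part of Splinter or Selenium, and the delay is simulated rather than caused by a real page.

```python
import time


class FakeBrowser:
    """Hypothetical stand-in for a real browser object (not Splinter's API).

    Its `html` changes a short time after `click_more()` is called,
    imitating content that is loaded asynchronously.
    """

    def __init__(self, delay=0.2):
        self._delay = delay
        self._clicked_at = None

    def click_more(self):
        # Record when the simulated click happened.
        self._clicked_at = time.monotonic()

    @property
    def html(self):
        loaded = (
            self._clicked_at is not None
            and time.monotonic() - self._clicked_at >= self._delay
        )
        if loaded:
            return "<ul><li>Old story</li><li>New story</li></ul>"
        return "<ul><li>Old story</li></ul>"


def wait_for_html(get_html, predicate, timeout=5.0, poll=0.05):
    """Poll get_html() until predicate(html) is true, then return that html.

    Raises TimeoutError if the condition is not met within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = get_html()
        if predicate(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError("page content did not change in time")
        time.sleep(poll)


browser = FakeBrowser()
browser.click_more()
html_right_after_click = browser.html  # still the old markup
html_after_wait = wait_for_html(lambda: browser.html, lambda h: "New story" in h)
```

With a real browser, the same idea would apply by passing a callable that re-reads the page source on every poll, instead of reading it once right after the click.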

Following @Louis's answer, I rewrote the program as follows:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Firefox()
driver.get("http://www.everydayhealth.com")
more_xpath = '//a[@class="btn-more"]'
more_btn = WebDriverWait(driver, 10).until(lambda driver: driver.find_element_by_xpath(more_xpath))
more_btn.click()
more_news_xpath = '(//a[@href="http://www.everydayhealth.com/recipe-rehab/5-herbs-and-spices-to-intensify-flavor.aspx"])[2]'
WebDriverWait(driver, 5).until(lambda driver: driver.find_element_by_xpath(more_news_xpath))

print(driver.execute_script("return document.documentElement.outerHTML;"))
driver.quit()


However, in the output I still cannot find the text from the updated page. For example, when I search for "Is Milk Your Friend or Foe?", it still returns nothing. What is the problem?
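When checking whether a phrase such as the one above is present in captured HTML, a plain substring search on the raw markup can miss text that is split by inline tags, written with entities such as `&nbsp;`, or wrapped across lines. Whether that is what happens on this particular page is an assumption, not something the post confirms. The sketch below uses only the standard library to extract the visible text and compare it after normalizing whitespace and case; `TextExtractor`, `normalize`, `page_contains`, and the sample markup are hypothetical names and data made up for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalize(text):
    # Collapse runs of whitespace (including non-breaking spaces) and ignore case.
    return re.sub(r"\s+", " ", text.replace("\u00a0", " ")).strip().lower()


def page_contains(page_html, phrase):
    """Return True if `phrase` occurs in the visible text of `page_html`."""
    parser = TextExtractor()
    parser.feed(page_html)
    parser.close()
    return normalize(phrase) in normalize(" ".join(parser.chunks))


# Made-up sample markup: the headline is split by a tag, an entity, and a newline.
SAMPLE = (
    '<div><a href="#">Is&nbsp;Milk <em>Your</em>\n Friend or Foe?</a>'
    "<script>var x = 1;</script></div>"
)
PHRASE = "Is Milk Your Friend or Foe?"

found_in_raw_markup = PHRASE in SAMPLE
found_in_visible_text = page_contains(SAMPLE, PHRASE)
```

If the phrase is still missing after this kind of normalization, the content is probably not in the captured document at all (for example, it has not loaded yet or lives in a separate frame), which would have to be checked in the browser itself.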

html python selenium web-crawler splinter

xjm*_*fel

2014-11-09

7
votes

1
answer

6473
views