I'm trying to get the page source with Selenium, but I get an empty page

Question

Hne*_*nSH 4 java selenium

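I'm trying to get the page source with Selenium, and the code follows the standard procedure. It works for Baidu.com and example.com, but for the URL I actually need I get an empty page: the source shows only empty tags. Am I missing something?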

I tried adding more option arguments, but that didn't seem to help.

    WebDriver driver;

    System.setProperty("webdriver.chrome.driver", "E:\\applications\\ChromeDriver\\chromedriver_win32 (2)//chromedriver.exe");

    // Instantiate a WebDriver object; this launches the Chrome browser
    driver = new ChromeDriver();

    driver.manage().timeouts().implicitlyWait(2, TimeUnit.SECONDS);

    driver.get("http://rd.huangpuqu.sh.cn/website/html/shprd/shprd_tpxw/List/list_0.htm");
    String pageSource = driver.getPageSource();
    String title = driver.getTitle();
    System.out.println("==========="+title+"==============");
    System.out.println(Jsoup.parse(pageSource));

I expected the parsed page source for the URL so that I could extract the information I need, but I'm stuck here.

Answer 1

Adi*_*ana 5

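I can reproduce the problem on this site when using ChromeDriver. I found that a JavaScript check detects that you are using ChromeDriver and blocks the request for the page, which then fails with an HTTP 400 error code.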

Firefox, on the other hand, works as expected with the following code:

    FirefoxDriver driver = new FirefoxDriver();

    driver.get("http://rd.huangpuqu.sh.cn/website/html/shprd/shprd_tpxw/List/list_0.htm");
    Thread.sleep(5000);
    String pageSource = driver.getPageSource();
    String title = driver.getTitle();
    System.out.println("==========="+title+"==============");
    System.out.println(Jsoup.parse(pageSource));

    driver.quit();

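It worked for me with just a 5-second sleep. The best practice is to wait for a specific element on the page instead; for reference, see "How to wait for an element to appear in Selenium?"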
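
Selenium ships explicit waits for this purpose (`WebDriverWait` together with `ExpectedConditions`). To illustrate the idea behind replacing a fixed sleep with a condition, here is a minimal plain-Java sketch that runs without a browser. `PollingWait` and `waitFor` are hypothetical names, not part of Selenium, and the fake page source in `main` is only a stand-in for `driver.getPageSource()`.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not a Selenium class): polls a value until a condition holds.
final class PollingWait {

    private PollingWait() {}

    // Calls source repeatedly, pausing for interval between calls, until ready
    // accepts the value. Throws IllegalStateException once timeout has elapsed.
    static <T> T waitFor(Supplier<T> source, Predicate<T> ready,
                         Duration timeout, Duration interval) {
        long start = System.nanoTime();
        while (true) {
            T value = source.get();
            if (ready.test(value)) {
                return value;
            }
            if (System.nanoTime() - start >= timeout.toNanos()) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for driver.getPageSource(): an empty skeleton on the first
        // two polls, then a page with content.
        int[] calls = {0};
        Supplier<String> fakePageSource = () ->
                ++calls[0] < 3
                        ? "<html><head></head><body></body></html>"
                        : "<html><head></head><body><ul><li>news</li></ul></body></html>";

        String html = waitFor(fakePageSource,
                s -> !s.contains("<body></body>"),
                Duration.ofSeconds(5), Duration.ofMillis(100));

        System.out.println("Polls needed: " + calls[0]);
        System.out.println(html);
    }
}
```

With a real driver, the supplier could be `driver::getPageSource`, so the call returns as soon as the body is no longer empty rather than always waiting the full 5 seconds. Note that waiting would not help in the ChromeDriver case described above, where the request itself is rejected.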

Firefox version: 67.0.1, geckodriver: 0.24.0, Selenium version: 3.141.59

Archived:	7 years ago
Views:	2855
Last recorded:	7 years ago