我是HtmlUnit的新手,我正在尝试抓取一个使用Javascript编辑代码的网站。我听说HtmlUnit是最好的方法,因为它使用无头浏览器返回最终代码。
但是,正如您将看到的那样,我什至无法创建一个HtmlPage对象,而不会抛出一个巨大且无法理解的异常(至少考虑到我对HtmlUnit几乎为空的经验)。
这是我的代码:
import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
public class Main {
public static void main(String[] args) {
Main scraper = new Main();
scraper.testingGargoyle();
}
private void testingGargoyle() {
String myUrl = "https://www.wearvr.com/#game_id=game_4";
WebClient webClient = new WebClient();
try {
HtmlPage myPage = ((HtmlPage) webClient.getPage(myUrl));
} catch (FailingHttpStatusCodeException | IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
Run Code Online (Sandbox Code Playgroud)
这是抛出的异常:
Apr 30, 2015 5:43:50 PM com.gargoylesoftware.htmlunit.IncorrectnessListenerImpl notify
WARNING: Obsolete content type encountered: 'application/x-javascript'.
Apr 30, …Run Code Online (Sandbox Code Playgroud) 我正在使用HtmlUnitDriver,这是我的代码.
HtmlUnitDriver driver = new HtmlUnitDriver(true);
driver.get("some url here");
Run Code Online (Sandbox Code Playgroud)
我得到以下例外:
Caused by: com.gargoylesoftware.htmlunit.ScriptException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: SyntaxError: missing ; before statement (http://sales.liveperson.net/hcp/html/mTag.js?site=7824460#1(eval)#1)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:595)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:538)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:545)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:520)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:896)
at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeEventListeners(EventListenersContainer.java:162)
at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:221)
at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:735)
at com.gargoylesoftware.htmlunit.html.HtmlElement$2.run(HtmlElement.java:866)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:538)
at com.gargoylesoftware.htmlunit.html.HtmlElement.fireEvent(HtmlElement.java:871)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeEventHandlersIfNeeded(HtmlPage.java:1162)
at com.gargoylesoftware.htmlunit.html.HtmlPage.initialize(HtmlPage.java:202)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:440)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:311)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:346)
... 8 more
Run Code Online (Sandbox Code Playgroud)
请帮我解决这个问题.