I'm a complete Java newbie, but I'd like to try HtmlUnit. I use NetBeans as my IDE and have created a project folder "hu1". This is the structure of that folder:
hu1
 > nbproject
 > src 
   > hu1
 > test
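Now, I downloaded HtmlUnit 2.7 and unzipped it; the folder contains a "lib" folder with a bunch of jar files. Where do I put the lib folder inside my NetBeans project folder so that I can use HtmlUnit?
Also, once I figure that out, which paths do I use for the imports? A lot of the examples I see online look like this: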
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
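Where does com.gargoylesoftware come from?
I know this is a beginner's question and I really should read up on how to program in Java, but I'd appreciate some advice from the experts.
Update: here is a picture of my setup.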
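On the import question: an import names a class by its package, and a package corresponds to a directory path inside one of the jars on the classpath. The com.gargoylesoftware prefix is simply the package the HtmlUnit classes are declared in. Below is a minimal sketch, using only the standard library, of how an imported name maps to an entry inside a jar. The class `JarEntryPath` and its methods are made up for illustration, and the jar path is whatever HtmlUnit jar you pass on the command line.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarEntryPath {
    // Converts a fully qualified class name to the entry path it has inside a jar.
    public static String toEntryPath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath contains the class file for className.
    public static boolean jarContains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toEntryPath(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        String name = "com.gargoylesoftware.htmlunit.BrowserVersion";
        System.out.println(name + " -> " + toEntryPath(name));
        // Optional: pass the path of a jar to check whether it holds that class.
        if (args.length > 0) {
            System.out.println("found in " + args[0] + ": " + jarContains(args[0], name));
        }
    }
}
```

If the class file is found in a jar, adding that jar (and the other jars from the lib folder it depends on) to the project's libraries is what makes the import resolve.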
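Obviously, the answer to this question depends on a number of environmental factors.
In general, I'd like to know what people's experience has been with HtmlUnitDriver as a reliable tool that can be "trusted" to browse a website in basically the same way other browsers do.
Of course, I realize that "the way other browsers do" is very vague; naturally every browser has its quirks. But I'm on a project where we have hundreds of acceptance test scenarios (written in JBehave), and running them with FirefoxDriver and InternetExplorerDriver takes more than two hours, which is a bit rough from a continuous integration standpoint. So I'd like to know whether it is at least feasible to convert our acceptance tests to use HtmlUnitDriver and expect faster run times with mostly the same behavior (perhaps we can expect some tests to fail with HtmlUnitDriver and run those tests specifically with the browser-based drivers).
Our UI uses GWT, which may or may not complicate things (I don't know).
Basically, in other people's experience, does HtmlUnitDriver behave the same way other browsers do, or is it really only suited to very simple HTML sites with minimal JavaScript, and not something to use for enterprise web applications?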
import java.io.IOException;
import java.net.MalformedURLException;
import java.util.List;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlButton;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
public class YoutubeBot {
private static final String YOUTUBE = "http://www.youtube.com";
public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
    WebClient webClient = new WebClient();
    webClient.setThrowExceptionOnScriptError(false);
    // This is equivalent to typing the URL into the address bar of a browser
    HtmlPage currentPage = webClient.getPage("http://www.youtube.com/results?search_type=videos&search_query=official+music+video&search_sort=video_date_uploaded&suggested_categories=10%2C24&uni=3");
    // Get form where submit button is located
    HtmlForm searchForm = (HtmlForm) currentPage.getElementById("masthead-search");
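    // Get the …
I'm building an application in C# that uses com.gargoylesoftware.htmlunit.WebClient to access and retrieve information from web pages.
My application runs fine from the main project, but when I try to build unit tests for the project's classes, I get the following error: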
FactoryConfigurationError
Message "Provider com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl not found"
Source  "IKVM.OpenJDK.XML.API"  string
StackTrace  "   at javax.xml.parsers.DocumentBuilderFactory.newInstance()
at com.gargoylesoftware.htmlunit.javascript.configuration.JavaScriptConfiguration.loadConfiguration(Reader configurationReader)
at com.gargoylesoftware.htmlunit.javascript.configuration.JavaScriptConfiguration.loadConfiguration()
at com.gargoylesoftware.htmlunit.javascript.configuration.JavaScriptConfiguration..ctor(BrowserVersion )
at com.gargoylesoftware.htmlunit.javascript.configuration.JavaScriptConfiguration.getInstance(BrowserVersion browserVersion)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine..ctor(WebClient webClient)
at com.gargoylesoftware.htmlunit.WebClient.init(BrowserVersion , ProxyConfig )
at com.gargoylesoftware.htmlunit.WebClient..ctor(BrowserVersion browserVersion)
at com.gargoylesoftware.htmlunit.WebClient..ctor()
at GWT.HeadlessBrowser..ctor() in C:\\hg\\EXE\\GWT\\HeadlessBrowser.cs:line 57
at TestGWT.ProgramTest.TestLogInProcessForGWT() in C:\\hg\\TestGWT\\ProgramTest.cs:line 115"
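Trying to create an HtmlUnit WebClient inside the unit test class also causes this error.
Both the main project and the project containing the unit tests have project references to htmlunit-2.7, IKVM.OpenJDK.Core, and IKVM.OpenJDK.XML.API.
Do I need additional project references to run the unit tests? What could be causing this error?
The test class is using Microsoft.VisualStudio.TestTools.UnitTesting;
My project includes the HtmlUnit jars and downloads some page content. However, the executable jar (including libs, created with Eclipse's export function) only runs on the machine where I created it (on a different one it does not execute).
Edit: By "does not execute" I mean that it does not show the "Starting Headless Browser" MessageBox when it starts. I used Eclipse Indigo: File > Export > Runnable jar > Package required libraries into generated jar
Help, gods: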
import java.io.*;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.RefreshHandler;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.*;
import javax.swing.filechooser.FileSystemView;
Edit: further code, as requested
public class MyTest
{
public static void main(String[] arguments) {
try{
JOptionPane.showMessageDialog(null, "Starting Headless Browser");
JFileChooser fr = new JFileChooser();
FileSystemView fw = fr.getFileSystemView();
String MyDocuments = fw.getDefaultDirectory().toString();
FileInputStream fstream = new FileInputStream(MyDocuments+"\\Links.txt");
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
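String strLineID; …
I have a lot of threads. Each thread creates and uses its own WebClient (HtmlUnit framework). No thread uses a WebClient instance from another thread. Is that thread-safe?
I'm trying to click the search button on this site: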
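The setup described in the threading question is usually called thread confinement: each object is created and used by one thread only. Below is a minimal sketch of that pattern using only the standard library. `Client` is a hypothetical stand-in for a per-thread object such as a WebClient, not the HtmlUnit class, and the sketch says nothing about any static state a real library may share between instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadClients {
    // Hypothetical stand-in for an object that is not safe to share, such as a WebClient.
    static class Client {
        private int requests; // plain field, no synchronization

        int fetch() {
            return ++requests;
        }
    }

    // Runs `threads` tasks; each task builds its own Client and makes `calls` requests.
    public static List<Integer> run(int threads, int calls) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Integer> task = () -> {
                    Client client = new Client(); // created inside the task, never shared
                    int last = 0;
                    for (int c = 0; c < calls; c++) {
                        last = client.fetch();
                    }
                    return last;
                };
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // waits for the task to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(4, 1000)); // prints [1000, 1000, 1000, 1000]
    }
}
```

Every task reports exactly `calls` requests because no counter is visible to more than one thread. Whether the same holds for a real library depends on whether its instances share hidden state, which this sketch cannot show.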
http://www.amadeusepower.com/trek/portals/trek/default.aspx?Culture=en-US
The button is somewhere in here
<table cellpadding="0" cellspacing="0" class="QuickSearchFormFlightModuleButtonsTable"
                width="100%">
                <tr>
                    <td class="cell1">
                        <a id="ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_hlFlightDetailedSearch" href="javascript:if(typeof notRedirectToTop == 'undefined'){document.forms[0].target = '_top';}__doPostBack('ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_hlFlightDetailedSearch', '');">Advanced options</a>
                    </td>
                    <td class="cell2">
                    </td>
                    <td class="cell3">
                    </td>
                    <td class="cell4">
                    </td>
                    <td class="cell5">
                        <script>DumpButtonHTML('ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_btnSearch','QuickSearchModuleFlightSearchStartSearchButton','QuickSearchModuleFlightSearchStartSearch','javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ctl00$ctl00$cph1$cph1$QuickSearchAll1$QuickFlightSearchControl1$ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_btnSearch_LinkButton", "", true, "", "", true, true));LockButton(this,\'\',true);Loading(IsValidForTableButton(\'\',true),\'DefaultSplash_SplashScreen\',\'/trek/App_Themes/trek_theme1/Templates/SplashScreens/\',\'ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_txtSearch_txtFrom;ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_txtSearch_txtTo;ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_txtDepartureDate_txtDate;ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_txtReturnDate_txtDate\');','True','Search','100px','True','','trek_theme1');</script>
                    </td>
                </tr>
            </table>
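The button is the search button on the left side of the site. I used the HtmlUnitScripter add-on for Firefox to generate a class, but although it generated the code that fills in the form, it did not generate the code that clicks the button.
After the button is pressed, a loading screen appears and then the results are shown. Normally, the following code should return the results page into the page variable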
HtmlElement theElement5 = (HtmlElement) page.getElementById("ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_btnSearch");
page = (HtmlPage) theElement5.click();
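But it only returns the previous page with the form filled in. Is there a special way to handle this button, or am I failing to find the right button to click? Any help would be appreciated.
Edit:
The exception I get when I use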
ScriptResult result = page.executeJavaScript("document.getElementById('ctl00_ctl00_ctl00_cph1_cph1_QuickSearchAll1_QuickFlightSearchControl1_btnSearch_Table').onclick()");
final Page newPage = result.getNewPage();
is below
Exception in thread "main" ======= EXCEPTION START ========
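EcmaError: lineNumber=[64] …
I'm having some trouble figuring out how to get the content of some HTML after JavaScript has updated it.
Specifically, I'm trying to get the current time from the US Naval Observatory Master Clock. It has an h1 element with the ID USNOclk, in which it displays the current time.
When the page first loads, this element is set to display "Loading...", then JavaScript kicks in and updates it to the current time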
function showTime()
    {
        document.getElementById('USNOclk').innerHTML="Loading...<br />";
        xmlHttp=GetXmlHttpObject();
        if (xmlHttp==null){
            document.getElementById('USNOclk').innerHTML="Sorry, browser incapatible. <BR />";
            return;
        } 
        refresher = 0;
        startResponse = new Date().getTime();
        var url="http://tycho.usno.navy.mil/cgi-bin/time.pl?n="+ startResponse;
        xmlHttp.onreadystatechange=stateChanged;
        xmlHttp.open("GET",url,true);
        xmlHttp.send(null);
    }  
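So, the problem is that I don't know how to get the updated time. When I inspect the element, I see "Loading..." as the content of the h1 element.
I've double-checked that JavaScript is enabled, and I've also tried calling the waitForBackgroundJavaScript function on the WebClient, hoping it would give the JavaScript time to start updating the content. However, no success so far.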
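One common way to wait for content that a script fills in later is to poll the element, after the page has been loaded, until its text is no longer the placeholder, rather than waiting once for a fixed time. Below is a minimal sketch of such a polling helper using only the standard library. `pollUntil` and the supplier are hypothetical names; in real use the supplier would read the text of the USNOclk element from the loaded page. The demo simulates the page with a value that a background thread changes after a short delay.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class Poll {
    // Calls read.get() every intervalMs until done accepts the value or timeoutMs elapses.
    // Returns the last value read, whether or not it satisfied the predicate.
    public static <T> T pollUntil(Supplier<T> read, Predicate<T> done,
                                  long timeoutMs, long intervalMs) throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        T value = read.get();
        while (!done.test(value) && deadline - System.nanoTime() > 0) {
            Thread.sleep(intervalMs);
            value = read.get();
        }
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated element text: starts as the placeholder, updated by a background thread.
        AtomicReference<String> text = new AtomicReference<>("Loading...");
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            text.set("12:34:56 UTC"); // made-up value standing in for the real time
        });
        updater.start();

        String result = Poll.<String>pollUntil(
                text::get, (String s) -> !s.startsWith("Loading"), 5000, 50);
        System.out.println(result);
        updater.join();
    }
}
```

The helper returns the last value it saw even on timeout, so the caller can tell "still the placeholder" apart from "updated" by checking the result.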
import com.gargoylesoftware.htmlunit._
import com.gargoylesoftware.htmlunit.html.HtmlPage
object AtomicTime {
  def main(args: Array[String]): Unit = {
    val url = "http://tycho.usno.navy.mil/what.html"
    val client = new WebClient(BrowserVersion.CHROME)
    println(client.isJavaScriptEnabled()) // returns true
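    client.waitForBackgroundJavaScript(10000) …
I'm using htmlunit 2.36.0 in an Android Studio project. I compiled the apk successfully, but I get some runtime errors when I try to fetch a web page. Before, I received the following error: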
java.lang.BootstrapMethodError: Exception from call site
But I was able to solve that by adding the following to gradle:
compileOptions {
    sourceCompatibility JavaVersion.VERSION_1_8
    targetCompatibility JavaVersion.VERSION_1_8
}
However, now I'm facing another error:
 java.lang.NoSuchFieldError: No static field INSTANCE of type Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; in class Lorg/apache/http/conn/ssl/AllowAllHostnameVerifier; or its superclasses (declaration of 'org.apache.http.conn.ssl.AllowAllHostnameVerifier' appears in /system/framework/framework.jar!classes3.dex)
        at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<clinit>(SSLConnectionSocketFactory.java:151)
        at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.buildSSLSocketFactory(HtmlUnitSSLConnectionSocketFactory.java:89)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.configureHttpsScheme(HttpWebConnection.java:635)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.createHttpClientBuilder(HttpWebConnection.java:558)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getHttpClientBuilder(HttpWebConnection.java:519)
        at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:171)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1407)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1326)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:396)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:317)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:469)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:450)
Apparently, there is no static field INSTANCE of type AllowAllHostnameVerifier in the class itself or in its superclasses. I don't know how to solve this.