Tag: htmlunit

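Is it possible to load an HtmlPage from a string?

I have the HTML of a web page stored in a database.

I would like to take advantage of HtmlUnit's ability to find and reference DOM elements.

Is it possible to load an HtmlPage object from a string (taken from a database column)?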

java htmlunit

4 votes · 1 answer · 3426 views

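XPath: get the second URL whose anchor text matches

An HTML page has pagination links: one set at the top of the page and another set at the bottom.

Using HtmlUnit, I am currently calling HtmlAnchor getByAnchorText("1"); on the page.

Some of the links at the top are problematic, so I want to reference the bottom links using XPath instead.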

nextPageAnchor = (HtmlAnchor) page.getByXPath("");

How can I use XPath to reference the second matching link on the page?

I need to reference the link by its anchor text, so for a link like this:

<a href="....">33</a>

The href contains random text and is a JavaScript function, so I do not know in advance what it will be.

Is this possible with XPath?
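
One possible approach, sketched under assumptions: wrap the text-matching predicate in parentheses and index the resulting node set, as in `(//a[normalize-space(.)='33'])[2]`. The example uses the JDK's built-in `javax.xml.xpath` rather than HtmlUnit so that it runs without extra libraries. The class name, helper methods, and sample markup are hypothetical stand-ins, and the helper assumes the anchor text contains no single quote. The same expression string should be usable with HtmlUnit's XPath methods, which return a list of matches rather than a single anchor.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SecondLinkByText {

    // Hypothetical, well-formed stand-in for the page: one pager at the top, one at the bottom.
    static final String SAMPLE =
          "<html><body>"
        + "<div id='top'><a href='javascript:top33()'>33</a><a href='javascript:top34()'>34</a></div>"
        + "<p>content</p>"
        + "<div id='bottom'><a href='javascript:bottom33()'>33</a><a href='javascript:bottom34()'>34</a></div>"
        + "</body></html>";

    /** Builds an XPath for the n-th anchor (1-based, document order) whose text equals anchorText. */
    static String nthAnchorWithText(String anchorText, int n) {
        // The outer parentheses matter: without them, [n] would be applied
        // per parent element instead of to the whole set of matches.
        return "(//a[normalize-space(.)='" + anchorText + "'])[" + n + "]";
    }

    /** Returns the href of the n-th matching anchor, or null if there is no such anchor. */
    static String hrefOfNthAnchor(String xml, String anchorText, int n)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Element anchor = (Element) xpath.evaluate(
                nthAnchorWithText(anchorText, n),
                new InputSource(new StringReader(xml)),
                XPathConstants.NODE);
        return anchor == null ? null : anchor.getAttribute("href");
    }

    public static void main(String[] args) throws Exception {
        // Second anchor with text "33" is the one in the bottom pager.
        System.out.println(hrefOfNthAnchor(SAMPLE, "33", 2)); // prints javascript:bottom33()
    }
}
```

Replacing `[2]` with `[last()]` would select the bottom link regardless of how many matching anchors precede it.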

java xpath htmlunit

4 votes · 1 answer · 4141 views

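How do I test context menu functionality in a web application?

I am playing with a Grails application that has a context menu (opened by right-click). The context menu is built with Chris Domigan's jQuery ContextMenu plugin.

The context menu itself works, but I want to test it automatically and cannot work out how to do that.

  • I tried Selenium 2.05a (i.e. WebDriver), but it has no rightClick method.
  • I noticed that HtmlUnit has a rightClick method, but I cannot detect any difference in the DOM before and after the click.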

selenium automated-tests contextmenu webdriver htmlunit

4 votes · 1 answer · 5637 views

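How to get the atomic value between element tags using XPath

I want to select only the atomic value inside a node. For example, the text "here" in the following: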

<a href="">here</a>

When I use XPath in Java, it returns some kind of object or array, for example:

[DomNode[<a href="">here</a>]]

I just want the text.

Is this possible, and if so, how? Thanks!
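
A minimal sketch of one way to get only the text, assuming standard XPath 1.0 semantics: either select the text node with `//a/text()` or convert the element to its string value with `string(//a)`, and ask the evaluator for a string result instead of a node. The example uses the JDK's `javax.xml.xpath` so it runs on its own; the class and method names are hypothetical. In HtmlUnit, the analogous step would presumably be reading the text content of the node returned by the XPath call.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class AnchorTextValue {

    /** Evaluates expr against xml and returns the result converted to a string. */
    static String evalString(String xml, String expr) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Requesting STRING makes the evaluator return the string value
        // of the first selected node instead of a node object.
        return (String) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<p><a href=\"\">here</a></p>";
        System.out.println(evalString(xml, "//a/text()"));  // prints here
        System.out.println(evalString(xml, "string(//a)")); // prints here
    }
}
```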

java xpath htmlunit

4 votes · 1 answer · 3335 views

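Why doesn't HtmlUnit work on this HTTPS page?

I am trying to learn more about HtmlUnit and am running some tests. I want to get basic information, such as the page title and text, from this site:

https://....com (full URL removed; the important point is that it is https)

This is the code I use, and it works fine on other sites: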

final WebClient webClient = new WebClient();
final HtmlPage page;
page = (HtmlPage) webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
System.out.println(page.getTitleText());
System.out.println(page.asText());

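Why can't I get this basic information? If it is because of security measures, which ones exactly, and can I bypass them? Thanks.

Edit: Well, the code stops working after webClient.getPage(); "test2" is never printed, so I cannot check whether the page is null.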

final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_2);
final HtmlPage page;
System.out.println("test1");
try {
    page = (HtmlPage) webClient.getPage("https://medeczane.sgk.gov.tr/eczane/login.jsp");
    System.out.println("test2");

java security screen-scraping htmlunit

4 votes · 1 answer · 8300 views

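How to get an element by XPath in HtmlUnit

I am trying to search Amazon. I want to select a category (e.g. Books), type some search criteria (e.g. java), and click the Go button. My problem is clicking the Go button. I get this exception:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at Bot.clickSubmitButton(Bot.java:77)
at Bot.main(Bot.java:111)

Here is my code: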

/**
 * @author ivan.bisevac
 */

import java.io.IOException;
import java.net.MalformedURLException;

import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlImageInput;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlOption;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSelect;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

public class Bot {
    private HtmlPage currentPage;

    public HtmlPage getCurrentPage() {
        return currentPage;
    }

    public Bot() {

    }

    /**
     * Bot constructor
     * 
     * @param pageAddress
     *            Address to go.
     * @throws IOException
     * @throws MalformedURLException
     * @throws FailingHttpStatusCodeException
     */
    public Bot(String pageAddress) throws FailingHttpStatusCodeException,
            MalformedURLException, IOException {
        this();
        this.goToAddress(pageAddress);
    } …

java xpath htmlunit

4 votes · 1 answer · 10k views

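HtmlUnit web page status code

I am trying to get the HTTP status of a given page. But when the page returns a 404, the call throws an exception instead of returning the status code.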

int status= webClient.getPage("website").getWebResponse().getStatusCode();
System.out.println( status);

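Any ideas?

What I ultimately want is to detect when sites time out, but for testing purposes I deliberately malformed the URL of the target site to see whether I could get a 404 at all.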

java htmlunit http-status-code-404

4 votes · 1 answer · 4738 views

HtmlUnit exception

I cannot understand what this HtmlUnit exception means. It occurs when I call click() on a link in a web page.

Exception class=[net.sourceforge.htmlunit.corejs.javascript.WrappedException]
com.gargoylesoftware.htmlunit.ScriptException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "offsetWidth" from null (http://webapps6.doc.state.nc.us/opi/scripts/DHTMLmessages.js#95) (javascript url#297)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:534)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:538)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:432)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:407)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:965)
at com.gargoylesoftware.htmlunit.html.HtmlAnchor.doClickAction(HtmlAnchor.java:87)
at com.gargoylesoftware.htmlunit.html.HtmlAnchor.doClickAction(HtmlAnchor.java:121)
at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1329)
at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1288)
at com.gargoylesoftware.htmlunit.html.HtmlElement.click(HtmlElement.java:1257)
at testapp.TestApp.main(TestApp.java:61)
Caused by: net.sourceforge.htmlunit.corejs.javascript.WrappedException: Wrapped com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "offsetWidth" from null (http://webapps6.doc.state.nc.us.js#95) (javascript url#297)
at net.sourceforge.htmlunit.corejs.javascript.Context.throwAsScriptRuntimeEx(Context.java:1802)
at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:196)
at net.sourceforge.htmlunit.corejs.javascript.FunctionObject.call(FunctionObject.java:479)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1701)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:854)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:164)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:429)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:267)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3183)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:175)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$5.doRun(JavaScriptEngine.java:423)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:528)
... …

java htmlunit

4 votes · 1 answer · 7576 views

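Cannot make HtmlUnit follow a link on a page that uses the __doPostBack() function

I am trying to scrape data from an ASP.NET page where clicking a link calls the __doPostBack function. When I click() the link with HtmlUnit, it returns the page I started from. What do I need to do to complete the postback and get the next page?

Code: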

import java.util.List;

import com.gargoylesoftware.htmlunit.ScriptResult;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class ScrapperApp {

    private static void go() throws Exception {
        /* turn off annoying htmlunit warnings */
        java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(java.util.logging.Level.OFF);

        HtmlPage nextPage;
        ScriptResult onClick; 

        String url = "http://media.ethics.ga.gov/search/Campaign/Campaign_Name.aspx?NameID=5751&FilerID=C2009000085&Type=candidate";

        final WebClient webclient = new WebClient(BrowserVersion.CHROME_16);
        final HtmlPage page = webclient.getPage(url);

        System.out.println("PULLING LINKS:");

        List<HtmlAnchor> articles = (List<HtmlAnchor>) page.getByXPath("//table[@id='ctl00_ContentPlaceHolder1_Name_Reports1_TabContainer1_TabPanel1_dgReports']/tbody/tr/td/a[@class='lblentrylink']");

        for(int x=0; x<articles.size(); x++) {
            System.out.println("Clicking "+x+": "+articles.get(x).asText()); 
            nextPage = articles.get(x).click();
            System.out.println(nextPage.getUrl());
        }
    }

    public static void main(String[] args) throws Exception …

java asp.net postback htmlunit

4 votes · 1 answer · 1613 views

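Java: how to log in to a website with HtmlUnit?

I am writing a Java program to log in to the website my school uses to post grades.

This is the URL of the login form: https://ma-andover.myfollett.com/aspen/logon.do

This is the HTML of the login form: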

<form name="logonForm" method="post" action="/aspen/logon.do" autocomplete="off"><div><input type="hidden" name="org.apache.struts.taglib.html.TOKEN" value="30883f4c7e25a014d0446b5251aebd9a"></div>
<input type="hidden" id="userEvent" name="userEvent" value="930">
<input type="hidden" id="userParam" name="userParam" value="">
<input type="hidden" id="operationId" name="operationId" value="">
<input type="hidden" id="deploymentId" name="deploymentId" value="ma-andover">
<input type="hidden" id="scrollX" name="scrollX" value="0">
<input type="hidden" id="scrollY" name="scrollY" value="0">
<input type="hidden" id="formFocusField" name="formFocusField" value="username">
<input type="hidden" name="mobile" value="false">
<input type="hidden" name="SSOLoginDone" value="">
<center>
<img src="images/spacer.gif" height="15" width="1">

<script language="JavaScript">
document.forms[0].elements['deploymentId'].value = 'ma-andover';
</script>

<script language="JavaScript">
$(function()
{
$('form').attr('autocomplete', 'off');
var name = $('#username');
var password …

java passwords login htmlunit web

4 votes · 1 answer · 7859 views