I have a website structure that looks like this:
<div class='main_container'>
    <div class='item_container'>
        <div class='body'>
            <span class='item_name'>Item 1</span>
            <span class='item_desc'>Desc 1</span>
        </div>
    </div>
    <div class='item_container'>
        <div class='body'>
            <span class='item_name'>Item 2</span>
            <span class='item_desc'>Desc 2</span>
        </div>
    </div>
    ...
</div><!--End of main_container-->
//Note: some item_container divs might not contain <span class='item_name'>Item N</span> or other elements
In HtmlUnit 1.14, if I want to get all the item names:
List<HtmlDivision> divs = (List<HtmlDivision>) page.getByXPath("//div[@class='item_container']");
for (HtmlDivision div : divs) {
    String name = ((HtmlElement) div.getFirstByXPath("//span[@class='item_name']")).asText();
    System.out.println(name);
}
Output:
Item 1
Item 2
...
But in HtmlUnit 2.8, when I do the same thing, I get:
Item 1
Item 1
...
Is there a workaround for this in HtmlUnit 2.8?
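
One likely cause, offered as a sketch rather than a confirmed diagnosis: in standard XPath, an expression that starts with `//` is evaluated from the document root even when it is called on a child node, so every `div` would return the first `item_name` span on the page. Prefixing the expression with a dot (`.//span[@class='item_name']`) makes it relative to the current `div`. The example below demonstrates the difference with the JDK's built-in `javax.xml.xpath` instead of HtmlUnit, so it runs without extra dependencies. The class name `RelativeXPathDemo`, the `SAMPLE` markup, and the `itemNames` helper are hypothetical, and it assumes HtmlUnit 2.8 follows the same XPath semantics.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: compares absolute and relative XPath lookups
// using the JDK's XPath engine (not HtmlUnit).
class RelativeXPathDemo {

    // Sample markup modelled on the structure in the question; the third
    // container deliberately has no item_name span.
    static final String SAMPLE =
          "<div class='main_container'>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_name'>Item 1</span><span class='item_desc'>Desc 1</span>"
        + "</div></div>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_name'>Item 2</span><span class='item_desc'>Desc 2</span>"
        + "</div></div>"
        + "<div class='item_container'><div class='body'>"
        + "<span class='item_desc'>Desc 3</span>"
        + "</div></div>"
        + "</div>";

    // Evaluates spanExpr once per item_container div, using that div as the
    // context node, and collects the text of the first match (if any).
    static List<String> itemNames(String xml, String spanExpr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList divs = (NodeList) xpath.evaluate(
                    "//div[@class='item_container']", doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < divs.getLength(); i++) {
                Node span = (Node) xpath.evaluate(
                        spanExpr, divs.item(i), XPathConstants.NODE);
                if (span != null) {          // skip containers without a match
                    names.add(span.getTextContent());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Absolute expression: searched from the document root for every div.
        System.out.println(itemNames(SAMPLE, "//span[@class='item_name']"));
        // Relative expression: searched only inside the current div.
        System.out.println(itemNames(SAMPLE, ".//span[@class='item_name']"));
    }
}
```

If that assumption holds, the equivalent change in the loop above would be to call `div.getFirstByXPath(".//span[@class='item_name']")` and to skip the `div` when the result is `null`, since some containers have no `item_name` span.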