Tags: html, select, xpath, html-agility-pack
I am trying to parse HTML with HtmlAgilityPack, but I have run into a problem.
Sample HTML document:
<tr>
<td class="css_lokalita" colspan="4">
<select id="region" name="region">
<option value="0" selected>Všetky regiony</option>
<optgroup>Banskobystrický kraj</optgroup>
<option value="k_1" style="color: #000000; font-weight:bold;">Banskobystrický kraj</option>
<option value="1"> Banská Bystrica</option>
.
.
.
<option value="174"> CZ - Ústecký kraj</option>
<option value="175"> CZ - Zlínský kraj</option>
</select>
</td>
</tr>
<tr>
<td class="css_sfotkou" colspan="4">
<input type="checkbox" name="foto" value="1" id="foto" />
<label for="foto">Iba používatelia s fotkou</label>
</td>
</tr>
<tr>
<td class="css_miestnost" colspan="4">
<select name="akt-miest" id="onoffaci">
<option value="a_0">Všetci</option>
.
.
.
<optgroup label="Záľuby a záujmy">
<option value="m_1419307"> Bez Lásky</option>
.
.
.
<option value="m_1108016"> Drum N Bass</option>
</optgroup>
</select>
</td>
</tr>
I need to parse the values from <select name="akt-miest" id="onoffaci">.
For example:
<option value="**a_0**">**Všetci**</option>
I need to get the value **a_0** and the text **Všetci**.
So I first tried selecting the element by ID:
var selectNode = htmlDoc.GetElementbyId("onoffaci");
Then I selected all the option nodes with XPath:
var nodes = selectNode.SelectNodes("//option");
And read the values:
foreach (var node in nodes)
{
string roomName = node.NextSibling.InnerText;
string roomId = node.Attributes["value"].Value;
rooms.Add(new Room { RoomId = roomId, RoomName = roomName });
}
But I get the values from the other select (<select id="region" name="region">), the one at the top of the HTML.
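A likely explanation is that an XPath expression beginning with // is evaluated from the document root, even when SelectNodes is called on a specific node, so //option matches every option on the page. Prefixing the expression with a dot makes it relative to the context node. The following is only a minimal sketch, assuming the HtmlAgilityPack package is referenced; the class name, method name, and the shortened markup are hypothetical stand-ins for the page in the question.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class RelativeXPathSketch
{
    // Shortened, hypothetical stand-in for the page in the question.
    public const string Html = @"<select id=""region""><option value=""0"">Všetky regiony</option></select>
<select id=""onoffaci""><option value=""a_0"">Všetci</option>
<optgroup label=""Hlavné miestnosti""><option value=""m_13"">Bez záväzkov</option></optgroup></select>";

    // Collects the value attribute of every option below the select with the given id.
    public static List<string> OptionValues(string html, string selectId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var values = new List<string>();
        HtmlNode selectNode = doc.GetElementbyId(selectId);
        if (selectNode == null) return values;

        // "//option" would start at the document root and match options of every select.
        // ".//option" starts at selectNode and matches only its descendants.
        var nodes = selectNode.SelectNodes(".//option");
        if (nodes == null) return values; // SelectNodes returns null when nothing matches

        foreach (HtmlNode node in nodes)
            values.Add(node.GetAttributeValue("value", ""));
        return values;
    }
}
```

Calling RelativeXPathSketch.OptionValues(RelativeXPathSketch.Html, "onoffaci") should then return only the values belonging to the second select, including the one nested inside the optgroup.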
Edit:
Following Darin Dimitrov's suggestion, I tried this:
HtmlNode selectNode = htmlDoc.GetElementbyId("onoffaci");
var nodes = selectNode.SelectNodes("option");
foreach (var node in nodes)
{
string roomName = node.NextSibling.InnerText;
string roomId = node.Attributes["value"].Value;
rooms.Add(new Room { RoomId = roomId, RoomName = roomName });
}
return rooms;
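Now only the first three option elements are parsed. I think the problem is that the remaining options are grouped inside optgroup tags.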
<select name="akt-miest" id="onoffaci">
<option value="a_0">Všetci</option>
<option value="a_1">Iba prihlásení</option>
<option value="a_5" selected="selected">Teraz na Pokeci</option>
<optgroup label="Hlavné miestnosti">
<option value="m_13"> Bez záväzkov</option>
<option value="m_9"> Do pohody</option>
<option value="m_39"> Dámsky klub</option>
</optgroup>
.
.
.
I tried to select all the nodes that follow with:
var nodes = selectNode.SelectNodes("option::*");
But I get this error: xpath has an invalid token.
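The error is consistent with XPath syntax: the part before :: must be an axis name (such as child, descendant, or following-sibling), and option is an element name, not an axis. A hedged sketch of the axis form follows, assuming HtmlAgilityPack is referenced; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class AxisSketch
{
    // Returns the value attribute of every option below the select with the given id.
    public static List<string> Values(string html, string selectId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new List<string>();
        HtmlNode select = doc.GetElementbyId(selectId);
        if (select == null) return result;

        // Form is axis::node-test. Here "descendant" is the axis and "option" the node test,
        // so options nested inside optgroup elements are matched as well.
        var nodes = select.SelectNodes("descendant::option");
        if (nodes == null) return result; // null when nothing matches

        foreach (HtmlNode n in nodes)
            result.Add(n.GetAttributeValue("value", ""));
        return result;
    }
}
```

The expression descendant::option is relative to the node SelectNodes is called on, so it does not reach into other select elements.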
I want to access all the children of selectNode:
HtmlNode selectNode = htmlDoc.GetElementbyId("onoffaci");
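For reference, the children of a node can also be reached without XPath. The sketch below assumes HtmlAgilityPack is referenced and uses hypothetical helper names: ChildNodes exposes only the direct children, while Descendants("option") walks the whole subtree and therefore also reaches options nested in an optgroup.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class DescendantsSketch
{
    // Names of the direct element children only (text nodes are skipped).
    public static List<string> DirectChildNames(HtmlNode select) =>
        select.ChildNodes
              .Where(n => n.NodeType == HtmlNodeType.Element)
              .Select(n => n.Name)
              .ToList();

    // value attributes of all option elements anywhere below the node.
    public static List<string> AllOptionValues(HtmlNode select) =>
        select.Descendants("option")
              .Select(o => o.GetAttributeValue("value", ""))
              .ToList();
}
```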
Edit #2:
This is the whole HTML file; I need to parse the option tags.
Sim*_*ier 21
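By default, Html Agility Pack treats the <OPTION> tag as "empty", meaning it does not require a closing </OPTION>. In that case the closing tag is discarded. You can change this behavior through the HtmlNode.ElementsFlags collection.
Here is code that should do what you want: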
HtmlDocument doc = new HtmlDocument();
HtmlNode.ElementsFlags.Remove("option");
doc.LoadHtml(yourHtml);
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//select[@id='onoffaci']//option"))
{
Console.WriteLine("Value=" + node.Attributes["value"].Value);
Console.WriteLine("InnerText=" + node.InnerText);
Console.WriteLine();
}
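To connect this with the question's goal of filling a list of rooms, here is a minimal sketch under stated assumptions: HtmlAgilityPack is referenced, the Room class is a hypothetical type mirroring the RoomId and RoomName properties used in the question, and ParseRooms is a made-up helper name. Because the option text is now inside the element, InnerText replaces the NextSibling.InnerText workaround.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical Room type, mirroring the properties used in the question.
public class Room
{
    public string RoomId { get; set; }
    public string RoomName { get; set; }
}

public static class RoomParserSketch
{
    public static List<Room> ParseRooms(string html)
    {
        // ElementsFlags is a static dictionary, so this affects every document
        // parsed afterwards in the process. It must run before LoadHtml.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rooms = new List<Room>();
        var options = doc.DocumentNode.SelectNodes("//select[@id='onoffaci']//option");
        if (options == null) return rooms; // null when nothing matches

        foreach (HtmlNode option in options)
        {
            rooms.Add(new Room
            {
                RoomId = option.GetAttributeValue("value", ""),
                // Decode HTML entities and drop the leading space seen in the markup.
                RoomName = HtmlEntity.DeEntitize(option.InnerText).Trim()
            });
        }
        return rooms;
    }
}
```

GetAttributeValue with a default is used here instead of Attributes["value"].Value so that an option without a value attribute yields an empty string rather than a null reference.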