I just wrote this test to check whether I'm going crazy...
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using HtmlAgilityPack;
namespace HtmlAgilityPackFormBug
{
    class Program
    {
        static void Main(string[] args)
        {
            var doc = new HtmlDocument();
            doc.LoadHtml(@"
<!DOCTYPE html>
<html>
<head>
<title>Form Test</title>
</head>
<body>
<form>
<input type=""text"" />
<input type=""reset"" />
<input type=""submit"" />
</form>
</body>
</html>
");
            var body = doc.DocumentNode.SelectSingleNode("//body");
            foreach (var node in body.ChildNodes.Where(n => n.NodeType == HtmlNodeType.Element))
                Console.WriteLine(node.XPath);
            Console.ReadLine();
        }
    }
}
It outputs:
/html[1]/body[1]/form[1]
/html[1]/body[1]/input[1]
/html[1]/body[1]/input[2]
/html[1]/body[1]/input[3]
However, if I change <form> to <xxx>, it gives me:
/html[1]/body[1]/xxx[1]
(as it should). So... it looks like those input elements …
Sample HTML:
<html><body>
<form id="form1">
<input name="foo1" value="bar1" />
<!-- Other elements -->
</form>
<form id="form2">
<input name="foo2" value="bar2" />
<!-- Other elements -->
</form>
</body></html>
Test code:
HtmlDocument doc = new HtmlDocument();
doc.Load(@"D:\test.html");
foreach (HtmlNode node in doc.GetElementbyId("form2").SelectNodes(".//input"))
{
    Console.WriteLine(node.Attributes["value"].Value);
}
The expression doc.GetElementbyId("form2").SelectNodes(".//input") gives me a null reference.
What am I doing wrong? Thanks.
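For anyone hitting the same symptom, here is a minimal sketch of a workaround that is often suggested for it. It rests on an assumption not confirmed in this post: that HtmlAgilityPack registers "form" in the static HtmlNode.ElementsFlags dictionary in a way that stops the parser from nesting children under it, and that removing the entry before loading lets <form> parse as an ordinary container. Behavior may differ between library versions. The class name FormFlagsSketch and the inline HTML string are made up for illustration. The sketch also guards against SelectNodes returning null when nothing matches.

```csharp
using System;
using HtmlAgilityPack;

class FormFlagsSketch
{
    static void Main()
    {
        // Assumption: "form" has a special entry in ElementsFlags that affects
        // how its children are parsed. Remove is a no-op if the key is absent.
        HtmlNode.ElementsFlags.Remove("form");

        var doc = new HtmlDocument();
        doc.LoadHtml(@"<html><body>
<form id=""form1""><input name=""foo1"" value=""bar1"" /></form>
<form id=""form2""><input name=""foo2"" value=""bar2"" /></form>
</body></html>");

        var form = doc.GetElementbyId("form2");

        // SelectNodes returns null rather than an empty collection when
        // no node matches, so check before iterating.
        var inputs = form?.SelectNodes(".//input");
        if (inputs == null)
        {
            Console.WriteLine("No input elements found under form2.");
            return;
        }

        foreach (var node in inputs)
        {
            // GetAttributeValue falls back to the default if the attribute is missing.
            Console.WriteLine(node.GetAttributeValue("value", ""));
        }
    }
}
```

Because ElementsFlags is static, the change applies to every document parsed afterwards in the same process, so it is probably best done once at startup.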