private void Extract(string url)
{
    HtmlWeb hw = new HtmlWeb();
    HtmlDocument doc = hw.Load(url);
    foreach (HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href]"))
    {
    }
}
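I want to extract/parse all the links from an HTML file, but I get an error:
Error 8 'HtmlAgilityPack.HtmlDocument' does not contain a definition for 'DocumentElement' and no extension method 'DocumentElement' accepting a first argument of type 'HtmlAgilityPack.HtmlDocument' could be found (are you missing a using directive or an assembly reference?)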
EDIT
I changed the code to this:
private void Extract(string url)
{
    StreamWriter w = new StreamWriter(@"d:\localpath\test.txt");
    HtmlWeb hw = new HtmlWeb();
    HtmlDocument doc = hw.Load(url);
    foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
    {
        w.WriteLine(link);
    }
    w.Close();
}
And I call it like this:
Extract(@"d:\localpath\Sat24_Cloudsheight_Europe.html");
But the text file just contains the same line repeated many times:
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
HtmlAgilityPack.HtmlNode
How can I write the parsed links themselves to the text file?
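For reference, the repeated output is consistent with `WriteLine(link)` calling `ToString()` on each `HtmlNode`, which prints the type name rather than the link. Below is a minimal sketch of one possible approach, assuming the value of each `href` attribute is what should end up in the file. The class name `LinkExtractor`, the two-parameter signature, and the use of `HtmlDocument.Load` for a local file are illustrative assumptions, not part of the code above.

```csharp
using System.IO;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not the original Extract method.
static class LinkExtractor
{
    // Writes the href value of every <a href="..."> element, one per line.
    public static void Extract(string htmlPath, string outputPath)
    {
        // Assumption: the input is a local HTML file, as in the call shown above.
        var doc = new HtmlDocument();
        doc.Load(htmlPath);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");

        // The using block closes the writer even if an exception is thrown.
        using (var writer = new StreamWriter(outputPath))
        {
            if (links == null)
            {
                return; // no links found; leaves an empty output file
            }

            foreach (HtmlNode link in links)
            {
                // Write the attribute value instead of the node object itself.
                writer.WriteLine(link.GetAttributeValue("href", string.Empty));
            }
        }
    }
}
```

It could be called as `LinkExtractor.Extract(@"d:\localpath\Sat24_Cloudsheight_Europe.html", @"d:\localpath\test.txt");`. If the visible link text is wanted instead of the target, `link.InnerText` might be written in place of the attribute value.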