Why is HTML Agility Pack's HtmlDocument.DocumentNode null?

ahm*_*iee 5 c# asp.net html-agility-pack

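I am using this code to change the href attributes in an HTML stream.

First, I download the full HTML page with the following code (URL is the address of the web page):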

HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(URL);
HttpWebResponse myHttpWebResponse = 
                         (HttpWebResponse)myHttpWebRequest.GetResponse();

Stream s = myHttpWebResponse.GetResponseStream();

Then I process it like this:

HtmlDocument doc = new HtmlDocument();

doc.Load(s);
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("/a"))
{
    string att = link.Attributes["href"].Value;
    link.Attributes["href"].Value = "http://ahmadalli.somee.com/default.aspx?url=" + att;
}
doc.Save(s);

s is the HTML stream.

But I get an exception saying that doc.DocumentNode is null!

I have tried many websites, but doc.DocumentNode is null for all of them.

L.B*_*L.B 7

This works for me.

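// Requires: using System.IO; using System.Linq; using System.Net; using System.Web;
using (WebClient client = new WebClient())
{
    client.Encoding = System.Text.Encoding.UTF8;
    var doc = new HtmlAgilityPack.HtmlDocument();
    // Download the page as a string and parse it.
    doc.LoadHtml(client.DownloadString("http://www.google.com?q=stackoverflow"));
    // Descendants("a") walks the whole tree; anchors without an href yield null here.
    foreach (var href in doc.DocumentNode.Descendants("a").Select(x => x.Attributes["href"]))
    {
        if (href == null) continue;
        // Encode the original URL so it survives as a single query-string value.
        href.Value = "http://ahmadalli.somee.com/default.aspx?url=" + HttpUtility.UrlEncode(href.Value);
    }
    // Write the modified document to a StringWriter instead of the response stream.
    StringWriter writer = new StringWriter();
    doc.Save(writer);
    var finalHtml = writer.ToString();
}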
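For comparison, here is a minimal sketch of the XPath route from the question. It rests on two points about HTML Agility Pack that are likely behind the exception: the expression "/a" only matches a elements that are direct children of the document root, and SelectNodes returns null rather than an empty collection when nothing matches, so the foreach fails on the result of SelectNodes and not on DocumentNode itself. The class name LinkRewriterSketch and the method RewriteLinks are hypothetical names chosen for illustration, and the HTML is passed in as a string rather than read from a response stream.

```csharp
using System;
using System.IO;
using System.Web;
using HtmlAgilityPack;

// Hypothetical helper, not part of HTML Agility Pack.
public static class LinkRewriterSketch
{
    // Rewrites every <a href="..."> so that it points at prefix + encoded original URL.
    public static string RewriteLinks(string html, string prefix)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "//a[@href]" matches anchors at any depth that carry an href attribute.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");

        // SelectNodes yields null when there are no matches, so guard before iterating.
        if (links != null)
        {
            foreach (HtmlNode link in links)
            {
                string original = link.GetAttributeValue("href", string.Empty);
                link.SetAttributeValue("href", prefix + HttpUtility.UrlEncode(original));
            }
        }

        // Save into a writer; the HTTP response stream cannot be written back to.
        using (var writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Calling LinkRewriterSketch.RewriteLinks(html, "http://ahmadalli.somee.com/default.aspx?url=") on a page without any links simply returns the page unchanged instead of throwing.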

Also see HttpUtility.UrlEncode, which encodes the URL correctly. Otherwise, some parameters in the original URL may cause problems.

Use HttpUtility.UrlDecode to decode it.
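To illustrate why the encoding matters, here is a small sketch of the round trip. Without encoding, an & in the original URL would be read as the start of a second query parameter of default.aspx. The class ProxyUrlSketch and its Wrap and Unwrap methods are hypothetical names for illustration; only HttpUtility.UrlEncode and HttpUtility.UrlDecode come from the framework.

```csharp
using System;
using System.Web;

// Hypothetical helper showing the encode/decode round trip.
public static class ProxyUrlSketch
{
    // Prefix taken from the code above.
    private const string Prefix = "http://ahmadalli.somee.com/default.aspx?url=";

    // Builds the rewritten link; characters such as '&', '?' and '=' are percent-encoded.
    public static string Wrap(string originalHref)
    {
        return Prefix + HttpUtility.UrlEncode(originalHref);
    }

    // Recovers the original URL from a rewritten link.
    public static string Unwrap(string wrappedHref)
    {
        if (!wrappedHref.StartsWith(Prefix, StringComparison.Ordinal))
        {
            throw new ArgumentException("Not a rewritten link.", nameof(wrappedHref));
        }
        return HttpUtility.UrlDecode(wrappedHref.Substring(Prefix.Length));
    }
}
```

In an ASP.NET page, Request.QueryString["url"] already returns the decoded value, so an explicit UrlDecode call is mainly needed when the raw link text is handled directly, as in Unwrap here.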