Omitting images from a web page requested with HttpWebRequest

Jay*_*oot 1 .net c# httpwebrequest http-headers

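I fetch web pages in order to feed data to my application. However, these pages contain many images that I don't need at all; I only need the text data. My problem is that the web request takes an unacceptably long time. I think the images are also being fetched during the web request. Is there a way to skip the images and download only the text data?

Below is the code I currently use.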

        var httpWebRequest = HttpWebRequest.Create(url) as HttpWebRequest;
        httpWebRequest.Method = "GET";
        httpWebRequest.ProtocolVersion = HttpVersion.Version11;
        httpWebRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip,deflate");
        httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        httpWebRequest.Proxy = null;
        httpWebRequest.KeepAlive = true;
        httpWebRequest.Accept = "text/html";

        string responseString = null;

        // Dispose the response as well, so the underlying connection is released.
        using (var httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
        using (var responseStream = httpWebResponse.GetResponseStream())
        using (var streamReader = new StreamReader(responseStream))
        {
            // Reads the body of this single response (the HTML text).
            responseString = streamReader.ReadToEnd();
        }

Any other optimization suggestions are also welcome.

SLa*_*aks 5

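That's not correct.
HttpWebRequest knows nothing about HTML or images; it just sends a raw HTTP request, so the images referenced by the page are not downloaded.

You can use Fiddler to see exactly what is going on.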
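To make the point above concrete, here is a minimal sketch (not part of the original answer). It shows that images appear in the downloaded HTML only as text references, which a browser would later fetch with separate requests. The `ImageReferences` class and its `Find` method are hypothetical names introduced for illustration, and the regex is a rough approximation, not a real HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageReferences
{
    // Matches the src attribute of an <img> tag (quoted values only).
    // Good enough for illustration; use a real HTML parser for production code.
    private static readonly Regex ImgSrc = new Regex(
        "<img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']*)[\"']",
        RegexOptions.IgnoreCase);

    // Returns the image URLs that are merely mentioned in the HTML text.
    // None of them is downloaded by the HttpWebRequest that fetched the page.
    public static List<string> Find(string html)
    {
        var urls = new List<string>();
        foreach (Match match in ImgSrc.Matches(html))
        {
            urls.Add(match.Groups[1].Value);
        }
        return urls;
    }
}
```

Calling `ImageReferences.Find(responseString)` on the string read from the response lists the image URLs the page mentions. Each one would need its own request to be downloaded, so a slow `GetResponse()` call has to be explained by something else, such as server latency or the size of the HTML itself.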

  • @JayWalker: But it is useless for non-browser requests. (2 upvotes)