我试图做一个webscraper,从html文件中获取css / js / images的所有下载链接。
问题
第一个断点确实命中,但第二个断点未命中“ Continue”之后。
我正在谈论的代码:
private static async void GetHtml(string url, string downloadDir)
{
//Get html data, create and load htmldocument
HttpClient httpClient = new HttpClient();
//This code gets executed
var html = await httpClient.GetStringAsync(url);
//This code not
Console.ReadLine();
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);
//Get all css download urls
var linkUrl = htmlDocument.DocumentNode.Descendants("link")
.Where(node => node.GetAttributeValue("type", "")
.Equals("text/css"))
.Select(node=>node.GetAttributeValue("href",""))
.ToList();
//Downloading css, js, images and source code
using (var client = new WebClient()) …Run Code Online (Sandbox Code Playgroud)