I have the following regular expression:
^(<span style=.*?font-weight:bold.*?>.*?</span>)
It matches the following HTML:
<span style="font-family:Arial; font-size:10pt"> r.</span></p><p style="margin:0pt"><span style="font-family:Arial; font-size:10pt; font-weight:bold"> </span>
But I want to match only this part (the last span, the one whose style contains font-weight:bold):
<span style="font-family:Arial; font-size:10pt; font-weight:bold"> </span>
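For context, the lazy `.*?` in the pattern above does not stop at the end of a tag: it keeps expanding across `>` until it finds `font-weight:bold`, which is why the match starts at the first span. The following is a minimal sketch of one possible regex-only adjustment, assuming the leading `^` anchor can be dropped and that no attribute value contains a literal `>`. The class and method names (`BoldSpanRegexDemo`, `FirstBoldSpan`) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoldSpanRegexDemo
{
    // Hypothetical helper: returns the first <span> whose own opening tag
    // mentions font-weight:bold, or null when there is none.
    public static string FirstBoldSpan(string html)
    {
        // [^>]* cannot cross the '>' that closes the opening tag, so the
        // font-weight:bold text must sit inside the same <span ...> tag.
        // Assumption: the '^' anchor from the original pattern is not needed.
        const string pattern = @"<span style=[^>]*font-weight:bold[^>]*>.*?</span>";
        Match match = Regex.Match(html, pattern, RegexOptions.Singleline);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        // The sample markup from the question.
        string html =
            "<span style=\"font-family:Arial; font-size:10pt\"> r.</span></p><p style=\"margin:0pt\">" +
            "<span style=\"font-family:Arial; font-size:10pt; font-weight:bold\"> </span>";

        Console.WriteLine(FirstBoldSpan(html));
    }
}
```

This still breaks on nested spans or unusual attribute formatting, so it is best treated as a stopgap for known, regular input.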
Use HTML Agility Pack to parse the HTML:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(htmlContent);
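// Note: SelectNodes returns null (not an empty collection) when nothing matches.
var boldSpans = from s in doc.DocumentNode.SelectNodes("//span")
                // GetAttributeValue returns "" for spans without a style attribute
                let style = s.GetAttributeValue("style", "")
                where style.Contains("font-weight:bold")
                select s;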
Or, even better, an XPath expression that selects all of those nodes in a single line:
doc.DocumentNode.SelectNodes("//span[contains(@style, 'font-weight:bold')]")
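Putting the XPath query into a complete program might look like the sketch below. It assumes the HtmlAgilityPack NuGet package is referenced; `BoldSpanFinder` and `FindBoldSpans` are hypothetical names used only for illustration. The null check matters because `SelectNodes` returns null when no node matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class BoldSpanFinder
{
    // Hypothetical helper: returns the outer HTML of every <span> whose
    // style attribute contains the exact text "font-weight:bold".
    public static List<string> FindBoldSpans(string htmlContent)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlContent);

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(
            "//span[contains(@style, 'font-weight:bold')]");

        // SelectNodes yields null, not an empty collection, when nothing matches.
        if (nodes == null)
        {
            return new List<string>();
        }

        return nodes.Select(node => node.OuterHtml).ToList();
    }

    public static void Main()
    {
        // The sample markup from the question.
        string html =
            "<span style=\"font-family:Arial; font-size:10pt\"> r.</span></p><p style=\"margin:0pt\">" +
            "<span style=\"font-family:Arial; font-size:10pt; font-weight:bold\"> </span>";

        foreach (string span in FindBoldSpans(html))
        {
            Console.WriteLine(span);
        }
    }
}
```

Because `contains` is a plain substring test, a style written as `font-weight: bold` (with a space) would not be selected; normalizing the attribute value in C# before comparing is one way to handle that.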