Parsing application/atom+xml in an HTML page

Question

We know that every blog advertises its RSS feed with a tag like this:

<link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />

Do you know of a regular expression that can extract the feed URL from a tag like this one?

<link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />

Answer 1

Use an XPath query like this:

//link[@type='application/rss+xml']/@href

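It extracts the href of every RSS feed link on the page. Never parse XML or HTML with regular expressions. XPath was designed for querying XML, and it works on HTML as well once the page has been parsed into a document tree. Implementations exist for almost every technology stack, including .NET.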
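As a rough illustration of the same idea, here is a minimal Python sketch using only the standard library. It rests on several assumptions: the sample page, the `PAGE` constant and the `find_feed_urls` helper are hypothetical, and the markup is assumed to be well-formed XHTML with no XML namespace, because `xml.etree.ElementTree` rejects ordinary tag-soup HTML. ElementTree also implements only a subset of XPath: it can filter elements by attribute value but cannot select `@href` directly, so the attribute is read in a second step.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; assumed to be well-formed XHTML so that the
# standard-library XML parser accepts it.
PAGE = """<html>
  <head>
    <title>MyBlog</title>
    <link rel="stylesheet" type="text/css" href="style.css" />
    <link rel="alternate" type="application/rss+xml" title="MyBlog RSS Feed" href="http://feeds.feedburner.com/MyBlog" />
  </head>
  <body></body>
</html>"""


def find_feed_urls(markup, mime_type="application/rss+xml"):
    """Return the href of every <link> element whose type equals mime_type."""
    root = ET.fromstring(markup)
    # ElementTree's XPath subset can match elements by attribute value,
    # but not return attributes, so href is fetched with .get() afterwards.
    links = root.findall(f".//link[@type='{mime_type}']")
    return [link.get("href") for link in links if link.get("href")]


print(find_feed_urls(PAGE))  # ['http://feeds.feedburner.com/MyBlog']
```

Passing `"application/atom+xml"` as `mime_type` would look for Atom feeds instead. For real-world pages that are not well-formed, a tolerant HTML parser with full XPath support would probably be needed in place of ElementTree.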