the*_*ist 9 ruby html-parsing nokogiri
I'm using Nokogiri to pull the <h1> and <title> tags, but I can't get these:
<meta name="description" content="I design and develop websites and applications.">
<meta name="keywords" content="web designer,web developer">
I have this code:
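require 'nokogiri'
require 'open-uri'

url = 'https://en.wikipedia.org/wiki/Emma_Watson'
# URI.open is required on Ruby 3.0+, where Kernel#open no longer handles URLs
page = Nokogiri::HTML(URI.open(url))
puts page.css('title')[0].text
puts page.css('h1')[0].text
puts page.css('description')
# placeholders for the values I want to print
puts META DESCRIPTION
puts META KEYWORDS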
I looked through the documentation but didn't find anything. Would I use a regular expression for this?
Thanks.
Here's how I'd go about it:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<meta name="description" content="I design and develop websites and applications.">
<meta name="keywords" content="web designer,web developer">
EOT
contents = %w[description keywords].map { |name|
  doc.at("meta[name='#{name}']")['content']
}
contents # => ["I design and develop websites and applications.", "web designer,web developer"]
Or:
contents = doc.search("meta[name='description'], meta[name='keywords']").map { |n|
  n['content']
}
contents # => ["I design and develop websites and applications.", "web designer,web developer"]