I have an XML file like http://www.heureka.cz/direct/xml-export/shops/heureka-sekce.xml. I can't change it because it isn't mine; I only parse it from another website.
Here is the XML (structure only):
<HEUREKA>
  <CATEGORY>
    <CATEGORY_ID>971</CATEGORY_ID>
    <CATEGORY_NAME>Auto-moto</CATEGORY_NAME>
    <CATEGORY>
      <CATEGORY_ID>881</CATEGORY_ID>
      <CATEGORY_NAME>Alkohol testery</CATEGORY_NAME>
      <CATEGORY_FULLNAME>Heureka.cz | Auto-moto | Alkohol testery</CATEGORY_FULLNAME>
    </CATEGORY>
  </CATEGORY>
</HEUREKA>
Thanks for all the comments. Here is the final code:
require 'open-uri'
require 'nokogiri'

def heureka
  # URI.open is required on Ruby 3.0+, where open-uri no longer patches Kernel#open.
  doc = Nokogiri::XML(URI.open("http://www.heureka.cz/direct/xml-export/shops/heureka-sekce.xml"))
  # Select only the categories that have a CATEGORY_FULLNAME child element.
  doc.xpath("//CATEGORY[CATEGORY_FULLNAME]").each do |node|
    # at_xpath with a relative path reads this node's own child, not nested categories.
    name = node.at_xpath('CATEGORY_NAME').text
    record = Heureka.where(name: name).first_or_initialize
    record.fullname = node.at_xpath('CATEGORY_FULLNAME').text
    record.name = name
    record.save unless record.fullname.blank?
  end
end