I have a simple but huge XML file, shown below. I want to parse it with SAX and print only the text between the title tags.
<root>
  <site>some site</site>
  <title>good title</title>
</root>
I have the following code:
require 'rubygems'
require 'nokogiri'
include Nokogiri

class PostCallbacks < XML::SAX::Document
  def start_element(element, attributes)
    if element == 'title'
      puts "found title"
    end
  end

  def characters(text)
    puts text
  end
end

parser = XML::SAX::Parser.new(PostCallbacks.new)
parser.parse_file("myfile.xml")
The problem is that it prints the text between all of the tags. How can I print only the text between the title tags?
You just need to keep track of when you're inside <title> so that characters knows when it should pay attention. Something like this (untested code), perhaps:
class PostCallbacks < XML::SAX::Document
  def initialize
    @in_title = false
  end

  def start_element(element, attributes)
    if element == 'title'
      puts "found title"
      @in_title = true
    end
  end

  def end_element(element)
    # Doesn't really matter what element we're closing unless there is nesting,
    # then you'd want "@in_title = false if element == 'title'"
    @in_title = false
  end

  def characters(text)
    puts text if @in_title
  end
end