Kus*_*ush 1 ruby xml xpath nokogiri
My code looks like:
file = Nokogiri::XML(File.open('file.xml'))
test = file.xpath("//title") #all <title> elements in xml file
Then, when I try:
puts test.uniq
I get the following error:
undefined method `uniq' for #<Nokogiri::XML::NodeSet:0x000000011b8bf8>
Is test not an array? If it's not, how do I make it one?
Otherwise, how do I get only the unique values from the test array?
Is test not an array? If it's not, how do I make it one?
test will be a NodeSet:
Nokogiri::XML('<xml><foo/></xml>').xpath('//foo').class
=> Nokogiri::XML::NodeSet
foo = Nokogiri::XML('<xml><foo/></xml>').xpath('//foo')
=> [#<Nokogiri::XML::Element:0x8109a674 name="foo">]
foo.is_a? Array
=> false
foo.is_a? Enumerable
=> true
To turn it into an array use to_a:
foo.respond_to? :to_a
=> true
However, that's usually not necessary: NodeSet includes Enumerable, so it responds to map, each, and the other methods we'd expect when iterating an Array. map always returns an Array, so that gives you the conversion you asked about in your comments and your question.
foo.methods.sort - Object.methods
=> [:%, :&, :+, :-, :/, :<<, :[], :add_class, :after, :all?, :any?, :at, :at_css, :at_xpath, :attr, :attribute, :before, :children, :chunk, :collect, :collect_concat, :count, :css, :cycle, :delete, :detect, :document, :document=, :drop, :drop_while, :each, :each_cons, :each_entry, :each_slice, :each_with_index, :each_with_object, :empty?, :entries, :filter, :find, :find_all, :find_index, :first, :flat_map, :grep, :group_by, :index, :inject, :inner_html, :inner_text, :last, :length, :map, :max, :max_by, :member?, :min, :min_by, :minmax, :minmax_by, :none?, :one?, :partition, :pop, :push, :reduce, :reject, :remove, :remove_attr, :remove_class, :reverse, :reverse_each, :search, :select, :set, :shift, :size, :slice, :slice_before, :sort, :sort_by, :take, :take_while, :text, :to_a, :to_ary, :to_html, :to_xhtml, :to_xml, :unlink, :wrap, :xpath, :zip, :|]
I suspect the reason uniq isn't implemented is that it's very difficult to decide how to test nodes for uniqueness. A very simple tag, like:
<div class="foo" id="bar">
is functionally the same as:
<div id="bar" class="foo">
but the obvious to_s test will fail because the two strings aren't equal.
The tags would have to be normalized on the fly to put their attributes into the same order, then converted to strings, but what if the class attribute was "foo1 foo2" in the first tag and "foo2 foo1" in the second? Does the uniq code have to dive into specific attributes and reorder their values? And what if the tag is a container, like div is? Should the children of the node also be considered in the uniq test?
I think that's a can of worms most of us would back away from quickly, and those who'd jump into trying to define uniq would learn a very valuable lesson about rabbit holes. Instead, you are free to define uniq as fits your particular application, so it makes sense to you. I think that's a great design decision for Nokogiri's authors.