I'm working with data that contains a lot of duplicate rows:
# => [ [1, "A", 23626], [1, "A", 31314], [2, "B", 2143], [2, "B", 5247] ]
puts xs
# => [ [1, "A"], [2, "B"] ]
puts xs.uniq{ |x| x[0] }.map{ |x| [x[0], x[1]] }
But xs is very large. I tried to load it lazily, but Enumerator::Lazy has no uniq method.
How can I do this lazily?
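require 'set'

module EnumeratorLazyUniq
  refine Enumerator::Lazy do
    # Lazily drop elements whose key (the block's result, or the element
    # itself when no block is given) has already been seen.
    def uniq
      set = Set.new
      select { |e|
        val = block_given? ? yield(e) : e
        # include? reports whether val was already seen; tap records it if not.
        !set.include?(val).tap { |exists|
          set << val unless exists
        }
      }
    end
  end
end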
using EnumeratorLazyUniq
xs = [ [1, "A", 23626], [1, "A", 31314], [2, "B", 2143], [2, "B", 5247] ].to_enum.lazy
us = xs.uniq{ |x| x[0] }.map{ |x| [x[0], x[1]] }
puts us.to_a.inspect
# => [[1, "A"], [2, "B"]]
# Works with a block
puts us.class
# => Enumerator::Lazy
# Yep, still lazy.
ns = [1, 4, 6, 1, 2].to_enum.lazy
puts ns.uniq.to_a.inspect
# => [1, 4, 6, 2]
# Works without a block
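This is a straightforward implementation built on Set. That means every distinct uniq key stays in memory (with the block above, the keys 1 and 2), but the stream elements themselves, such as [1, "A", 23626], do not.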
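To make the memory behaviour described above concrete, here is a minimal sketch that runs the same Set-based idea over an endless stream. The helper name `lazy_uniq_by` and the sample rows are assumptions made up for illustration; they are not part of the original answer. It uses `Set#add?`, which returns nil when the value is already present, in place of the include?/tap combination.

```ruby
require 'set'

# Hypothetical helper (not from the original answer): lazily keeps only the
# first element seen for each key, recording the keys in the given set.
def lazy_uniq_by(enum, seen = Set.new)
  # Set#add? returns the set (truthy) for a new key and nil for a repeat.
  enum.lazy.select { |e| seen.add?(yield(e)) }
end

# An endless stream of rows whose first column cycles through 1, 2, 3.
rows = (0..Float::INFINITY).lazy.map { |i| [i % 3 + 1, "row", i] }

seen = Set.new
firsts = lazy_uniq_by(rows, seen) { |row| row[0] }.first(3)

p firsts     # => [[1, "row", 0], [2, "row", 1], [3, "row", 2]]
p seen.to_a  # => [1, 2, 3]  (only the keys are retained, not the rows)
```

Because the stream is infinite, this only terminates because evaluation is lazy and `first(3)` stops after three distinct keys. Note also that Ruby 2.4 and later ship a built-in Enumerator::Lazy#uniq, so the refinement may only be needed on older versions.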