Data structures: iterating over two arrays vs. converting to sets and intersecting in Ruby

Ger*_*emy 2 ruby arrays set

Let's say I have a1 and a2:

a1 = [1,2,3]
a2 = [4,2,5]

To see whether a1 shares any elements with a2, I can loop over both arrays and compare every pair of elements:

def intersect?(a1, a2)
  a1.each do |x|
    a2.each do |y|
      # Return as soon as one shared element is found.
      return true if x == y
    end
  end
  false
end

But more simply, (a1.to_set & a2.to_set).present? gives me the same answer.
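Note that `present?` comes from ActiveSupport (Rails), not from core Ruby. As a minimal sketch, assuming only the standard `set` library, the same check can be written in plain Ruby; the variable names `shared` and `overlap` are illustrative and not part of the question:

```ruby
require 'set'

a1 = [1, 2, 3]
a2 = [4, 2, 5]

# Set#& returns a new Set holding the elements common to both operands.
shared = a1.to_set & a2.to_set

# Set#empty? is plain Ruby, so no ActiveSupport is needed for the check.
overlap = !shared.empty?

puts shared.to_a.inspect  # [2]
puts overlap              # true
```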

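I assume the set operation is faster and more efficient? If so, is that still true once you account for the overhead (if any) of calling .to_set on each array?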

TIA

dbe*_*hur 5

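steenslag's answer makes the interesting observation that array & array is faster than set & set. Most of the penalty appears to come from the cost of fetching the keys out of the first set's underlying hash in order to enumerate it. A hybrid approach, with an array on the left-hand side of the operation and a set on the right-hand side, is faster still. If you only want to know whether there is any intersection at all, the same approach with #any? is even faster: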

#!/usr/bin/env ruby

require 'set'
require 'benchmark'

f = 10_000
ar1 = (1..(10*f)).to_a # 100_000 elements
ar2 = ((5*f)..(15*f)).to_a # 100_001 elements (the range is inclusive)
set1 = ar1.to_set
set2 = ar2.to_set
n = 10

Benchmark.bm(10) do |testcase|
  testcase.report('Array'){ n.times{ ar1 & ar2 } }
  testcase.report('Set'){ n.times{ set1 & set2 } }
  testcase.report('Set2'){ n.times{ ar1.select{ |element| set2.include? element } } }
  testcase.report('Set2present'){ n.times{ ar1.any?{ |element| set2.include? element } } }
end


$ ruby -v => ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin10.8.0]

                user     system      total        real
Array       0.680000   0.030000   0.710000 (  0.720882)
Set         1.130000   0.020000   1.150000 (  1.150571)
Set2        0.430000   0.000000   0.430000 (  0.434957)
Set2present  0.210000   0.010000   0.220000 (  0.220990)
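The hybrid approach measured as Set2present can be wrapped in a small helper. This is only a sketch: the method name `share_any?` and its parameters are hypothetical and do not appear in the answer.

```ruby
require 'set'

# Hypothetical helper (not from the answer): returns true if the two
# enumerables share at least one element.
def share_any?(left, right)
  # Build the lookup set once; this is the to_set conversion cost
  # that the question asks about.
  lookup = right.to_set
  # any? stops at the first match, so it may not scan all of left.
  left.any? { |element| lookup.include?(element) }
end

puts share_any?([1, 2, 3], [4, 2, 5])  # true
puts share_any?([1, 2, 3], [4, 5, 6])  # false
```

Because the conversion happens inside the method, timing this helper would include the `to_set` cost, unlike the benchmark above, which builds set1 and set2 before measuring.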