Rab*_*ott 35 ruby arrays ruby-on-rails
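I have two arrays that I need to merge, and using the union (|) operator is PAINFULLY slow. Is there any other way to accomplish an array merge?
Also, the arrays are filled with objects, not strings.
An example of the objects within the array: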
#<Article
id: 1,
xml_document_id: 1,
source: "<article><domain>events.waikato.ac</domain><excerpt...",
created_at: "2010-02-11 01:32:46",
updated_at: "2010-02-11 01:41:28"
>
source is a short piece of XML.
EDIT
Sorry! By 'merge' I mean that duplicates must not be inserted.
A => [1, 2, 3, 4, 5]
B => [3, 4, 5, 6, 7]
A.magic_merge(B) #=> [1, 2, 3, 4, 5, 6, 7]
Understand that the integers are actually Article objects, and that the union operator appears to take forever.
Ale*_*ner 62
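Here's a script that benchmarks two merge techniques: using the pipe operator (a1 | a2), and using concatenate-and-uniq ((a1 + a2).uniq). Two additional benchmarks give the time of concatenation and of uniq individually.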
require 'benchmark'
a1 = []; a2 = []
[a1, a2].each do |a|
1000000.times { a << rand(999999) }
end
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }
On my machine (Ubuntu Karmic, Ruby 1.8.7), I get output like this:
Merge with pipe:
1.000000 0.030000 1.030000 ( 1.020562)
Merge with concat and uniq:
1.070000 0.000000 1.070000 ( 1.071448)
Concat only:
0.010000 0.000000 0.010000 ( 0.005888)
Uniq only:
0.980000 0.000000 0.980000 ( 0.981700)
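This shows that the two techniques are very similar in speed, and that uniq is the larger component of the operation. That makes sense intuitively: uniq has to hash and compare every element, which is O(n) at best, whereas simple concatenation is just a bulk copy of references (also linear, but with a far smaller constant).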
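So, if you really want to speed this up, you need to look at how equality is implemented for the objects in your arrays. Array#| and Array#uniq detect duplicates through hash and eql? (not <=>), and I believe most of the time is spent in those calls, ensuring that no two elements of the final array are equal.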
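To make that concrete, here is a minimal sketch of a class whose duplicate detection is driven by a cheap key. In current Ruby versions, `Array#|` and `Array#uniq` find duplicates through `hash` and `eql?`, so those are the methods worth inspecting. The `Item` class is a hypothetical stand-in for the Article model, and comparing on `id` alone is an assumption about what "duplicate" means here.

```ruby
# Hypothetical stand-in for the Article model; only `id` decides equality.
class Item
  attr_reader :id, :source

  def initialize(id, source)
    @id = id
    @source = source
  end

  # Two items count as duplicates when their ids match.
  # The (potentially large) XML in `source` is never compared.
  def eql?(other)
    other.is_a?(Item) && id == other.id
  end
  alias_method :==, :eql?

  # Must agree with eql?: objects that are eql? need equal hashes.
  def hash
    id.hash
  end
end

a = [Item.new(1, "<a/>"), Item.new(2, "<b/>")]
b = [Item.new(2, "<b/>"), Item.new(3, "<c/>")]

merged = a | b
puts merged.map(&:id).inspect
```

If `hash` or `eql?` on the real objects ends up touching large attributes, the union will be correspondingly slow, so a cheap key such as the primary key is usually the thing to compare on.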
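Do you need the items in the array to be in a specific order? If not, you might want to check whether using Sets is faster.
Update
Adding to another answerer's code: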
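As a rough illustration of the Set approach, the sketch below merges the two example arrays from the question with Ruby's standard `Set` class. Integers stand in for the Article objects, as in the question; whether this is actually faster for real data would need measuring.

```ruby
require "set"

a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]

# Build a set from the first array, then add the second in place.
# Elements already present are skipped.
merged = Set.new(a).merge(b)

puts merged.to_a.inspect
```

A Set makes no promise about element order as part of its contract, so this only fits when the order of the merged items does not matter.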
require "set"
require "benchmark"
a1 = []; a2 = []
[a1, a2].each do |a|
1000000.times { a << rand(999999) }
end
s1, s2 = Set.new, Set.new
[s1, s2].each do |s|
1000000.times { s << rand(999999) }
end
puts "Merge with pipe:"
puts Benchmark.measure { a1 | a2 }
puts "Merge with concat and uniq:"
puts Benchmark.measure { (a1 + a2).uniq }
puts "Concat only:"
puts Benchmark.measure { a1 + a2 }
puts "Uniq only:"
b = a1 + a2
puts Benchmark.measure { b.uniq }
puts "Using sets"
puts Benchmark.measure {s1 + s2}
puts "Starting with arrays, but using sets"
puts Benchmark.measure {s3, s4 = [a1, a2].map{|a| Set.new(a)} ; (s3 + s4)}
Gives (ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0]):
Merge with pipe:
1.320000 0.040000 1.360000 ( 1.349563)
Merge with concat and uniq:
1.480000 0.030000 1.510000 ( 1.512295)
Concat only:
0.010000 0.000000 0.010000 ( 0.019812)
Uniq only:
1.460000 0.020000 1.480000 ( 1.486857)
Using sets
0.310000 0.010000 0.320000 ( 0.321982)
Starting with arrays, but using sets
2.340000 0.050000 2.390000 ( 2.384066)
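This suggests that sets may or may not be faster, depending on your circumstances (lots of merges or not many merges): merging two existing sets is quick, but converting arrays to sets first costs more than the array union itself.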
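For the "lots of merges" case, one possible pattern is to keep a single long-lived Set and merge each incoming array into it, so the cost of building the set is not paid again on every merge. This is only a sketch: the `batches` data is hypothetical, and no timing is claimed for it.

```ruby
require "set"

# Hypothetical batches of ids arriving over time.
batches = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]

seen = Set.new
batches.each do |batch|
  # Set#merge mutates `seen`, adding only elements it does not hold yet.
  seen.merge(batch)
end

result = seen.to_a
puts result.inspect
```

Converting back with `to_a` is needed only once, at the point where an array is really required.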