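I have a collection of MD5 hashes in MongoDB and want to find all the duplicates. The md5 field is indexed. Do you know of a fast way to do this with map-reduce? Or should I iterate over all the records and check for duplicates manually?
My current map-reduce approach iterates over the collection almost twice (assuming the number of duplicates is very small):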
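// Count how many documents share each md5 value.
res = db.files.mapReduce(
    function () {
        emit(this.md5, 1);
    },
    function (key, vals) {
        return Array.sum(vals);
    }
)
// Keep only md5 values that occur more than once (value > 1) and copy them
// into a duplicates collection.
db[res.result].find({value: {$gt: 1}}).forEach(
    function (obj) {
        out.duplicates.insert(obj)
    });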
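One possible alternative, offered as a sketch rather than a definitive answer: the aggregation framework can group by md5 and filter on the count in a single pipeline, so the second pass over an intermediate result collection is not needed. The names `duplicatePipeline` and `findDuplicateMd5s` below are hypothetical, introduced only for illustration. The in-memory function mirrors the same group-then-filter logic so it can be tried without a database.

```javascript
// Pipeline that could be passed to db.files.aggregate(...) in the mongo shell.
// Assumption: the collection is named "files" and the hash field is "md5",
// as in the question above.
const duplicatePipeline = [
  // Group documents by md5 and count the members of each group.
  { $group: { _id: "$md5", count: { $sum: 1 } } },
  // Keep only groups with more than one member, i.e. the duplicates.
  { $match: { count: { $gt: 1 } } }
];

// In-memory equivalent of the pipeline above, for illustration only.
// Takes an array of documents with an md5 field and returns
// [{ _id: <md5>, count: <n> }, ...] for every md5 seen more than once.
function findDuplicateMd5s(docs) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.md5, (counts.get(doc.md5) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([md5, count]) => ({ _id: md5, count }));
}
```

In the shell this would be run as `db.files.aggregate(duplicatePipeline)`. For large collections the group stage may need the `allowDiskUse: true` option.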
Rails 4.2.5, Mongoid 5.1.0
I have three models: Mailbox, Communication, and Message.
mailbox.rb
class Mailbox
  include Mongoid::Document
  belongs_to :user
  has_many :communications
end
communication.rb
class Communication
  include Mongoid::Document
  include Mongoid::Timestamps
  include AASM
  belongs_to :mailbox
  has_and_belongs_to_many :messages, autosave: true
  field :read_at, type: DateTime
  field :box, type: String
  field :touched_at, type: DateTime
  field :import_thread_id, type: Integer
  scope :inbox, -> { where(:box => 'inbox') }
end
message.rb
class Message
  include Mongoid::Document
  include Mongoid::Timestamps
  attr_accessor :communication_id
  has_and_belongs_to_many :communications, autosave: true
  belongs_to :from_user, class_name: 'User'
  belongs_to …
ruby-on-rails mongodb mongoid mongodb-query aggregation-framework