How to remove duplicate entries from an array?

P K*_*P K 18 mongodb

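In the example below, "Algorithms in C++" is added twice.

The `$unset` modifier removes a particular field, but how do I remove an entry from within a field?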

> db.users.find()

{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), 
 "favorites" : { "books" : [ "Algorithms in C++",    
                            "The Art of Computer Programmning", 
                            "Graph Theory",      
                            "Algorithms in C++" ] }, 
  "name" : "robert" }

kyn*_*nan 32

As of MongoDB 2.2 you can use the aggregation framework with an `$unwind`, `$group` and `$project` stage to achieve this:

db.users.aggregate([{$unwind: '$favorites.books'},
                    {$group: {_id: '$_id',
                              books: {$addToSet: '$favorites.books'},
                              name: {$first: '$name'}}},
                    {$project: {'favorites.books': '$books', name: '$name'}}
                   ])

Note that the `$project` is needed to rename the result back to the `favorites` field, since `$group` aggregate fields cannot be nested.

  • In the `$group` stage, why do you use `name: {$first: '$name'}`? (2 upvotes)
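
On the question in the comment above: `$group` emits only `_id` plus the accumulator fields it is given, so without an accumulator such as `$first` the `name` field would be missing from the result. The plain-JavaScript sketch below imitates the three stages on the example document to make that visible. It does not talk to MongoDB, and the function name `dedupeBooks` is made up for illustration.

```javascript
// Plain-JavaScript imitation of $unwind -> $group -> $project for one document.
// `dedupeBooks` is a hypothetical helper, not part of any MongoDB API.
function dedupeBooks(doc) {
  // $unwind: one copy of the document per array element
  const unwound = (doc.favorites?.books ?? []).map(book => ({
    _id: doc._id,
    name: doc.name,
    favorites: { books: book },
  }));

  // By default $unwind emits nothing for an empty or missing array
  if (unwound.length === 0) return null;

  // $group by _id: only _id and the accumulators below survive this stage
  const grouped = { _id: doc._id, books: new Set(), name: undefined };
  unwound.forEach((row, i) => {
    grouped.books.add(row.favorites.books); // $addToSet
    if (i === 0) grouped.name = row.name;   // $first
  });

  // $project: move the flat `books` field back under `favorites`
  return {
    _id: grouped._id,
    favorites: { books: [...grouped.books] },
    name: grouped.name,
  };
}

const robert = {
  _id: '4f6cd3c47156522f4f45b26f',
  favorites: { books: ['Algorithms in C++', 'The Art of Computer Programmning',
                       'Graph Theory', 'Algorithms in C++'] },
  name: 'robert',
};
const result = dedupeBooks(robert);
console.log(result.favorites.books.length); // 3
console.log(result.name);                   // robert
```

One difference from the real pipeline: `$addToSet` makes no promise about element order, whereas the `Set` used here happens to keep first-seen order.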

Den*_*zov 5

The easiest solution is to use `$setUnion` (available since Mongo 2.6; the `$addFields` stage used below requires Mongo 3.4+):

db.users.aggregate([
    {'$addFields': {'favorites.books': {'$setUnion': ['$favorites.books', []]}}}
])
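
An aggregation only returns the de-duplicated documents; the stored data stays as it was. To write the result back, one option (an assumption that goes beyond this answer, and that needs MongoDB 4.2+, where updates accept an aggregation pipeline) is to pass the same `$setUnion` expression to `updateMany`. The helper names `dedupeArrayUpdate` and `arrayFieldFilter` below are hypothetical; the snippet only builds the arguments, and the `mongosh` call itself is shown in a comment.

```javascript
// Builds an update pipeline that de-duplicates one array field in place.
// `dedupeArrayUpdate` is a hypothetical helper, not a MongoDB API.
function dedupeArrayUpdate(fieldPath) {
  return [
    // $set is the update-pipeline alias of $addFields (MongoDB 4.2+)
    { $set: { [fieldPath]: { $setUnion: ['$' + fieldPath, []] } } },
  ];
}

// Restrict the update to documents where the field really is an array,
// so that the $setUnion operands are well-defined.
function arrayFieldFilter(fieldPath) {
  return { [fieldPath]: { $type: 'array' } };
}

const filter = arrayFieldFilter('favorites.books');
const update = dedupeArrayUpdate('favorites.books');
console.log(JSON.stringify(update));

// In mongosh, assuming the `users` collection from the question:
//   db.users.updateMany(filter, update)
```

Because `$setUnion` treats the array as a set, the order of the remaining elements is not guaranteed.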

Another (more verbose) version is based on @kynan's answer, but preserves all other fields without specifying them explicitly (Mongo 3.4+):

> db.users.aggregate([
    {'$unwind': {
        'path': '$favorites.books',
        // output the document even if its list of books is empty
        'preserveNullAndEmptyArrays': true
    }},
    {'$group': {
        '_id': '$_id',
        'books': {'$addToSet': '$favorites.books'},
        // arbitrary name that doesn't exist on any document
        '_other_fields': {'$first': '$$ROOT'},
    }},
    {
      // the field, in the resulting document, has the value from the last document merged for the field. (c) docs
      // so the new deduped array value will be used
      '$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
    },
    // this stage wouldn't be necessary if the field wasn't nested
    {'$addFields': {'favorites.books': '$books'}},
    {'$project': {'_other_fields': 0, 'books': 0}}
])

{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), "name" : "robert", "favorites" : 
{ "books" : [ "The Art of Computer Programmning", "Graph Theory", "Algorithms in C++" ] } }    


Bab*_*aba 3

All you have to do is use MapReduce to detect and count the duplicate tags, then use `$set` to replace the entire books array for { "_id" : ObjectId("4f6cd3c47156522f4f45b26f"),

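This has already been discussed here several times, please see:

Removing duplicate records using MapReduce

Fast way to find duplicates on an indexed column in mongodb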

http://csanz.posterous.com/look-for-duplicates-using-mongodb-mapreduce

http://www.mongodb.org/display/DOCS/MapReduce

How to remove duplicate records in MongoDB by MapReduce?
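
The idea in this answer (find the duplicates, then overwrite the whole array with `$set`) does not strictly need MapReduce. As a swapped-in alternative, here is a client-side sketch that computes the de-duplicated array in JavaScript while keeping the original order. `uniqueInOrder` and `buildSetUpdate` are made-up names, and the `mongosh` loop in the trailing comment assumes the `users` collection from the question.

```javascript
// Returns the array without repeated values, keeping the first occurrence of each.
// Works for primitive values such as strings; embedded documents would need
// a different comparison, because Set compares objects by identity.
function uniqueInOrder(values) {
  return [...new Set(values)];
}

// Builds the update document that replaces the stored array.
function buildSetUpdate(books) {
  return { $set: { 'favorites.books': uniqueInOrder(books) } };
}

const books = ['Algorithms in C++', 'The Art of Computer Programmning',
               'Graph Theory', 'Algorithms in C++'];
const setUpdate = buildSetUpdate(books);
console.log(setUpdate.$set['favorites.books']);

// In mongosh this could be applied per document, for example:
//   db.users.find({ 'favorites.books': { $type: 'array' } }).forEach(doc =>
//     db.users.updateOne({ _id: doc._id }, buildSetUpdate(doc.favorites.books)));
```

The read-then-write loop is not atomic, so a book added between the `find` and the `updateOne` could be overwritten; for a one-off cleanup that may be acceptable.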

  • Don't just post links, one of them is now broken :( (10 upvotes)