Combining unique items from arrays

bla*_*p07 3 mongodb mongodb-query aggregation-framework

I have a dataset that I am querying. The data looks like this:

db.activity.insert(
    {
        "_id" : ObjectId("5908e64e3b03ca372dc945d5"),
        "startDate" : ISODate("2017-05-06T00:00:00Z"),
        "details" : [
            {
                "code" : "2",
                "_id" : ObjectId("5908ebf96ae5003a4471c9b2"),
                "walkDistance" : "03",
                "jogDistance" : "01",
                "runDistance" : "08",
                "sprintDistance" : "01"
            }
        ]
    }
)

db.activity.insert(
    {
        "_id" : ObjectId("58f79163bebac50d5b2ae760"),
        "startDate" : ISODate("2017-05-07T00:00:00Z"),
        "details" : [
            {
                "code" : "2",
                "_id" : ObjectId("58f7948fbebac50d5b2ae7f2"),
                "walkDistance" : "01",
                "jogDistance" : "02",
                "runDistance" : "09",
                "sprintDistance" : ""
            }
        ]
    }
)

The output I want looks like this:

[
  {
    "_id": null,
    "uniqueValues": [
      "03",
      "01",
      "08",
      "02",
      "09"
    ]
  }
]

To get there, I wrote the following code:

db.activity.aggregate([
    {
        $facet: {
            "walk": [
                {$unwind: '$details'},
                {$group: {_id: null, uniqueValues: {$addToSet: "$details.walkDistance"}}}
            ], "jog": [
                {$unwind: '$details'},
                {$group: {_id: null, uniqueValues: {$addToSet: "$details.jogDistance"}}}
            ], "run": [
                {$unwind: '$details'},
                {$group: {_id: null, uniqueValues: {$addToSet: "$details.runDistance"}}}
            ], "sprint": [
                {$unwind: '$details'},
                {$group: {_id: null, uniqueValues: {$addToSet: "$details.sprintDistance"}}}
            ]
        }
    }])

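However, I still get 4 separate facets, each with its own _id: null and uniqueValues array. How can I change the query so that all the values end up in a single array and "" is excluded as well?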

Nei*_*unn 5

$facet really isn't the best thing to use here. You should instead just apply $concatArrays and filter the result with $setDifference and $filter:

db.activity.aggregate([
  { "$project": {
    "_id": 0,
    "unique": {
      "$filter": {
        "input": {
          "$setDifference": [
            { "$concatArrays": [ 
              "$details.walkDistance",
              "$details.jogDistance",
              "$details.runDistance",
              "$details.sprintDistance"
            ]},
            []
          ]
        },
        "cond": { "$ne": [ "$$this", "" ] }
      }
    }
  }},
  { "$unwind": "$unique" },
  { "$group": {
    "_id": null,
    "uniqueArray": { "$addToSet": "$unique" }  
  }}
])

Which returns:

/* 1 */
{
    "_id" : null,
    "uniqueArray" : [ 
        "09", 
        "03", 
        "01", 
        "02", 
        "08"
    ]
}

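So after putting all the array values into a single array with $concatArrays, you apply $setDifference (against an empty array) to reduce the list to its "unique" values. The $filter then removes the "" values you don't want.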

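Then it's just a matter of applying $unwind to that single, reduced list and putting it back together with $group, using $addToSet so that only unique values are kept across documents.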
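To make those steps concrete, here is a minimal plain-JavaScript sketch of what the pipeline does to the two sample documents. It does not talk to MongoDB; the names docs, fields, perDoc and uniqueArray are illustrative only, and the comments map each line to the stage it roughly stands in for.

```javascript
// The two sample documents, reduced to the fields that matter here.
const docs = [
  { details: [{ walkDistance: "03", jogDistance: "01", runDistance: "08", sprintDistance: "01" }] },
  { details: [{ walkDistance: "01", jogDistance: "02", runDistance: "09", sprintDistance: "" }] }
];
const fields = ["walkDistance", "jogDistance", "runDistance", "sprintDistance"];

// Per document (the $project stage):
const perDoc = docs.map(doc => {
  // roughly $concatArrays: one flat list of all four fields
  const all = fields.flatMap(f => doc.details.map(d => d[f]));
  // roughly $setDifference against []: drop duplicates
  const unique = [...new Set(all)];
  // roughly $filter: drop the empty strings
  return unique.filter(v => v !== "");
});

// Across documents (roughly $unwind followed by $group with $addToSet):
const uniqueArray = [...new Set(perDoc.flat())];

console.log(uniqueArray);
```

Note that MongoDB makes no promise about the order of elements produced by $addToSet, so only the membership of the result should be compared, not its order.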

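You could alternatively just use $concatArrays and then $unwind and $match, but the other operators don't really cost much, and they take some of the load off by reducing the list before you get to $unwind. So it's generally better to do it that way.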
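For comparison, the $concatArrays, $unwind and $match alternative mentioned above might look like the sketch below. The pipeline is held in a variable (the name pipeline is just for illustration) so it can be inspected on its own; it assumes the same activity collection as the question.

```javascript
// Alternative sketch: no per-document de-duplication or filtering,
// so every value (duplicates and "" included) goes through $unwind.
const pipeline = [
  { "$project": {
    "_id": 0,
    "unique": {
      "$concatArrays": [
        "$details.walkDistance",
        "$details.jogDistance",
        "$details.runDistance",
        "$details.sprintDistance"
      ]
    }
  }},
  { "$unwind": "$unique" },
  // Drop the empty strings after unwinding instead of with $filter
  { "$match": { "unique": { "$ne": "" } } },
  { "$group": {
    "_id": null,
    "uniqueArray": { "$addToSet": "$unique" }
  }}
];

// In the mongo shell this would be run as:
// db.activity.aggregate(pipeline)
```

The result should contain the same set of values, but more documents flow through $unwind and $match than in the version that reduces each list first.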

In fact, this can be reduced even further to just $setUnion and $setDifference, since we are after all talking about "sets":

db.activity.aggregate([
  { "$project": {
    "_id": 0,
    "unique": {
      "$setDifference": [
        { "$setUnion": [ 
          "$details.walkDistance",
          "$details.jogDistance",
          "$details.runDistance",
          "$details.sprintDistance"
        ]},
        [""]
      ]
    }
  }},
  { "$unwind": "$unique" },
  { "$group": {
    "_id": null,
    "uniqueArray": { "$addToSet": "$unique" }  
  }}
])

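This means the whole statement becomes compatible with MongoDB 2.6, or at least it would be if every reference of the form $details.walkDistance were written out in the longer form using $map: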

  "$setDifference": [
    { "$setUnion": [ 
      { "$map": { "input": "$details", "as": "d", "in": "$$d.walkDistance" } },
      { "$map": { "input": "$details", "as": "d", "in": "$$d.jogDistance" } },
      { "$map": { "input": "$details", "as": "d", "in": "$$d.runDistance" } },
      { "$map": { "input": "$details", "as": "d", "in": "$$d.sprintDistance" } }
    ]},
    [""]
  ]

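Running $facet, on the other hand, incurs a "brute force" pass over the whole collection for each property in the array, with $unwind processed on every pass. That makes it a very inefficient way to get the result, so don't do that.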