$lookup 双重嵌套外部字段

mew*_*ewc 1 mongodb mongodb-query aggregation-framework

我有 2 个集合:

用户

{
    id: 1,
    name: "Michael", 
    starred: [1, 2]
}
Run Code Online (Sandbox Code Playgroud)

学校

{
    id: 1,
    name: "Uni", 
    faculties: [{
        id:1000, 
        name: "faculty1", 
        subjects: [
            {id: 1, name: "sub1"},
            {id: 2, name: "sub2"},
            {id: 3, name: "sub3"}
        ]
    }]
}
Run Code Online (Sandbox Code Playgroud)

现在,在我的用户集合中,我想查找并收集每个主题对象,其 ID 在已加星标中找到。IE。starred: [1,2]包含id我想要的主题。

所以最终结果应该返回

[{id: 1, name: sub1},{id: 2, name: sub2}]
Run Code Online (Sandbox Code Playgroud)

我目前正在使用 whis 聚合管道

{$match: {name: 'Michael'}},
{$unwind: "$faculties"},
{$unwind: "$faculties.subjects"},
{$lookup:
  {
     from: 'schools',
     localField: 'starred',
     foreignField: 'faculties.subjects.id',
     as: 'starredSubjects'
   }
},
{$project: {starredSubjects: 1}}
Run Code Online (Sandbox Code Playgroud)

但放松不起作用(我猜是因为我试图放松外国收藏,而不是本地收藏(即用户)。也foreignField: 'faculties.subjects.id没有返回任何东西。我错过了什么?

(旁注:MongoExplorer webstorm 插件上的测试很棒)。

Nei*_*unn 5

这确实不是一个很好的结构,并且有很好的理由。因此,在$lookup此处执行操作并不是一项简单的任务,因为“嵌套数组”有多种含义

你基本上想要

db.users.aggregate([
   { "$match": { "name": "Michael" } },
   { "$lookup": {
     "from": "schools",
     "localField": "starred",
     "foreignField": "faculties.subjects.id",
     "as": "subjects"
   }},
   { "$addFields": {
     "subjects": {
       "$filter": {
         "input": {
           "$reduce": {
             "input": {
               "$reduce": {
                 "input": "$subjects.faculties.subjects",
                 "initialValue": [],
                 "in": { "$concatArrays": [ "$$value", "$$this" ] }
               }
             },
             "initialValue": [],
             "in": { "$concatArrays": [ "$$value", "$$this" ] }
           }
         },
         "cond": { "$in": ["$$this.id", "$starred"] }
       }
     }
   }}
])
Run Code Online (Sandbox Code Playgroud)

或者使用 MongoDB 3.6 或更高版本:

db.users.aggregate([
  { "$match": { "name": "Michael" } },
  { "$lookup": {
    "from": "schools",
    "let": { "starred": "$starred" },
    "pipeline": [
      { "$match": {
        "$expr": {
          "$setIsSubset": [ 
            "$$starred",
            { "$reduce": {
              "input": "$faculties.subjects.id",
              "initialValue": [],
              "in": { "$concatArrays": [ "$$value", "$$this" ] }
            }}
          ]
        }
      }},
      { "$project": {
        "_id": 0,
        "subjects": {
          "$filter": {
            "input": {
              "$reduce": {
                "input":  "$faculties.subjects",
                "initialValue": [],
                "in": { "$concatArrays": [ "$$value", "$$this" ] }
              }
            },
            "cond": { "$in": [ "$$this.id", "$$starred" ] }
          }
        }
      }},
      { "$unwind": "$subjects" },
      { "$replaceRoot": { "newRoot": "$subjects" } }
    ],
    "as": "subjects"
  }}
])
Run Code Online (Sandbox Code Playgroud)

这两种方法本质上都依赖于$reduce 并且$concatArrays为了将“嵌套数组”内容“扁平化”为可用于比较的形式。两者之间的主要区别在于,在 MongoDB 3.6 之前,您实际上是从文档中提取所有“可能”匹配项,然后才能对内部数组条目进行任何“过滤”以仅匹配匹配项。

如果没有至少 MongoDB 3.4$reduce$in运算符,那么您实际上是在求助于$unwind

db.users.aggregate([
   { "$match": { "name": "Michael" } },
   { "$lookup": {
     "from": "schools",
     "localField": "starred",
     "foreignField": "faculties.subjects.id",
     "as": "subjects"
   }},
   { "$unwind": "$subjects" },
   { "$unwind": "$subjects.faculties" },
   { "$unwind": "$subjects.faculties.subjects" },
   { "$redact": {
      "$cond": {
        "if": {
          "$setIsSubset": [
            ["$subjects.faculties.subjects.id"],
            "$starred"
          ]
        },
        "then": "$$KEEP",
        "else": "$$PRUNE"
      }
   }},
   { "$group": {
      "_id": "$_id",
      "id": { "$first": "$id" },
      "name": { "$first": "$name" },
      "starred": { "$first": "$starred" },
      "subjects": { "$push": "$subjects.faculties.subjects" }
   }}
])
Run Code Online (Sandbox Code Playgroud)

当然使用$redactstage 来过滤逻辑比较,因为只有$exprMongoDB 3.6 并$setIsSubset"starred".

然后当然由于所有$unwind操作,您通常希望$group以重组阵列。

或者$lookup从另一个方向做:

db.schools.aggregate([
  { "$unwind": "$faculties" },
  { "$unwind": "$faculties.subjects" },
  { "$lookup": {
    "from": "users",
    "localField": "faculties.subjects.id",
    "foreignField": "starred",
    "as": "users"
  }},
  { "$unwind": "$users" },
  { "$match": { "users.name": "Michael" } },
  { "$group": {
    "_id": "$users._id",
    "id": { "$first": "$users.id" },
    "name": { "$first": "$users.name" },
    "starred": { "$first": "$users.starred" },
    "subjects": {
      "$push": "$faculties.subjects"
    }    
  }}
])
Run Code Online (Sandbox Code Playgroud)

最后一种形式并不理想,因为在$lookup完成之后(或从技术上讲“在”期间$lookup)您不会过滤“用户” 。但无论如何,它首先需要处理整个“学校”集合。

所有表单都返回相同的输出:

{
    "_id" : ObjectId("5aea649526a94676bb981df4"),
    "id" : 1,
    "name" : "Michael",
    "starred" : [
            1,
            2
    ],
    "subjects" : [
        {
                "id" : 1,
                "name" : "sub1"
        },
        {
                "id" : 2,
                "name" : "sub2"
        }

    ]
}
Run Code Online (Sandbox Code Playgroud)

您只有"subjects"来自相关文档的内部数组的详细信息,这些详细信息实际上与"starred"当前用户的值匹配。


综上所述,用MongoDB“嵌套数组”并不是一个好主意。在 MongoDB 3.6 之前,您甚至无法对“嵌套数组”进行原子更新,即使进行了允许它的更改,执行任何查询操作(尤其是涉及连接和过滤的操作)充其量仍然是“困难的”

构造“嵌套数组”是一个常见的“新手”错误,因为您似乎认为自己正在更好地“组织”事物。但实际上,它更像是一种“反模式”,您确实应该考虑一种“更扁平”的结构,例如:

{
    "_id" : ObjectId("5aea651326a94676bb981df5"),
    "id" : 1,
    "name" : "Uni",
    "subjects" : [
        {
                "id" : 1,
                "name" : "sub1",
                "facultyId": 1000,
                "facultyName": "faculty1"
        },
        {
                "id" : 2,
                "name" : "sub2",
                "facultyId": 1000,
                "facultyName": "faculty1"

        },
        {
                "id" : 3,
                "name" : "sub3",
                "facultyId": 1000,
                "facultyName": "faculty1"

        }
    ]
}
Run Code Online (Sandbox Code Playgroud)

“更”容易使用,当然在需要的地方执行“连接”。