Get the first N and last N records from a MongoDB collection

Nan*_*ndi 1 mongodb aggregation-framework

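I have a use case where I need to show the top 10 and the last 10 results of a group-and-sort aggregation. I tried using $limit, but then the next pipeline stage no longer gets to process the complete data.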

db.collection.aggregate([groupAggregator, sortAggregator, { $limit: 10 } /* only 10 records available from here on */])

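How can I run an aggregation over the whole collection in the middle of the pipeline? I am using MongoDB 3.2.9. If that is not possible, is there a way to union two aggregations, the first being the top 10 (ASC SORTED) and the second the last 10 (DESC SORTED)?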

If it weren't for the group aggregation, I would use the db.collection.find({}).sort().filter() strategy, but the grouping needs to be done.

Data after the group aggregation

{_id: "", ..., avg_count: 10}
{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 5}
{_id: "", ..., avg_count: 8}
{_id: "", ..., avg_count: 3}
{_id: "", ..., avg_count: 4}
{_id: "", ..., avg_count: 6}
{_id: "", ..., avg_count: 7}
{_id: "", ..., avg_count: 9}

Data after the sort aggregation

{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 3}
{_id: "", ..., avg_count: 4}
{_id: "", ..., avg_count: 5}
{_id: "", ..., avg_count: 6}
{_id: "", ..., avg_count: 7}
{_id: "", ..., avg_count: 8}
{_id: "", ..., avg_count: 9}
{_id: "", ..., avg_count: 10}

Expected output:

Fetch the first 2 and the last 2 documents

{_id: "", ..., avg_count: 1}
{_id: "", ..., avg_count: 2}
{_id: "", ..., avg_count: 9}
{_id: "", ..., avg_count: 10}

Note: the above is just sample data; the real data does not have exact sequential numbers.

dni*_*ess 5

If I understand you correctly, here is one way to get that behavior:

db.collection.aggregate([{
    $sort: { "your_sort_field": 1 } // sort the data
}, {
    $group: {
        _id: null, // group everything into one single bucket
        docs: { $push: "$$ROOT" } // push all documents into an array (this will be massive for huge collections...)
    }
}, {
    $project: {
        "docsTop10": { $slice: [ "$docs", 10 ] }, // take the first 10 elements from the ASC sorted array
        "docsBottom10": { $reverseArray: { $slice: [ "$docs", -10 ] } } // take the last 10 elements from the array but reverse their order
    }
}])
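
To make the two $slice calls easier to picture, here is a small plain-JavaScript sketch that runs without MongoDB. It only mimics what the $project stage computes on an already sorted array. The helper name topAndBottom and the sampleDocs array are made up for illustration (the values mirror the sample data in the question, with n = 2 instead of 10), and it assumes n is at least 1.

```javascript
// Sorted sample data, mirroring the avg_count values from the question.
const sampleDocs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].map((v) => ({ avg_count: v }));

// Hypothetical helper: first n and last n elements of an already sorted array.
// Assumes n >= 1 (slice(-0) would return the whole array).
function topAndBottom(sortedDocs, n) {
  // Equivalent of { $slice: ["$docs", n] }
  const docsTop = sortedDocs.slice(0, n);
  // Equivalent of { $reverseArray: { $slice: ["$docs", -n] } }
  const docsBottom = sortedDocs.slice(-n).reverse();
  return { docsTop, docsBottom };
}

const picked = topAndBottom(sampleDocs, 2);
console.log(picked.docsTop.map((d) => d.avg_count));    // [ 1, 2 ]
console.log(picked.docsBottom.map((d) => d.avg_count)); // [ 10, 9 ]
```

Note that if the array holds fewer than 2 * n elements, the two slices overlap and some documents show up in both.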

If you want everything in one single property, you can simply use $concatArrays in the final stage:

$project: {
    "result": { $concatArrays: [ { $slice: [ "$docs", 10 ] }, { $reverseArray: { $slice: [ "$docs", -10 ] } } ] }
}

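Unfortunately, $replaceRoot is not yet available in your MongoDB version (it was added in v3.4), otherwise you could flatten the result more nicely.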

Also, since $reverseArray does not seem to be available in v3.2 either, you can simply drop that operator and add $unwind and $sort stages after the $project stage instead:

{
    $project: {
        _id: 0,
        "result": { $concatArrays: [ { $slice: [ "$docs", 10 ] }, { $slice: [ "$docs", -10 ] } ] }
    }
}, {
    $unwind: "$result"
}, {
    $sort: { "result.your_sort_field": 1 } // sort the data
}
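
For reference, here is how the whole v3.2-compatible pipeline could be put together, grouping stage included. This is only a sketch: topBottomPipeline is a hypothetical helper, and the $group stage in the example (average of a "count" field per "category") is invented, so substitute your own group stage and sort field.

```javascript
// Hypothetical helper that assembles the stages shown above for any n.
// groupStage: your own $group stage; sortField: the field to rank by.
function topBottomPipeline(groupStage, sortField, n) {
  return [
    groupStage,                                           // your grouping
    { $sort: { [sortField]: 1 } },                        // sort ASC
    { $group: { _id: null, docs: { $push: "$$ROOT" } } }, // one array holding everything
    {
      $project: {
        _id: 0,
        result: {
          $concatArrays: [
            { $slice: ["$docs", n] },  // first n
            { $slice: ["$docs", -n] }, // last n
          ],
        },
      },
    },
    { $unwind: "$result" },                    // one document per array element
    { $sort: { ["result." + sortField]: 1 } }, // restore ASC order
  ];
}

// Example with made-up field names.
const pipeline = topBottomPipeline(
  { $group: { _id: "$category", avg_count: { $avg: "$count" } } },
  "avg_count",
  10
);
// In the mongo shell: db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Keep in mind that the intermediate document produced by $push is subject to the 16 MB BSON document size limit, and that with fewer than 2 * n groups the same document appears twice in the output.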

Another option is to use $facet (only available from v3.4 onwards), which should be faster since MongoDB can optimize the sort/limit combination well:

db.collection.aggregate([{
    $facet: { // start two separate pipelines
        "docsTop10": [
            { $sort: { "your_sort_field": 1 } }, // sort ASC
            { $limit: 10 } // take top 10
        ],
        "docsBottom10": [
            { $sort: { "your_sort_field": -1 } }, // sort DESC
            { $limit: 10 } // take bottom 10 (the first 10 in DESC order)
        ]
    }
}])
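
Since the question needs the grouping done before ranking, the group stage would presumably go in front of $facet so that both sub-pipelines see the full grouped result. A rough sketch follows; facetTopBottomPipeline is a hypothetical helper and the example $group stage uses invented field names.

```javascript
// Hypothetical helper: group first, then rank the grouped result in two facets.
// Requires MongoDB 3.4+ because of $facet.
function facetTopBottomPipeline(groupStage, sortField, n) {
  return [
    groupStage, // runs once; its output feeds both sub-pipelines
    {
      $facet: {
        docsTop: [{ $sort: { [sortField]: 1 } }, { $limit: n }],     // n smallest
        docsBottom: [{ $sort: { [sortField]: -1 } }, { $limit: n }], // n largest
      },
    },
  ];
}

// Example with made-up field names.
const facetPipeline = facetTopBottomPipeline(
  { $group: { _id: "$category", avg_count: { $avg: "$count" } } },
  "avg_count",
  10
);
// In the mongo shell: db.collection.aggregate(facetPipeline)
console.log(JSON.stringify(facetPipeline, null, 2));
```

The output is a single document with the two arrays docsTop and docsBottom, each holding at most n documents.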