Onc*_*nca 2 python mongodb pymongo mongodb-query aggregation-framework
Using PyMongo, grouping by a single key seems to work:
results = collection.group(key={"scan_status":0}, condition={'date': {'$gte': startdate}}, initial={"count": 0}, reduce=reducer)
Result:
{u'count': 215339.0, u'scan_status': u'PENDING'} {u'count': 617263.0, u'scan_status': u'DONE'}
But when I try to group by multiple keys, I get an exception:
results = collection.group(key={"scan_status":0,"date":0}, condition={'date': {'$gte': startdate}}, initial={"count": 0}, reduce=reducer)
How do I correctly group by multiple fields?
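If you are counting across two keys, .aggregate() is a better choice than .group().
It uses "native code operators" rather than the JavaScript-interpreted code that .group() needs, and it performs the same basic "grouping" operation you are trying to achieve.
In particular, look at the $group pipeline operator: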
result = collection.aggregate([
    # Match only the documents dated on or after startdate
    { "$match": { "date": { "$gte": startdate } } },
    # Group the documents and "count" via $sum on the values
    { "$group": {
        "_id": {
            "scan_status": "$scan_status",
            "date": "$date"
        },
        "count": { "$sum": 1 }
    }}
])
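To make the shape of the output concrete, here is a minimal sketch that emulates the same $match and $group stages in plain Python, without a MongoDB server. The sample documents and the helper name group_counts are hypothetical and only illustrate the compound "_id" key; this is not how the server implements the pipeline.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample documents standing in for the collection's contents.
docs = [
    {"scan_status": "DONE", "date": datetime(2014, 5, 1)},
    {"scan_status": "DONE", "date": datetime(2014, 5, 1)},
    {"scan_status": "PENDING", "date": datetime(2014, 5, 2)},
    {"scan_status": "DONE", "date": datetime(2014, 4, 1)},
]
startdate = datetime(2014, 5, 1)


def group_counts(docs, startdate):
    """Emulate the $match + $group stages in plain Python (illustration only)."""
    # $match: keep documents on or after startdate.
    # $group: count documents per (scan_status, date) pair.
    counts = Counter(
        (d["scan_status"], d["date"])
        for d in docs
        if d["date"] >= startdate
    )
    # Shape each entry like a document returned by the pipeline.
    return [
        {"_id": {"scan_status": status, "date": when}, "count": n}
        for (status, when), n in counts.items()
    ]


result = group_counts(docs, startdate)
for doc in result:
    print(doc)
```

Each distinct combination of scan_status and date becomes one output document, which is why grouping on the raw date usually produces far more groups than intended.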
In practice, you probably want some way of reducing "date" to a distinct period, such as a day. For example:
result = collection.aggregate([
    # Match only the documents dated on or after startdate
    { "$match": { "date": { "$gte": startdate } } },
    # Group the documents and "count" via $sum on the values
    { "$group": {
        "_id": {
            "scan_status": "$scan_status",
            "date": {
                "year": { "$year": "$date" },
                "month": { "$month": "$date" },
                "day": { "$dayOfMonth": "$date" }
            }
        },
        "count": { "$sum": 1 }
    }}
])
This uses the date aggregation operators ($year, $month and $dayOfMonth) to truncate each date to a day.
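With the date operators, each group key comes back as a nested year/month/day document rather than a date. A minimal sketch of turning that key back into a datetime.date on the client side might look like this; the helper name bucket_to_date and the sample document are hypothetical.

```python
from datetime import date


def bucket_to_date(group_id):
    """Turn the nested year/month/day key back into a datetime.date."""
    parts = group_id["date"]
    return date(parts["year"], parts["month"], parts["day"])


# Hypothetical document shaped like one result of the pipeline above.
sample = {
    "_id": {
        "scan_status": "DONE",
        "date": {"year": 2014, "month": 5, "day": 1},
    },
    "count": 2,
}

day = bucket_to_date(sample["_id"])
print(sample["_id"]["scan_status"], day.isoformat(), sample["count"])
```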
Alternatively, use basic "date math":
from datetime import datetime

# The epoch as a datetime.datetime; PyMongo cannot encode datetime.date values
epoch = datetime(1970, 1, 1)

result = collection.aggregate([
    # Match only the documents dated on or after startdate
    { "$match": { "date": { "$gte": startdate } } },
    # Group the documents and "count" via $sum on the values
    # Subtracting the epoch from a date yields milliseconds as an integer;
    # removing the remainder of one day rounds that down to the start of the day
    { "$group": {
        "_id": {
            "scan_status": "$scan_status",
            "date": {
                "$subtract": [
                    { "$subtract": [ "$date", epoch ] },
                    { "$mod": [
                        { "$subtract": [ "$date", epoch ] },
                        1000 * 60 * 60 * 24
                    ]}
                ]
            }
        },
        "count": { "$sum": 1 }
    }}
])
This returns an integer number of milliseconds since the "epoch" instead of a composite value object.
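The same arithmetic can be checked on the client, and the integer key can be converted back to a datetime when reading results. The following is a minimal sketch assuming dates are handled as UTC; the helper names floor_to_day_ms and ms_to_datetime are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MS_PER_DAY = 1000 * 60 * 60 * 24
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_to_day_ms(dt):
    """Milliseconds since the epoch, rounded down to midnight UTC.

    Mirrors the $subtract / $mod expression: ms - (ms % one_day).
    """
    ms = (dt - EPOCH) // timedelta(milliseconds=1)
    return ms - ms % MS_PER_DAY


def ms_to_datetime(ms):
    """Convert an integer group key back into an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


stamp = floor_to_day_ms(datetime(2014, 5, 1, 13, 45, tzinfo=timezone.utc))
day_start = ms_to_datetime(stamp)
print(stamp, day_start.isoformat())
```

An integer key like this is compact and sorts naturally, at the cost of needing a conversion step before it is shown to a person.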
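Either way, all of these options are better than .group(): they use natively coded routines and run much faster than the JavaScript code you would otherwise have to supply.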