Oli*_*oyd 26 mongodb aggregation-framework
我想使用mongoDB的聚合框架来运行SQL中看起来有点像:
SELECT SUM(A), B, C from myTable GROUP BY B, C;
Run Code Online (Sandbox Code Playgroud)
文档说明:
您可以从管道中的文档指定单个字段,先前计算的值,或者从多个传入字段组成的聚合键.
但目前还不清楚"几个传入领域的聚合密钥"实际上是什么?
我的数据集有点像这样:
[{ "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"},
{ "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"},
{ "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"},
{ "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"},
{ "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"},
{ "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}]
Run Code Online (Sandbox Code Playgroud)
我的查询如下所示:
resultsCollection.aggregate(
{ $match : { testid : testid} },
{ $skip : alreadyRead },
{ $project : {
timeStamp : 1 ,
label : 1,
responseCode : 1 ,
value : 1,
success : 1
}},
{ $group : {
_id : "$label",
max_timeStamp : { $timeStamp : 1 },
count_responseCode : { $sum : 1 },
avg_value : { $sum : "$value" },
count_success : { $sum : 1 }
}},
{ $group : {
?
}}
);
Run Code Online (Sandbox Code Playgroud)
我的直觉是尝试将结果传递给第二组,我知道你可以做到这一点,但它不会起作用,因为第一组已经过多地减少了数据集,并且所需的细节水平也丢失了.
我想要做的是分组使用label,responseCode并success从结果中获取值的总和.看起来应该有点像:
label | code | success | sum_of_values | count
sharon | 200 | true | 10 | 1
sharon | 200 | false | 35 | 1
paul | 200 | true | 100 | 2
paul | 404 | true | 15 | 1
paul | 404 | false | 99 | 1
Run Code Online (Sandbox Code Playgroud)
哪里有五组:
1. { "timeStamp" : 1341834988666, "label" : "sharon", "responseCode" : "200", "value" : 10, "success" : "true"}
2. { "timeStamp" : 1341834988696, "label" : "sharon", "responseCode" : "200", "value" : 35, "success" : "false"}
3. { "timeStamp" : 1341834988676, "label" : "paul", "responseCode" : "200", "value" : 60, "success" : "true"}
{ "timeStamp" : 1341834988166, "label" : "paul", "responseCode" : "200", "value" : 40, "success" : "true"}
4. { "timeStamp" : 1341834988686, "label" : "paul", "responseCode" : "404", "value" : 15, "success" : "true"}
5. { "timeStamp" : 1341834988266, "label" : "paul", "responseCode" : "404", "value" : 99, "success" : "false"}
Run Code Online (Sandbox Code Playgroud)
Oli*_*oyd 37
好的,所以解决方案是为_id值指定聚合键.这在此记录为:
您可以从管道中的文档指定单个字段,先前计算的值,或者从多个传入字段组成的聚合键.
但它实际上并没有定义聚合键的格式.阅读前面的文档,我看到以前的collection.group方法可以采用多个字段,并且在新框架中使用相同的结构.
因此,要对可以使用的多个字段进行分组 _id : { success:'$success', responseCode:'$responseCode', label:'$label'}
如:
resultsCollection.aggregate(
{ $match : { testid : testid} },
{ $skip : alreadyRead },
{ $project : {
timeStamp : 1 ,
label : 1,
responseCode : 1 ,
value : 1,
success : 1
}},
{ $group : {
_id : { success:'$success', responseCode:'$responseCode', label:'$label'},
max_timeStamp : { $timeStamp : 1 },
count_responseCode : { $sum : 1 },
avg_value : { $sum : "$value" },
count_success : { $sum : 1 }
}}
);
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
28026 次 |
| 最近记录: |