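I have read several articles and examples, and I have yet to find an efficient way to run this SQL query in MongoDB (where there are millions of documents).
First attempt
(e.g. from this almost-duplicate question: Mongo equivalent of SQL's SELECT DISTINCT?)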
db.myCollection.distinct("myIndexedNonUniqueField").length
Obviously I got this error, because my data set is huge:
Thu Aug 02 12:55:24 uncaught exception: distinct failed: {
"errmsg" : "exception: distinct too big, 16mb cap",
"code" : 10044,
"ok" : 0
}
Second attempt
I decided to try a group:
db.myCollection.group({
    key: {myIndexedNonUniqueField: 1},
    initial: {count: 0},
    reduce: function (obj, prev) { prev.count++; }
});
But I got this error message:
exception: group() can't handle more than 20000 unique keys
Third attempt
I have not tried it yet, but several suggestions involve mapReduce,
for example
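One option that sidesteps both limits hit above is to let the aggregation framework count on the server: group by the field, then count the groups. The following is a minimal sketch, not a tested answer for this data set. It assumes a server recent enough to have aggregate() (and MongoDB 2.6 or later for the allowDiskUse option); buildDistinctCountPipeline is a hypothetical helper name, not a MongoDB API, and the collection and field names are the ones used in the question.

```javascript
// Hypothetical helper (not a MongoDB API): builds a pipeline that counts
// the distinct values of `field` without returning the values themselves.
function buildDistinctCountPipeline(field) {
  return [
    // Stage 1: one output document per distinct value of the field.
    { $group: { _id: "$" + field } },
    // Stage 2: collapse those documents into a single count.
    { $group: { _id: null, count: { $sum: 1 } } }
  ];
}

const distinctCountPipeline = buildDistinctCountPipeline("myIndexedNonUniqueField");
console.log(JSON.stringify(distinctCountPipeline));

// In the mongo shell the pipeline would be passed to aggregate(), e.g.:
//   db.myCollection.aggregate(distinctCountPipeline, { allowDiskUse: true })
```

Because the pipeline returns one small document holding the count rather than the full list of values, the output itself should stay far below the 16 MB cap that distinct ran into.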
How do I get an array containing all the values of a certain field across all documents in a collection?
db.collection:
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "x" : 4 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), "x" : 5 }
"db.collection.ListAllValuesForfield(x)" result: [1,2,3,4,5]
Also, what if this field is an array?
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "y" : [1,2] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "y" : [3,4] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "y" : [5,6] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be9"), "y" : [1,2] }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990bea"), …
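In the mongo shell, db.collection.distinct("x") returns the distinct values of a field as an array, and for an array field it treats each element as a separate value. The sketch below only emulates that behaviour in plain JavaScript so it can run without a server. distinctValues is a hypothetical helper, the _id fields are left out, and only the four fully shown documents of the second listing are used.

```javascript
// Hypothetical in-memory emulation of what distinct() computes;
// it is not the MongoDB implementation and only handles primitive values.
function distinctValues(docs, field) {
  const seen = new Set();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue; // skip documents missing the field
    // Array fields contribute each element as its own value.
    const items = Array.isArray(value) ? value : [value];
    for (const item of items) seen.add(item);
  }
  return Array.from(seen); // values in first-seen order
}

const scalarDocs = [{ x: 1 }, { x: 2 }, { x: 3 }, { x: 4 }, { x: 5 }];
const arrayDocs = [{ y: [1, 2] }, { y: [3, 4] }, { y: [5, 6] }, { y: [1, 2] }];

console.log(distinctValues(scalarDocs, "x")); // [ 1, 2, 3, 4, 5 ]
console.log(distinctValues(arrayDocs, "y"));  // [ 1, 2, 3, 4, 5, 6 ]
```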
My actors collection contains an array-of-documents field called acted_in. I don't want to return the size of acted_in.idmovies like this: {$size: $acted_in.idmovies}; I want to return the number of distinct values inside $acted_in.idmovies. How can I do this?
c1 = actors.aggregate([{"$match": {'$and': [{'fname': f_name},
                                            {'lname': l_name}]}},
                       {"$project": {'first_name': '$fname',
                                     'last_name': '$lname',
                                     'gender': '$gender',
                                     'distinct_movies_played_in': {'$size': '$acted_in.idmovies'}}}])
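A common idiom for counting distinct array values inside $project is to wrap the array in $setUnion, which returns a set without duplicates, and then take $size of the result. This is a sketch under assumptions: the server supports $setUnion (MongoDB 2.6 or later), every matched actor has an acted_in array, buildDistinctMoviesPipeline is a hypothetical name, and the actor name passed in is a made-up placeholder. It is written in JavaScript rather than the Python of the question; the pipeline stages are the same in either driver.

```javascript
// Hypothetical helper: same pipeline shape as in the question, with the
// $size argument wrapped in $setUnion so duplicate ids are counted once.
function buildDistinctMoviesPipeline(fName, lName) {
  return [
    // Listing both fields in one $match is an implicit AND.
    { $match: { fname: fName, lname: lName } },
    {
      $project: {
        first_name: "$fname",
        last_name: "$lname",
        gender: "$gender",
        // $setUnion with an empty array removes duplicates; $size counts the rest.
        distinct_movies_played_in: {
          $size: { $setUnion: ["$acted_in.idmovies", []] }
        }
      }
    }
  ];
}

// "Jane" / "Doe" are placeholder values, not data from the question.
const moviesPipeline = buildDistinctMoviesPipeline("Jane", "Doe");
console.log(JSON.stringify(moviesPipeline, null, 2));
```

If some documents lack acted_in, the $size expression would fail on them, so a guard such as $ifNull around the field path may be needed.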
{
{
"date": "2017-09-04",
"description": "DD from my employer1",
"amount": 1000.33
},
{
"date": "2017-09-06",
"description": "DD from my employer1",
"amount": 1000.34
},
{
"date": "2017-09-06",
"description": "DD from my employer1",
"amount": 1000.35
},
{
"date": "2017-09-07",
"description": "DD from employer1",
"amount": 5000.00
},
{
"date": "2017-09-08",
"description": "DD from my employer1",
"amount": 2000.33
},
{
"date": "2017-09-09",
"description": "DD from my employer1",
"amount": 2000.33
},
{
"date": "2017-09-10",
"description": "DD from my employer1",
"amount": 2000.33
}
}
I have the set of objects above, and I am trying to count the number of unique dates. There are 7 …
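One way to count the unique dates is to collect them into a Set and read its size. This is a minimal sketch that assumes the objects are held in a JavaScript array (the outer braces in the listing are treated as a list, which is an assumption) and that the variable name transactions is free to choose.

```javascript
// The seven objects from the listing, stored as an array (assumption).
const transactions = [
  { date: "2017-09-04", description: "DD from my employer1", amount: 1000.33 },
  { date: "2017-09-06", description: "DD from my employer1", amount: 1000.34 },
  { date: "2017-09-06", description: "DD from my employer1", amount: 1000.35 },
  { date: "2017-09-07", description: "DD from employer1", amount: 5000.00 },
  { date: "2017-09-08", description: "DD from my employer1", amount: 2000.33 },
  { date: "2017-09-09", description: "DD from my employer1", amount: 2000.33 },
  { date: "2017-09-10", description: "DD from my employer1", amount: 2000.33 }
];

// A Set stores each date string once, so its size is the unique-date count.
const uniqueDates = new Set(transactions.map((t) => t.date));

console.log(transactions.length); // 7 objects in total
console.log(uniqueDates.size);    // 6 unique dates ("2017-09-06" appears twice)
```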