My database has about 30M records, and the collection is roughly 100GB in total (documents plus indexes).
I have a compound index that filters the data on user_id plus a few other fields (for example is_active, is_logged_in, and so on).
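For reference, judging from the keyPattern and indexName reported in the explain output further down, the index in question looks roughly like the sketch below; the exact createIndex call and its options are my assumption, not the actual definition:

    // Sketch of the compound index, reconstructed from the explain() keyPattern
    // shown later in this post; creation options are guessed, not confirmed.
    db.call_history.createIndex(
        {
            "user_id": 1,
            "trk.0.direction": 1,
            "is_read": 1,
            "trk.0.data.status": 1,
            "is_removed": 1
        },
        { "name": "user_id_direction_is_read_status_is_removed" }
    )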
In MongoDB Compass I can see that these queries are very slow, taking around 10, 20, or even 40 seconds! When I run the exact same query myself, I get the result back in under 500 ms (although that may just be because it is cached on the second attempt).
When I look at the currentOp data for one of these long-running operations, I see the following lock stats:

    "lockStats": {
        "Global": {
            "acquireCount": {
                "r": 574
            }
        },
        "MMAPV1Journal": {
            "acquireCount": {
                "r": 295
            },
            "acquireWaitCount": {
                "r": 2
            },
            "timeAcquiringMicros": {
                "r": 15494
            }
        },
    }
The acquireCount (the number of times the operation acquired a lock in the given mode) is very high compared to a fast query (on another collection), which shows stats like these:

    "lockStats": {
        "Global": {
            "acquireCount": {
                "r": 2
            }
        },
        "MMAPV1Journal": {
            "acquireCount": {
                "r": 1
            }
        },
        "Database": {
            "acquireCount": {
                "r": 1
            }
        },
        "Collection": {
            "acquireCount": {
                "R": 1
            }
        }
    }
When an operation is slow, and a user who has many records takes a long time, after a few seconds it creates a domino effect on all the other operations.
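For context, the lock stats above come from the server's current-operation report; a minimal shell query to list such long-running reads looks roughly like this (the 5-second threshold is just an arbitrary example of mine):

    // List active operations on this collection running longer than 5 seconds,
    // so their lockStats and numYields can be inspected.
    db.currentOp({
        "active": true,
        "ns": "cuda.call_history",
        "secs_running": { "$gt": 5 }
    })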
When I explain the query against the big collection, I can see that it does use an index. Here is the result:
    {
        "queryPlanner" : {
            "plannerVersion" : 1,
            "namespace" : "cuda.call_history",
            "indexFilterSet" : false,
            "parsedQuery" : {
                "$and" : [
                    { "$or" : [
                        { "trk.0.extra_data.spam.is_spam" : { "$eq" : false } },
                        { "$nor" : [ { "trk.0.extra_data.spam.is_spam" : { "$exists" : true } } ] }
                    ] },
                    { "is_removed" : { "$eq" : false } },
                    { "user_id" : { "$eq" : "00000000000040008000000000002a5d" } },
                    { "trk.0.direction" : { "$eq" : "ingress" } },
                    { "trk.0.type" : { "$eq" : "fax" } },
                    { "date" : { "$lt" : "2018-01-09 10:36:31" } },
                    { "date" : { "$gt" : "1970-01-01 00:00:00" } },
                    { "trk.0.data.status" : { "$in" : [ "p_received", "success" ] } }
                ]
            },
            "winningPlan" : {
                "stage" : "FETCH",
                "filter" : {
                    "$and" : [
                        { "$or" : [
                            { "trk.0.extra_data.spam.is_spam" : { "$eq" : false } },
                            { "$nor" : [ { "trk.0.extra_data.spam.is_spam" : { "$exists" : true } } ] }
                        ] },
                        { "trk.0.type" : { "$eq" : "fax" } },
                        { "date" : { "$lt" : "2018-01-09 10:36:31" } },
                        { "date" : { "$gt" : "1970-01-01 00:00:00" } }
                    ]
                },
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "keyPattern" : {
                        "user_id" : 1,
                        "trk.0.direction" : 1,
                        "is_read" : 1,
                        "trk.0.data.status" : 1,
                        "is_removed" : 1
                    },
                    "indexName" : "user_id_direction_is_read_status_is_removed",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "forward",
                    "indexBounds" : {
                        "user_id" : [ "[\"00000000000040008000000000002a5d\", \"00000000000040008000000000002a5d\"]" ],
                        "trk.0.direction" : [ "[\"ingress\", \"ingress\"]" ],
                        "is_read" : [ "[MinKey, MaxKey]" ],
                        "trk.0.data.status" : [ "[\"p_received\", \"p_received\"]", "[\"success\", \"success\"]" ],
                        "is_removed" : [ "[false, false]" ]
                    }
                }
            },
            "rejectedPlans" : [
                {
                    "stage" : "FETCH",
                    "filter" : {
                        "$and" : [
                            { "$or" : [
                                { "trk.0.extra_data.spam.is_spam" : { "$eq" : false } },
                                { "$nor" : [ { "trk.0.extra_data.spam.is_spam" : { "$exists" : true } } ] }
                            ] },
                            { "is_removed" : { "$eq" : false } },
                            { "trk.0.direction" : { "$eq" : "ingress" } },
                            { "trk.0.type" : { "$eq" : "fax" } },
                            { "trk.0.data.status" : { "$in" : [ "p_received", "success" ] } }
                        ]
                    },
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "keyPattern" : { "user_id" : 1, "date" : -1 },
                        "indexName" : "user_id_date",
                        "isMultiKey" : false,
                        "isUnique" : false,
                        "isSparse" : false,
                        "isPartial" : false,
                        "indexVersion" : 1,
                        "direction" : "forward",
                        "indexBounds" : {
                            "user_id" : [ "[\"00000000000040008000000000002a5d\", \"00000000000040008000000000002a5d\"]" ],
                            "date" : [ "(\"2018-01-09 10:36:31\", \"1970-01-01 00:00:00\")" ]
                        }
                    }
                },
                {
                    "stage" : "FETCH",
                    "filter" : {
                        "$and" : [
                            { "$or" : [
                                { "trk.0.extra_data.spam.is_spam" : { "$eq" : false } },
                                { "$nor" : [ { "trk.0.extra_data.spam.is_spam" : { "$exists" : true } } ] }
                            ] },
                            { "is_removed" : { "$eq" : false } },
                            { "trk.0.direction" : { "$eq" : "ingress" } },
                            { "trk.0.type" : { "$eq" : "fax" } },
                            { "date" : { "$lt" : "2018-01-09 10:36:31" } },
                            { "date" : { "$gt" : "1970-01-01 00:00:00" } },
                            { "trk.0.data.status" : { "$in" : [ "p_received", "success" ] } }
                        ]
                    },
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "keyPattern" : { "user_id" : 1, "to" : 1, "from" : 1 },
                        "indexName" : "user_id_to_from",
                        "isMultiKey" : false,
                        "isUnique" : false,
                        "isSparse" : false,
                        "isPartial" : false,
                        "indexVersion" : 1,
                        "direction" : "forward",
                        "indexBounds" : {
                            "user_id" : [ "[\"00000000000040008000000000002a5d\", \"00000000000040008000000000002a5d\"]" ],
                            "to" : [ "[MinKey, MaxKey]" ],
                            "from" : [ "[MinKey, MaxKey]" ]
                        }
                    }
                }
            ]
        },
        "executionStats" : {
            "executionSuccess" : true,
            "nReturned" : 4682,
            "executionTimeMillis" : 2072,
            "totalKeysExamined" : 4688,
            "totalDocsExamined" : 4682,
            "executionStages" : {
                "stage" : "FETCH",
                "filter" : {
                    "$and" : [
                        { "$or" : [
                            { "trk.0.extra_data.spam.is_spam" : { "$eq" : false } },
                            { "$nor" : [ { "trk.0.extra_data.spam.is_spam" : { "$exists" : true } } ] }
                        ] },
                        { "trk.0.type" : { "$eq" : "fax" } },
                        { "date" : { "$lt" : "2018-01-09 10:36:31" } },
                        { "date" : { "$gt" : "1970-01-01 00:00:00" } }
                    ]
                },
                "nReturned" : 4682,
                "executionTimeMillisEstimate" : 710,
                "works" : 4897,
                "advanced" : 4682,
                "needTime" : 5,
                "needYield" : 209,
                "saveState" : 234,
                "restoreState" : 234,
                "isEOF" : 1,
                "invalidates" : 1,
                "docsExamined" : 4682,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "nReturned" : 4682,
                    "executionTimeMillisEstimate" : 305,
                    "works" : 4688,
                    "advanced" : 4682,
                    "needTime" : 5,
                    "needYield" : 0,
                    "saveState" : 234,
                    "restoreState" : 234,
                    "isEOF" : 1,
                    "invalidates" : 1,
                    "keyPattern" : {
                        "user_id" : 1,
                        "trk.0.direction" : 1,
                        "is_read" : 1,
                        "trk.0.data.status" : 1,
                        "is_removed" : 1
                    },
                    "indexName" : "user_id_direction_is_read_status_is_removed",
                    "isMultiKey" : false,
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 1,
                    "direction" : "forward",
                    "indexBounds" : {
                        "user_id" : [ "[\"00000000000040008000000000002a5d\", \"00000000000040008000000000002a5d\"]" ],
                        "trk.0.direction" : [ "[\"ingress\", \"ingress\"]" ],
                        "is_read" : [ "[MinKey, MaxKey]" ],
                        "trk.0.data.status" : [ "[\"p_received\", \"p_received\"]", "[\"success\", \"success\"]" ],
                        "is_removed" : [ "[false, false]" ]
                    },
                    "keysExamined" : 4688,
                    "seeks" : 6,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0
                }
            }
        },
        "serverInfo" : {
            "host" : "hs1.mydomain.com",
            "port" : 27017,
            "version" : "3.4.10",
            "gitVersion" : "078f28920cb24de0dd479b5ea6c66c644f6326e9"
        },
        "ok" : 1.0
    }
keysExamined is only 4,688! That is not much compared to the total of 30M documents in the collection. When Mongo slows down and the domino effect kicks in, CPU usage and memory are not high; Mongo only uses about 40% of the RAM. The disk partition is Ext4, in case that helps.
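The explain output above was gathered with executionStats verbosity; a simplified form of the command, using the same predicates as the parsedQuery above, is roughly:

    // Re-run the slow query with execution statistics; keysExamined,
    // docsExamined and executionTimeMillis come from this output.
    db.call_history.find({
        "user_id": "00000000000040008000000000002a5d",
        "trk.0.direction": "ingress",
        "trk.0.type": "fax",
        "is_removed": false,
        "date": { "$gt": "1970-01-01 00:00:00", "$lt": "2018-01-09 10:36:31" },
        "trk.0.data.status": { "$in": ["p_received", "success"] },
        "$or": [
            { "trk.0.extra_data.spam.is_spam": false },
            { "trk.0.extra_data.spam.is_spam": { "$exists": false } }
        ]
    }).explain("executionStats")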
Here is a full example with the details of another very slow query:
    {
        "desc": "conn199276",
        "threadId": "140070259820288",
        "connectionId": 199276,
        "client": "client_server_ip:45590",
        "active": "true",
        "opid": 63869351,
        "secs_running": 36,
        "microsecs_running": 36136211,
        "op": "query",
        "ns": "cuda.call_history",
        "query": {
            "find": "call_history",
            "filter": {
                "is_removed": false,
                "trk.0.extra_data.spam.is_spam": true,
                "trk.0.direction": "ingress",
                "date": {
                    "$gt": "1970-01-01 00:00:00",
                    "$lt": "4001-01-01 00:00:00"
                },
                "trk.0.extra_data.status": {
                    "$in": [ "success", "p_received" ]
                },
                "trk.0.type": "clk",
                "owner_id": "00000000000040008000000000003828"
            },
            "sort": { "date": -1 },
            "limit": 31
        },
        "numYields": 6600,
        "locks": {},
        "waitingForLock": "false",
        "lockStats": {
            "Global": {
                "acquireCount": { "r": 13200 }
            },
            "MMAPV1Journal": {
                "acquireCount": { "r": 6611 },
                "acquireWaitCount": { "r": 9 },
                "timeAcquiringMicros": { "r": 50854 }
            },
            "Database": {
                "acquireCount": { "r": 6600 }
            },
            "Collection": {
                "acquireCount": { "R": 6600 },
                "acquireWaitCount": { "R": 11 },
                "timeAcquiringMicros": { "R": 163707 }
            }
        }
    }
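For readability, that operation corresponds roughly to the following shell query, reconstructed from the filter, sort, and limit fields of the currentOp document above:

    // Roughly the query behind opid 63869351: spam "ingress"/"clk" records
    // for one owner, newest first, limited to 31 documents.
    db.call_history.find({
        "is_removed": false,
        "trk.0.extra_data.spam.is_spam": true,
        "trk.0.direction": "ingress",
        "date": { "$gt": "1970-01-01 00:00:00", "$lt": "4001-01-01 00:00:00" },
        "trk.0.extra_data.status": { "$in": ["success", "p_received"] },
        "trk.0.type": "clk",
        "owner_id": "00000000000040008000000000003828"
    }).sort({ "date": -1 }).limit(31)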
The output of db.stats():
    rs0:PRIMARY> db.stats()
    {
        "db" : "cuda",
        "collections" : 5,
        "views" : 0,
        "objects" : 55009248,
        "avgObjSize" : 2018.6135346551184,
        "dataSize" : 111042412544,
        "storageSize" : 113055362336,
        "numExtents" : 100,
        "indexes" : 7,
        "indexSize" : 14223460160,
        "fileSize" : 133012914176,
        "nsSizeMB" : 16,
        "extentFreeList" : {
            "num" : 0,
            "totalSize" : 0
        },
        "dataFileVersion" : {
            "major" : 4,
            "minor" : 22
        },
        "ok" : 1
    }
mongostat shows the following results; I think the number of page faults is high:
    insert query update delete getmore command flushes mapped vsize res faults qrw arw net_in net_out conn set repl time
         5    93      4     *0       0    64|0       0 282G 9.11G 26 0|0 0|0 64.3k 187k 481 rs0 PRI Jan 10 06:25:14.476
        *0   107     *0      1       0    58|0       0 282G 9.14G  4 0|0 0|0 51.5k 247k 481 rs0 PRI Jan 10 06:25:15.475
         2    88      5     *0       0    70|0       0 282G 9.04G 26 0|0 0|0 61.5k 245k 481 rs0 PRI Jan 10 06:25:16.476
         3    98      2     *0       0    71|0       0 282G 9.12G  6 0|0 0|0 59.6k 274k 481 rs0 PRI Jan 10 06:25:17.474
         1   105     *0      1       0    82|0       0 282G 9.10G 14 0|0 0|0