MongoDB: very high lock percentage and low throughput

Dav*_*142 4 linux concurrency locking mongodb

We are running into some problems using MongoDB for the first time :) Here are some facts:

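  • Lock percentage of 95%+
  • The server is a VM with 2 cores and 6 GB of RAM; the mongodb data sits on an NFS v3 share (noatime) on a fast NAS.
  • CentOS 5.7 x86_64, mongo 2.0.2, php-pecl-mongo 1.2.6 (not the latest version, but it will be updated soon :)
  • mongo is (currently/wrongly) configured as a single master with no slave
  • The db was created today. 20 web servers are writing to it (using updates)
  • Not sure how many updates are sent to the server, but the number actually processed is very low
  • I'm not sure whether this is an indexing problem: how can I diagnose that?
  • Current on-disk data (including oplog, journal ...) is less than 600 MB
  • dstat: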
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
 46   1  53   0   0   0|   0  4096B|  42k   19k|   0     0 | 317  5328
 48   1  52   0   0   1|   0    92k|  46k 7590B|   0     0 | 321  5308
 50   2  48   0   0   0|   0     0 |  39k 7218B|   0     0 | 304  5359
 47   1  51   0   0   1|   0     0 |  47k   10k|   0     0 | 332  5679
 46   1  52   0   0   0|   0     0 |  44k   15k|   0     0 | 319  5099
  • nfsiostat shows nothing (0 ops/s) (and of course iostat shows the same)
  • mongostat:
insert  query update delete getmore command flushes mapped  vsize    res faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn repl       time
     0      0      0      0       0       1       0  1.41g  8.39g   242m      0     96.2          0    0|3280  1|5322    62b     1k  5324    M   21:11:50
     0      0      0      0       0       1       0  1.41g  8.39g   242m      0     96.5          0    0|3204  1|5322    62b     1k  5324    M   21:11:51
     0      0      0      0       0       1       0  1.41g  8.39g   242m      0       96          0    1|3351  1|5322    62b     1k  5324    M   21:11:52
     0      0      1      0       0       1       0  1.41g  8.39g   242m      0     96.9          0    0|3251  1|5322   485b     1k  5324    M   21:11:53
     0      0      0      0       1       1       0  1.41g  8.39g   242m      0     95.6          0    0|3280  1|5322   112b     1k  5324    M   21:11:54
  • db.serverStatus()
{
        "host" : "foo001",
        "version" : "2.0.2",
        "process" : "mongod",
        "uptime" : 21370,
        "uptimeEstimate" : 18626,
        "localTime" : ISODate("2012-02-23T20:20:59.589Z"),
        "globalLock" : {
                "totalTime" : 21369761258,
                "lockTime" : 19450568051,
                "ratio" : 0.9101911722911022,
                "currentQueue" : {
                        "total" : 3570,
                        "readers" : 0,
                        "writers" : 3570
                },
                "activeClients" : {
                        "total" : 5500,
                        "readers" : 1,
                        "writers" : 5499
                }
        },
        "mem" : {
                "bits" : 64,
                "resident" : 255,
                "virtual" : 8782,
                "supported" : true,
                "mapped" : 1440,
                "mappedWithJournal" : 2880
        },
        "connections" : {
                "current" : 5501,
                "available" : 4099
        },
        "extra_info" : {
                "note" : "fields vary by platform",
                "heap_usage_bytes" : 81930736,
                "page_faults" : 2916
        },
        "indexCounters" : {
                "btree" : {
                        "accesses" : 2377,
                        "hits" : 2377,
                        "misses" : 0,
                        "resets" : 0,
                        "missRatio" : 0
                }
        },
        "backgroundFlushing" : {
                "flushes" : 356,
                "total_ms" : 2372,
                "average_ms" : 6.662921348314606,
                "last_ms" : 0,
                "last_finished" : ISODate("2012-02-23T20:20:56.446Z")
        },
        "cursors" : {
                "totalOpen" : 5500,
                "clientCursors_size" : 5500,
                "timedOut" : 0,
                "totalNoTimeout" : 5499
        },
        "network" : {
                "bytesIn" : 51373772,
                "bytesOut" : 51176411,
                "numRequests" : 176017
        },
        "repl" : {
                "ismaster" : true
        },
        "opcounters" : {
                "insert" : 0,
                "query" : 25,
                "update" : 142157,
                "delete" : 0,
                "getmore" : 39053,
                "command" : 284
        },
        "asserts" : {
                "regular" : 0,
                "warning" : 0,
                "msg" : 0,
                "user" : 0,
                "rollovers" : 0
        },
        "writeBacksQueued" : false,
        "dur" : {
                "commits" : 19,
                "journaledMB" : 0,
                "writeToDataFilesMB" : 0,
                "compression" : 0,
                "commitsInWriteLock" : 0,
                "earlyCommits" : 0,
                "timeMs" : {
                        "dt" : 3083,
                        "prepLogBuffer" : 0,
                        "writeToJournal" : 0,
                        "writeToDataFiles" : 0,
                        "remapPrivateView" : 0
                }
        },
        "ok" : 1
}
  • Our db:
{
        "db" : "mydb",
        "collections" : 6,
        "objects" : 119174,
        "avgObjSize" : 323.99872455401345,
        "dataSize" : 38612224,
        "storageSize" : 57286656,
        "numExtents" : 26,
        "indexes" : 4,
        "indexSize" : 3899952,
        "fileSize" : 469762048,
        "nsSizeMB" : 16,
        "ok" : 1
}
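
For reference, the headline numbers can be re-derived from the db.serverStatus() output above. This is only a rough Python sketch: the values are copied by hand from that output, the function names are made up for illustration, and it assumes the opcounters count from server start, so that dividing by uptime gives an average rate.

```python
# Values copied by hand from the db.serverStatus() output above.
status = {
    "uptime": 21370,  # seconds since mongod started
    "globalLock": {"totalTime": 21369761258, "lockTime": 19450568051},
    "opcounters": {"update": 142157},
}


def lock_ratio(s):
    """Fraction of time the global lock was held (lockTime / totalTime)."""
    gl = s["globalLock"]
    return gl["lockTime"] / gl["totalTime"]


def avg_updates_per_sec(s):
    """Average update rate since server start (assumes counters start at 0)."""
    return s["opcounters"]["update"] / s["uptime"]


print(f"lock ratio: {lock_ratio(status):.1%}")
print(f"updates/s:  {avg_updates_per_sec(status):.2f}")
```

So the global lock was held for roughly 91% of the uptime while the server averaged fewer than 7 updates per second.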

Any hints?

Regards,

D.

PS: I have also cross-posted this to mongo-user

thk*_*ala 5

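Please, please, do not use NFS as a database backend. There are a lot of issues, especially with locking, and especially with NFS < v4. Since those issues go beyond performance, NFS should probably not even be considered.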

I would first move the database to a local disk and see whether that solves the performance problem - I suspect it will...
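
Before moving anything, it may help to confirm which filesystem the data directory actually lives on. Below is a rough Python sketch, assuming a Linux host that exposes /proc/mounts. The helper name `fstype_for` and the dbpath /var/lib/mongo are placeholders for illustration, not taken from the question; substitute your own dbpath.

```python
import os


def fstype_for(path, mounts_text):
    """Return (mount_point, fstype) for the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same point shadow an earlier one
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype)
    return best


if __name__ == "__main__":
    dbpath = "/var/lib/mongo"  # assumption: replace with your mongod dbpath
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            print(fstype_for(dbpath, f.read()))
```

If the reported filesystem type for the data directory is nfs, the data files are still on the NFS share.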

EDIT:

The MongoDB people seem to agree, if a bit tersely, that NFS is not recommended.