Mongo replica set has no PRIMARY/SECONDARY; members are STARTUP2 and RECOVERING

Lan*_*don 5 mongodb

I have a mongo cluster with 6 replica sets. Five are fine, one is not. Each replica set has three members. Here is the rs.status() output for the broken one:

{
    "set" : "rs_5",
    "date" : ISODate("2015-12-16T02:37:39Z"),
    "myState" : 5,
    "members" : [
        {
            "_id" : 0,
            "name" : "mongo_rs_5_member_1:27018",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 33600,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2015-12-16T02:37:38Z"),
            "lastHeartbeatRecv" : ISODate("2015-12-16T02:37:37Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
        },
        {
            "_id" : 1,
            "name" : "mongo_rs_5_member_2:27019",
            "health" : 1,
            "state" : 3,
            "stateStr" : "RECOVERING",
            "uptime" : 33842,
            "optime" : Timestamp(1449898728, 18),
            "optimeDate" : ISODate("2015-12-12T05:38:48Z"),
            "lastHeartbeat" : ISODate("2015-12-16T02:37:37Z"),
            "lastHeartbeatRecv" : ISODate("2015-12-16T02:37:37Z"),
            "pingMs" : 3,
            "lastHeartbeatMessage" : "still syncing, not yet to minValid optime 566bb328:3"
        },
        {
            "_id" : 2,
            "name" : "mongo_rs_5_member_3:27020",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 33845,
            "optime" : Timestamp(1449898728, 18),
            "optimeDate" : ISODate("2015-12-12T05:38:48Z"),
            "errmsg" : "still syncing, not yet to minValid optime 566bb327:1",
            "self" : true
        }
    ],
    "ok" : 1
}

In the logs, I see messages like these:

Wed Dec 16 02:40:34.033 [rsMgr] replSet I don't see a primary and I can't elect myself

Tue Dec 15 21:41:27.686 [rsSync] replSet initial sync need a member to be primary or secondary to do our initial sync
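
The log line about needing "a member to be primary or secondary" matches the status output: no member is in state 1 (PRIMARY) or 2 (SECONDARY), so nobody can act as a sync source. As a rough sketch, the check can be written as a small function that runs in the mongo shell or in plain JavaScript. The name summarizeReplSet and the shape of its return value are made up for illustration; only the fields read from rs.status() (set, members, name, state, stateStr) come from the output above.

```javascript
// Member states that can serve as a sync source for an initial sync.
var SYNCABLE_STATES = { 1: "PRIMARY", 2: "SECONDARY" };

// Hypothetical helper (not part of the mongo shell): summarize an
// rs.status() document and report whether any member can be synced from.
function summarizeReplSet(status) {
    var summary = { set: status.set, syncSources: [], stuck: [] };
    status.members.forEach(function (m) {
        if (SYNCABLE_STATES.hasOwnProperty(m.state)) {
            summary.syncSources.push(m.name);
        } else {
            summary.stuck.push(m.name + " (" + m.stateStr + ")");
        }
    });
    summary.canInitialSync = summary.syncSources.length > 0;
    return summary;
}

// Trimmed copy of the rs.status() output from the question.
var status = {
    set: "rs_5",
    members: [
        { name: "mongo_rs_5_member_1:27018", state: 5, stateStr: "STARTUP2" },
        { name: "mongo_rs_5_member_2:27019", state: 3, stateStr: "RECOVERING" },
        { name: "mongo_rs_5_member_3:27020", state: 5, stateStr: "STARTUP2" }
    ]
};
var summary = summarizeReplSet(status);
```

In the mongo shell the same function could be called as summarizeReplSet(rs.status()). For this set, canInitialSync comes out false and all three members land in the stuck list.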

Here is rs.conf():

{
    "_id" : "rs_5",
    "version" : 125967,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo_rs_5_member_1:27018",
            "priority" : 3
        },
        {
            "_id" : 1,
            "host" : "mongo_rs_5_member_2:27019",
            "priority" : 2
        },
        {
            "_id" : 2,
            "host" : "mongo_rs_5_member_3:27020"
        }
    ]
}

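It has been like this for days. CPU and network show no real activity, which suggests nothing is happening. Obviously, I don't want to lose data. What do I need to do to get this back to a healthy PRIMARY/SECONDARY/SECONDARY replica set?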

Lan*_*don 5

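I was able to fix this by Breaking the Mirror. Essentially, I picked one of the members, shut it down, deleted the /data/local* files, started it back up, and ran rs.initiate(). At that point I had a replica set of 1 (just that member), and it was PRIMARY (obviously). Then, for the other two members, I shut them down, wiped all of their /data/* files, and started them back up. From the initial primary, I simply added the two new members with rs.add("mongo_rs_5_member_1:27018") and rs.add("mongo_rs_5_member_2:27019"). The primary then synced everything to the others (this took many hours), and the replica set was healthy. There are no more errors in the dependent applications.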
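
The replica-set side of those steps could be sketched as below. This is only an illustration under assumptions, not a tested procedure: addWipedMembers is a made-up name, and the guard that checks for state 1 (PRIMARY) on the member marked "self" in rs.status() is an extra safety check that was not part of the original steps. The rs helper is passed in as a parameter so the function works with the shell's rs object or with a stub.

```javascript
// Hypothetical helper: add the wiped members back, but only when the member
// we are connected to already reports itself as PRIMARY (state 1).
function addWipedMembers(rs, hosts) {
    // rs.status() marks the member we are connected to with "self": true.
    var me = rs.status().members.filter(function (m) { return m.self; })[0];
    if (!me || me.state !== 1) {
        throw new Error("run this on the PRIMARY, after rs.initiate() has finished");
    }
    // Each added member starts empty and performs a full initial sync.
    return hosts.map(function (host) { return rs.add(host); });
}

// Intended use in the mongo shell on the member kept as the seed:
//   rs.initiate()
//   // wait until rs.status() shows this member as PRIMARY, then:
//   addWipedMembers(rs, ["mongo_rs_5_member_1:27018", "mongo_rs_5_member_2:27019"])
```

Shutting the members down and removing the data files happens outside the shell and is not covered by this sketch.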