Stopping a MongoDB replica set leaves the primary in RECOVERING state

MrE*_*ant 5 mongodb mongodb-replica-set

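When I stop the nodes of my replica set and start them again, the primary node goes into the "RECOVERING" state.

I created a replica set that ran without authorization. To use authorization, I added users with "db.createUser(...)" and enabled authorization in the configuration file: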

security:
   authorization: "enabled"

Before stopping the replica set (and even after restarting the cluster without adding the security parameter), rs.status() shows:

{
        "set" : "REPLICASET",
        "date" : ISODate("2016-09-08T09:57:50.335Z"),
        "myState" : 1,
        "term" : NumberLong(7),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "192.168.1.167:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 301,
                        "optime" : {
                                "ts" : Timestamp(1473328390, 2),
                                "t" : NumberLong(7)
                        },
                        "optimeDate" : ISODate("2016-09-08T09:53:10Z"),
                        "electionTime" : Timestamp(1473328390, 1),
                        "electionDate" : ISODate("2016-09-08T09:53:10Z"),
                        "configVersion" : 1,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "192.168.1.168:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 295,
                        "optime" : {
                                "ts" : Timestamp(1473328390, 2),
                                "t" : NumberLong(7)
                        },
                        "optimeDate" : ISODate("2016-09-08T09:53:10Z"),
                        "lastHeartbeat" : ISODate("2016-09-08T09:57:48.679Z"),
                        "lastHeartbeatRecv" : ISODate("2016-09-08T09:57:49.676Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "192.168.1.167:27017",
                        "configVersion" : 1
                },
                {
                        "_id" : 2,
                        "name" : "192.168.1.169:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 295,
                        "optime" : {
                                "ts" : Timestamp(1473328390, 2),
                                "t" : NumberLong(7)
                        },
                        "optimeDate" : ISODate("2016-09-08T09:53:10Z"),
                        "lastHeartbeat" : ISODate("2016-09-08T09:57:48.680Z"),
                        "lastHeartbeatRecv" : ISODate("2016-09-08T09:57:49.054Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "192.168.1.168:27017",
                        "configVersion" : 1
                }
        ],
        "ok" : 1
}

To start using this configuration, I stopped each node as follows:

[root@n--- etc]# mongo --port 27017 --eval 'db.adminCommand("shutdown")'
MongoDB shell version: 3.2.9
connecting to: 127.0.0.1:27017/test
2016-09-02T14:26:15.784+0200 W NETWORK  [thread1] Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2016-09-02T14:26:15.785+0200 E QUERY    [thread1] Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed :
connect@src/mongo/shell/mongo.js:231:14

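After the shutdown, I confirmed that the process no longer existed by checking the output of ps -ax | grep mongo.

But when I start the nodes again and log in with my credentials, rs.status() now reports: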

{
        "set" : "REPLICASET",
        "date" : ISODate("2016-09-08T13:19:12.963Z"),
        "myState" : 3,
        "term" : NumberLong(7),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "192.168.1.167:27017",
                        "health" : 1,
                        "state" : 3,
                        "stateStr" : "RECOVERING",
                        "uptime" : 42,
                        "optime" : {
                                "ts" : Timestamp(1473340490, 6),
                                "t" : NumberLong(7)
                        },
                        "optimeDate" : ISODate("2016-09-08T13:14:50Z"),
                        "infoMessage" : "could not find member to sync from",
                        "configVersion" : 1,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "192.168.1.168:27017",
                        "health" : 0,
                        "state" : 6,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2016-09-08T13:19:10.553Z"),
                        "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
                        "pingMs" : NumberLong(0),
                        "authenticated" : false,
                        "configVersion" : -1
                },
                {
                        "_id" : 2,
                        "name" : "192.168.1.169:27017",
                        "health" : 0,
                        "state" : 6,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2016-09-08T13:19:10.552Z"),
                        "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
                        "pingMs" : NumberLong(0),
                        "authenticated" : false,
                        "configVersion" : -1
                }
        ],
        "ok" : 1
}
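
The status above can be condensed into a few numbers. The following is a minimal sketch in plain JavaScript (runnable with Node.js); `summarizeReplicaSet` is a hypothetical helper, not a mongo shell function, and it assumes that every member has one vote. It counts the members this node can reach and lists the ones reported with "authenticated" : false.

```javascript
// Hypothetical helper (not part of the mongo shell): condense an
// rs.status()-style document into the facts that matter for elections.
// Assumption: every member is a voting member.
function summarizeReplicaSet(status) {
  var members = status.members;
  // health === 1 means this node considers the member reachable (itself included).
  var reachable = members.filter(function (m) { return m.health === 1; });
  // Members whose heartbeats failed authentication.
  var unauthenticated = members
    .filter(function (m) { return m.authenticated === false; })
    .map(function (m) { return m.name; });
  // Electing a primary needs a strict majority of the voting members.
  var majority = Math.floor(members.length / 2) + 1;
  return {
    reachable: reachable.length,
    majority: majority,
    canElectPrimary: reachable.length >= majority,
    unauthenticated: unauthenticated
  };
}

// Trimmed copy of the second rs.status() output from the question.
var status = {
  set: "REPLICASET",
  myState: 3,
  members: [
    { _id: 0, name: "192.168.1.167:27017", health: 1, stateStr: "RECOVERING", self: true },
    { _id: 1, name: "192.168.1.168:27017", health: 0, stateStr: "(not reachable/healthy)", authenticated: false },
    { _id: 2, name: "192.168.1.169:27017", health: 0, stateStr: "(not reachable/healthy)", authenticated: false }
  ]
};

var summary = summarizeReplicaSet(status);
console.log(summary);
```

With only one of three members reachable, no majority exists, so no primary can be elected. The "authenticated" : false entries suggest the members could not authenticate to each other after authorization was enabled.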

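Why? Maybe shutdown is not a good way to stop mongod; but I also tested with 'kill pid', and the restart ended up in the same state.

In this state I don't know how to repair the cluster; I started over (deleting the dbpath files and reconfiguring the replica set); I tried '--repair' but it didn't work.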

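Information about my system:

  • Mongo version: 3.2
  • I start the process as root; maybe it should be the 'mongod' user?
  • This is my start command: mongod --conf /etc/mongod.conf
  • The keyFile configuration doesn't work; if I add "--keyFile /path/to/file" it shows:
    "about to fork child process, waiting until server is ready for connections." The file has all permissions, but the keyFile cannot be used.
  • Example "net.bindIp" configuration from mongod.conf on one machine: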

    net:
      port: 27017
      bindIp: 127.0.0.1,192.168.1.167
    

Sal*_*eem 1

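Note: This solution is specific to Windows, but it can easily be ported to *nix-based systems.

You need to perform the steps in order. First, start your mongod instances.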

start "29001" mongod --dbpath "C:\data\db\r1" --port 29001
start "29002" mongod --dbpath "C:\data\db\r2" --port 29002
start "29003" mongod --dbpath "C:\data\db\r3" --port 29003 

Connect to each node with mongo and create an admin user. I prefer to create a superuser.

> use admin
> db.createUser({user: "root", pwd: "123456", roles:["root"]})

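You can create additional users as needed.

Create a key file. See the documentation for valid key file contents.

Note: On *nix-based systems, use chmod to set the key file's permissions to 400.

In my case, I created the key file like this: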

echo mysecret==key > C:\data\key\key.txt
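
As an alternative to typing a secret by hand, the key file content can be generated from random bytes. This is a minimal sketch for Node.js, not the method used in this answer. `generateKeyFileContent` and `isValidKeyFileContent` are hypothetical helpers, and the validity check encodes my understanding of the documented rules (6 to 1024 characters from the base64 set), so verify it against the documentation for your version.

```javascript
const crypto = require("crypto");

// Hypothetical helper: base64-encode `numBytes` random bytes.
function generateKeyFileContent(numBytes) {
  return crypto.randomBytes(numBytes).toString("base64");
}

// Hypothetical helper: check the content against the assumed key file
// rules (6 to 1024 characters, base64 character set only).
function isValidKeyFileContent(content) {
  return content.length >= 6 &&
    content.length <= 1024 &&
    /^[A-Za-z0-9+\/=]+$/.test(content);
}

// 756 random bytes encode to 1008 base64 characters, which stays under 1024.
const key = generateKeyFileContent(756);
console.log(key.length, isValidKeyFileContent(key));

// To write the file with owner-read-only permissions on *nix, one could use:
// require("fs").writeFileSync("/path/to/keyfile", key, { mode: 0o400 });
// ("/path/to/keyfile" is a placeholder path.)
```

Every member of the replica set needs the same key file content, so generate it once and copy it to each node.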

Now restart the MongoDB servers with the --keyFile and --replSet flags enabled.

start "29001" mongod --dbpath "C:\data\db\r1" --port 29001 --replSet "rs1" --keyFile C:\data\key\key.txt
start "29002" mongod --dbpath "C:\data\db\r2" --port 29002 --replSet "rs1" --keyFile C:\data\key\key.txt
start "29003" mongod --dbpath "C:\data\db\r3" --port 29003 --replSet "rs1" --keyFile C:\data\key\key.txt

Once all mongod instances are up and running, connect to any one of them with authentication.

mongo --port 29001 -u "root" -p "123456" --authenticationDatabase "admin"

Initiate the replica set:

> use admin
> rs.initiate()
> rs1:PRIMARY> rs.add("localhost:29002")
{ "ok" : 1 }
> rs1:PRIMARY> rs.add("localhost:29003")
{ "ok" : 1 }

Note: You may need to replace localhost with the machine name or IP address.
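
If you do use machine names or IP addresses, one option is to pass an explicit configuration document to rs.initiate() instead of adding members one at a time. The sketch below (plain JavaScript, runnable in Node.js) only builds that document. `buildReplSetConfig` is a hypothetical helper and "myhost" is a placeholder host name, not a value from this setup.

```javascript
// Hypothetical helper: build a replica set configuration document
// with one member per host, numbered from 0.
function buildReplSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    })
  };
}

// "myhost" is a placeholder; use a name or IP that every member can resolve.
var config = buildReplSetConfig("rs1", [
  "myhost:29001",
  "myhost:29002",
  "myhost:29003"
]);
console.log(JSON.stringify(config, null, 2));

// In the mongo shell, the resulting document would be passed as:
// rs.initiate(config)
```

The _id of the document must match the name given to --replSet ("rs1" in the commands above).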