mil*_*dos 1 failover high-availability vrrp keepalived
We have set up 3 servers running keepalived. We have started to notice random re-elections that we cannot explain, so I came here for advice.
Here is our configuration:
MASTER:
global_defs {
    notification_email {
        webops@example.com
    }
    notification_email_from keepalived@hostname
    smtp_server example.com:587
    smtp_connect_timeout 30
    router_id some_rate
}
vrrp_script chk_nginx {
    script "killall -0 nginx"
    interval 2
    weight 2
}
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state MASTER
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
    }
    track_script {
        chk_nginx
    }
}
BACKUP1:
global_defs {
    notification_email {
        webops@example.com
    }
    notification_email_from keepalived@hostname
    smtp_server example.com:587
    smtp_connect_timeout 30
    router_id some_rate
}
vrrp_script chk_nginx {
    script "killall -0 nginx"
    interval 2
    weight 2
}
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state MASTER
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
    }
    track_script {
        chk_nginx
    }
}
BACKUP2:
global_defs {
    notification_email {
        webops@example.com
    }
    notification_email_from keepalived@hostname
    smtp_server example.com:587
    smtp_connect_timeout 30
    router_id some_rate
}
vrrp_script chk_nginx {
    script "killall -0 nginx"
    interval 2
    weight 2
}
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state MASTER
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass PASSWORD
    }
    virtual_ipaddress {
        X.X.X.X
        X.X.X.X
        X.X.X.X
    }
    track_script {
        chk_nginx
    }
}
Every now and then I see the following happen (as recorded in the logs):
MASTER:
Jan 6 18:30:15 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
Jan 6 18:30:16 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
Jan 6 18:32:37 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
BACKUP1:
Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
Jan 6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
Jan 6 18:32:37 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) forcing a new MASTER election
Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
Jan 6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
BACKUP2:
Jan 6 18:32:36 lb-public03 Keepalived_vrrp[14255]: VRRP_Script(chk_nginx) succeeded
Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Received higher prio advert
Jan 6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Entering BACKUP STATE
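So the MASTER receives a lower-priority advert and forces a new election. Why? Based on the logs, it looks like a BACKUP transitions to MASTER for a short time and then falls back to BACKUP state. I have no idea why this happens, so any hints are very welcome.
Also, I found that there is a unicast patch for keepalived, but it is not clear to me whether it supports more than one unicast peer. In our case we have a cluster of 3 machines, so we need more than one unicast peer.
Any hints on these issues would be greatly appreciated!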
小智 5
The problem is that you are using state MASTER as the initial state on the backup nodes. They should be set to state BACKUP.
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state BACKUP
    priority 98
    ...
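Applied to the three nodes from the question, that would look roughly like the sketch below. Only the `state` and `priority` lines are shown; everything else is assumed to stay exactly as in the original configs, and the host names are taken from the log output. Each block belongs in the keepalived.conf of its own host, not in one shared file.

```
# lb-public01 (intended master)
vrrp_instance VIP_61 {
    state MASTER
    priority 100
    # remaining settings unchanged
}

# lb-public02 (first backup)
vrrp_instance VIP_61 {
    state BACKUP
    priority 99
    # remaining settings unchanged
}

# lb-public03 (second backup)
vrrp_instance VIP_61 {
    state BACKUP
    priority 98
    # remaining settings unchanged
}
```

With `state BACKUP`, a node starts out listening for adverts instead of announcing itself as master at startup, so the lower-priority nodes should no longer briefly claim the MASTER role.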
Hope this solves your mystery.
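On the unicast question: as far as I know, newer keepalived releases have native unicast support and no longer need the patch, and the `unicast_peer` block accepts more than one address. A minimal sketch for one node of a 3-node cluster, assuming your version has the `unicast_src_ip` and `unicast_peer` options (check the keepalived.conf man page for your build); the 192.0.2.x addresses are placeholders, not values from the question:

```
vrrp_instance VIP_61 {
    interface bond0
    virtual_router_id 61
    state BACKUP
    priority 99

    # Placeholder: this node's own address on bond0
    unicast_src_ip 192.0.2.12

    # Placeholders: the other two nodes in the cluster
    unicast_peer {
        192.0.2.11
        192.0.2.13
    }

    # remaining settings unchanged
}
```

Each node would list its own address as `unicast_src_ip` and the other two nodes under `unicast_peer`.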