keepalived - random re-elections

mil*_*dos 1 failover high-availability vrrp keepalived

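We have set up 3 servers running keepalived. We have started to notice random re-elections that we cannot explain, so I have come here for advice.

This is our configuration:

Master: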

global_defs {
  notification_email {
    webops@example.com
  }
  notification_email_from keepalived@hostname
  smtp_server example.com:587
  smtp_connect_timeout 30
  router_id some_rate
}


vrrp_script chk_nginx {
  script "killall -0 nginx"
  interval 2
  weight 2
}

vrrp_instance VIP_61 {
  interface bond0
  virtual_router_id 61
  state MASTER
  priority 100
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass PASSWORD
  }
  virtual_ipaddress {
    X.X.X.X
    X.X.X.X
    X.X.X.X
  }
  track_script {
    chk_nginx
  }
}

Backup 1:

global_defs {
  notification_email {
    webops@example.com
  }
  notification_email_from keepalived@hostname
  smtp_server example.com:587
  smtp_connect_timeout 30
  router_id some_rate
}


vrrp_script chk_nginx {
  script "killall -0 nginx"
  interval 2
  weight 2
}

vrrp_instance VIP_61 {
  interface bond0
  virtual_router_id 61
  state MASTER
  priority 99
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass PASSWORD
  }
  virtual_ipaddress {
    X.X.X.X
    X.X.X.X
    X.X.X.X
  }
  track_script {
    chk_nginx
  }
}

Backup 2:

global_defs {
  notification_email {
    webops@example.com
  }
  notification_email_from keepalived@hostname
  smtp_server example.com:587
  smtp_connect_timeout 30
  router_id some_rate
}


vrrp_script chk_nginx {
  script "killall -0 nginx"
  interval 2
  weight 2
}

vrrp_instance VIP_61 {
  interface bond0
  virtual_router_id 61
  state MASTER
  priority 98
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass PASSWORD
  }
  virtual_ipaddress {
    X.X.X.X
    X.X.X.X
    X.X.X.X
  }
  track_script {
    chk_nginx
  }
}

Every now and then I see this happen (as recorded in the logs):

Master:

Jan  6 18:30:15 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
Jan  6 18:30:16 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election
Jan  6 18:32:37 lb-public01 Keepalived_vrrp[24380]: VRRP_Instance(VIP_61) Received lower prio advert, forcing new election

Backup 1:

Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
Jan  6 18:30:16 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE
Jan  6 18:32:37 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) forcing a new MASTER election
Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Received higher prio advert
Jan  6 18:32:38 lb-public02 Keepalived_vrrp[26235]: VRRP_Instance(VIP_61) Entering BACKUP STATE

Backup 2:

Jan  6 18:32:36 lb-public03 Keepalived_vrrp[14255]: VRRP_Script(chk_nginx) succeeded
Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Transition to MASTER STATE
Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Received higher prio advert
Jan  6 18:32:37 lb-public03 Keepalived_vrrp[14255]: VRRP_Instance(VIP_61) Entering BACKUP STATE
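
So the MASTER receives a lower-priority advert and starts a new election. Why? Judging by the logs, a BACKUP transitions to MASTER for a short time and then falls back to BACKUP state. I have no clue why this is happening, so any hints are very welcome.

Also, I found that there is a unicast patch for keepalived, but it is not clear to me whether it supports more than one unicast peer. In our case we have a cluster of 3 machines, so we need more than one unicast peer.

Any hints on these issues would be greatly appreciated!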

小智 5

The problem is that you are using MASTER as the default state on the backup nodes. They should specify BACKUP.

  vrrp_instance VIP_61 {
      interface bond0
      virtual_router_id 61
      state BACKUP
      priority 98
      ...

Hope this solves your mystery.
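
For reference, here is a fuller sketch of what the Backup 2 instance might look like with that change applied. It only reuses the values from the question (interface, virtual_router_id, priority, the placeholder password and addresses); nothing else is changed, and the global_defs and vrrp_script blocks are assumed to stay as posted. Backup 1 would get the same edit with its own priority of 99, while the master keeps state MASTER and priority 100.

```
# Backup 2, sketch only: all values are copied from the question
vrrp_instance VIP_61 {
  interface bond0
  virtual_router_id 61
  # start in BACKUP instead of MASTER
  state BACKUP
  # lower than the master (100) and Backup 1 (99)
  priority 98
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass PASSWORD
  }
  virtual_ipaddress {
    X.X.X.X
    X.X.X.X
    X.X.X.X
  }
  track_script {
    chk_nginx
  }
}
```

The state keyword only sets the state the instance starts in; which node ends up holding the addresses is still decided by priority, so the ordering 100 / 99 / 98 from the question is kept as is.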