Elasticsearch cluster 'master_not_discovered_exception'

hel*_*y77 8 elasticsearch

I have installed Elasticsearch 2.2.3 and configured it as a 2-node cluster.

Node 1 (elasticsearch.yml)

cluster.name: my-cluster
node.name: node1
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: true
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0

Node 2 (elasticsearch.yml)

cluster.name: my-cluster
node.name: node2
bootstrap.mlockall: true
discovery.zen.ping.unicast.hosts: ["ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com", "ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com"]
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
indices.fielddata.cache.size: "30%"
indices.cache.filter.size: "30%"
node.master: false
node.data: true
http.cors.enabled: true
script.inline: false
script.indexed: false
network.bind_host: 0.0.0.0

When I run curl -XGET 'http://localhost:9200/_cluster/state?pretty' I get:

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

The log on node 1 shows:

[2016-06-22 13:33:56,167][INFO ][cluster.service          ] [node1] new_master {node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2016-06-22 13:33:56,210][INFO ][http                     ] [node1] publish_address {127.0.0.1:9200}, bound_addresses {[::]:9200}
[2016-06-22 13:33:56,210][INFO ][node                     ] [node1] started
[2016-06-22 13:33:56,221][INFO ][gateway                  ] [-node1] recovered [0] indices into cluster_state

The log on node 2 shows this instead:

[2016-06-22 13:34:38,419][INFO ][discovery.zen            ] [node2] failed to send join request to master [{node1}{Vwj4gI3STr6saeTxKkSqEw}{127.0.0.1}{127.0.0.1:9300}{master=true}], reason [RemoteTransportException[[node2][127.0.0.1:9300][internal:discovery/zen/join]]; nested: IllegalStateException[Node [{node2}{_YUbBNx9RUuw854PKFe1CA}{127.0.0.1}{127.0.0.1:9300}{master=false}] not master for join request]; ]

What is going wrong?

San*_*bar 10

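The root cause of the master not discovered exception is that the nodes cannot reach each other on port 9300. This has to work in both directions: node1 must be able to reach node2 on 9300, and vice versa.

Note: by default Elasticsearch uses the port range 9300-9400 for cluster (transport) communication and 9200-9300 for the HTTP API.

A simple telnet can confirm this. From node1, run telnet node2 9300

If that succeeds, next try telnet node1 9300 from node2.

If you are getting the master not discovered exception, at least one of the telnet attempts above will fail.

If you don't have telnet installed, you can even use curl.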
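
If neither telnet nor curl is available, the same two-way check can be approximated with a few lines of Python. This is only a minimal sketch: the function name can_connect is made up for illustration, and it only verifies that a TCP connection can be opened, not that Elasticsearch itself answers correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it mirrors what `telnet host port` tells you,
    nothing more. DNS failures, refused connections and timeouts all
    count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run can_connect("node2", 9300) on node1 and can_connect("node1", 9300) on node2, replacing the placeholder names with your real hostnames or IPs. Both calls should return True before the nodes can form a cluster.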

Hope this helps.


hel*_*y77 9

I solved it with this line:

network.publish_host: ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com

Every elasticsearch.yml config file must contain this line, set to that node's own hostname.
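
Why this helps can be read off the logs in the question: both nodes advertise 127.0.0.1:9300, so node2 ends up sending its join request to itself. As a rough sketch, the check below pulls the published transport address out of a node descriptor and flags loopback addresses. The function names are invented for illustration, and the {name}{id}{host}{host:port}{attributes} layout is assumed from the 2.x log lines shown above; other versions may format this differently.

```python
import ipaddress
import re


def published_transport_host(node_descriptor: str) -> str:
    """Extract the transport host from a node descriptor string.

    Assumed layout (taken from the 2.x logs in the question):
    {name}{id}{host}{host:port}{attributes}
    """
    fields = re.findall(r"\{([^{}]*)\}", node_descriptor)
    if len(fields) < 4:
        raise ValueError("unexpected node descriptor: " + node_descriptor)
    # The fourth field is "host:port"; split on the last colon only.
    host, _, _port = fields[3].rpartition(":")
    return host.strip("[]")


def publishes_loopback(node_descriptor: str) -> bool:
    """Return True if the node advertises a loopback address to its peers."""
    host = published_transport_host(node_descriptor)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); assume it is not loopback.
        return False
```

Feeding it the node1 descriptor from the question's log yields True, which is the symptom that setting network.publish_host removes.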

  • Has this been changed to "network.host" in recent versions? I don't see "network.publish_host" in the "elasticsearch.yml" that ships with "elasticsearch-7.10.1" (4 upvotes)

Anonymous 8

If you are using Elasticsearch 7,

update the elasticsearch.yml file located in /etc/elasticsearch:

node.name: "node-1" 

network.host: ec2-xx-xx-xx-xx.eu-west-1.compute.amazonaws.com

http.port: 9200

cluster.initial_master_nodes: ["node-1"]

Here node.name and the first value of cluster.initial_master_nodes should be the same.
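
The matching rule can be checked mechanically before starting the node. The sketch below assumes the settings have already been loaded into a plain dict keyed by the dotted setting names (loading the YAML file is left out), and check_initial_master_nodes is a hypothetical helper name, not part of any Elasticsearch tooling.

```python
def check_initial_master_nodes(settings: dict) -> list:
    """Return a list of problems found; an empty list means the check passed.

    `settings` is assumed to be a flat dict such as
    {"node.name": "node-1", "cluster.initial_master_nodes": ["node-1"]}.
    """
    problems = []
    name = settings.get("node.name")
    initial = settings.get("cluster.initial_master_nodes") or []
    if not name:
        problems.append("node.name is not set")
    if not initial:
        problems.append("cluster.initial_master_nodes is empty or missing")
    elif name and name not in initial:
        # Entries must match node.name exactly, including case.
        problems.append(
            f"node.name {name!r} is not listed in "
            f"cluster.initial_master_nodes {initial!r}"
        )
    return problems
```

For the configuration above the function returns an empty list; changing either value to something like "node-2" produces one reported problem.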


Anonymous 6

This may be the reason the master node is not discovered. If the EC2 instances are in the same VPC, provide the private IP in /etc/elasticsearch/elasticsearch.yml as follows:

cluster.initial_master_nodes: ["<PRIVATE-IP>"]

Note: after the above configuration change, restart the elasticsearch service, e.g. sudo service elasticsearch stop followed by sudo service elasticsearch start (if the OS is Ubuntu).