Elasticsearch nodes did not elect master even after failure to discover master


Elasticsearch nodes did not elect master even after failure to discover master

vachan_da
We recently had an AWS network outage affecting one of the 3 nodes in our ES cluster. The node ES2 announced itself as master and added the other two nodes, ES1 and ES3, to the cluster. Those nodes, however, failed to register with the master because of the network issue, and they neither elected a new master nor declared themselves master.

We use ES version 1.4.4 on Ubuntu 14.04 LTS, hosted on AWS m4.large instances.

Our ES configuration is:

cluster.name: example
bootstrap.mlockall: true
discovery.zen.minimum_master_nodes: 1
discovery.type: ec2
discovery.zen.ping.multicast.enabled: false
discovery.ec2.groups: es
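
For reference, the usual guidance for discovery.zen.minimum_master_nodes is a quorum of the master-eligible nodes, i.e. (N / 2) + 1 with integer division. The snippet below is only a minimal sketch of that arithmetic, assuming all three nodes here are master-eligible; the function name quorum is made up for illustration and is not part of Elasticsearch.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Return the majority size for a given number of master-eligible nodes.

    Hypothetical helper: it only implements the (N / 2) + 1 rule of thumb
    and does not talk to a cluster.
    """
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division, so 3 nodes -> 2 and 5 nodes -> 3.
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    # Assumption: ES1, ES2 and ES3 are all master-eligible.
    print(quorum(3))
```

With three master-eligible nodes this gives 2, whereas the configuration above uses 1. A value of 1 lets any single node form a cluster on its own, which is the setup usually associated with split-brain during a network partition.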

This is the log entry where the master node (ES2) detects and adds ES3:

[2015-10-21 15:59:58,612][INFO ][cluster.service          ] [ES2.localdomain] added {[ES3.localdomain][Vbbga0gMTVS43SIu_QLpUw][ES3.localdomain][inet[/10.0.0.103:9300]]{aws_availability_zone=ap-southeast-1a, max_local_storage_nodes=1},}, reason: zen-disco-receive(join from node[[ES1.localdomain][X9u5UwmYSg-pGB7PdGNFdA][ES1.localdomain][inet[/10.0.0.219:9300]]{aws_availability_zone=ap-southeast-1a, max_local_storage_nodes=1}])

These are the log entries on ES3, which keeps retrying to find a master over and over again:

[2015-10-21 16:00:06,181][DEBUG][action.admin.cluster.health] [ES3.localdomain] no known master node, scheduling a retry
[2015-10-21 16:00:36,182][DEBUG][action.admin.cluster.health] [ES3.localdomain] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2015-10-21 16:01:06,190][DEBUG][action.admin.cluster.health] [ES3.localdomain] no known master node, scheduling a retry
[2015-10-21 16:01:36,191][DEBUG][action.admin.cluster.health] [ES3.localdomain] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
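
To make the retry pattern in these entries easier to see, here is a rough Python sketch that splits each line into timestamp, level, component, node and message, then measures the gap between consecutive "no known master node" messages. The regular expression is an assumption based only on the lines shown in this post, not an official description of the Elasticsearch log format, and the names parse_line and retry_gaps are made up for illustration.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines quoted in this thread (an assumption).
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>\w+)\s*\]"
    r"\[(?P<component>[^\]]+?)\s*\]"
    r" \[(?P<node>[^\]]+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    fields["ts"] = datetime.strptime(fields["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return fields


def retry_gaps(lines, marker="no known master node"):
    """Seconds between consecutive log lines whose message contains marker."""
    parsed = (parse_line(line) for line in lines)
    stamps = [p["ts"] for p in parsed if p and marker in p["message"]]
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


SAMPLE = [
    "[2015-10-21 16:00:06,181][DEBUG][action.admin.cluster.health] [ES3.localdomain] no known master node, scheduling a retry",
    "[2015-10-21 16:00:36,182][DEBUG][action.admin.cluster.health] [ES3.localdomain] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]",
    "[2015-10-21 16:01:06,190][DEBUG][action.admin.cluster.health] [ES3.localdomain] no known master node, scheduling a retry",
    "[2015-10-21 16:01:36,191][DEBUG][action.admin.cluster.health] [ES3.localdomain] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]",
]

if __name__ == "__main__":
    print(retry_gaps(SAMPLE))
```

On the four lines above, the two "no known master node" messages are about 60 seconds apart, with a 30s timeout notification in between. Note that these entries come from action.admin.cluster.health, so they show a cluster health request waiting for a master rather than the discovery process itself.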

Any idea why master re-election didn't happen? And why was ES3 trying to detect a master instead of declaring itself the new master?

Re: Elasticsearch nodes did not elect master even after failure to discover master

mkbigdata
It is most likely the network issue, or the nodes couldn't locate the correct IP address (I think publish_host). Also make sure only one Elasticsearch instance is running.



(In reply to vachan_da's post of Thursday, October 22, 2015, 11:46 AM.)


Re: Elasticsearch nodes did not elect master even after failure to discover master

vachan_da
There is only one Elasticsearch instance running on each of the machines. I know there was a network issue and hence connections were timing out, but the question is why ES3 didn't declare itself master.