
Nodes fail to join cluster - potential split brain scenario


Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
8 node cluster running 0.20.0RC1, MINIMUM_MASTER_NODES is set to 5.
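
For context, `discovery.zen.minimum_master_nodes` is meant to be a strict majority of the master-eligible nodes, so for 8 nodes the quorum is (8 / 2) + 1 = 5, matching the value above. A sketch of the setting in elasticsearch.yml (exact file location varies by install):

```yaml
# elasticsearch.yml -- require a majority of the 8 master-eligible nodes
# before a master may be elected (8 / 2 + 1 = 5), to guard against split brain
discovery.zen.minimum_master_nodes: 5
```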

At a certain point, 2 nodes (search7 and search8) left the cluster. The reason is unknown; it occurred while increasing the replica count on a new index, but I am not focused on that right now. I stopped the process on both search7 and search8 and started them up one at a time.

Upon restarting search7, it seemed to think that search8 was the master. Since search8's process was still down, search7 did not join the cluster.

[2013-02-25 09:09:05,108][WARN ][discovery.zen ] [search7] failed to connect to master [[search8][GYhoDKLWRFOCy7KtUgVVQg][inet[/ipaddress:9300]]], retrying...
org.elasticsearch.transport.ConnectTransportException: [search8][inet[/ipaddress:9300]] connect_timeout[5s]
Next I started search8 and attempted to restart search7. Ignoring search8's logs for now, search7 now cannot join the cluster for other reasons (not master):

[2013-02-25 09:10:47,816][INFO ][discovery.zen ] [search7] failed to send join request to master [[search8][GYhoDKLWRFOCy7KtUgVVQg][inet[/ipaddress:9300]]], reason [org.elasticsearch.transport.RemoteTransportException: [search8][inet[/ipaddress:9300]][discovery/zen/join]; org.elasticsearch.ElasticSearchIllegalStateException: Node [[search8][feMQtDFOTs2xyh0xcTXdkA][inet[/ipaddress:9300]]] not master for join request from [[search7][4lEhStfwQDKuHpGvHeU-hQ][inet[/ipaddress:9300]]]]
[2013-02-25 09:10:47,816][TRACE][discovery.zen ] [search7] detailed failed reason
org.elasticsearch.transport.RemoteTransportException: [search8][inet[/ipaddress:9300]][discovery/zen/join]
Caused by: org.elasticsearch.ElasticSearchIllegalStateException: Node [[search8][feMQtDFOTs2xyh0xcTXdkA][inet[/ipaddress:9300]]] not master for join request from [[search7][4lEhStfwQDKuHpGvHeU-hQ][inet[/ipaddress:9300]]]


Focusing only on search7 for now: it is sending ping requests to all nodes in the cluster, and they all seem to respond that search8 is the master. The other six nodes have formed an ES cluster without search8 as the master. Why are they returning search8 as the master?

If this is a split-brain scenario, why didn't setting the minimum master nodes help? How does one recover from this scenario? We deleted the new index, and the cluster returned to a green state. I assume that deleting the data directories on search7 and search8 would have made the cluster go into a yellow state. What does it take for the master election process to start?
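
One way to confirm a split brain is to ask every node directly which node id it reports as master via `/_cluster/state` and compare the answers. A minimal sketch; the sample response below is hardcoded (id taken from the logs above) so it runs offline, but in a real check you would `curl "http://<node>:9200/_cluster/state"` against each node:

```shell
# Each node's /_cluster/state response carries the node id it believes
# is master. Parse the master_node field from a sample response fragment;
# diverging ids across nodes indicate a split brain.
state='{"cluster_name":"ESCluster","master_node":"GYhoDKLWRFOCy7KtUgVVQg"}'
master=$(echo "$state" | sed -n 's/.*"master_node":"\([^"]*\)".*/\1/p')
echo "master_node=$master"
```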

Cheers,

Ivan


--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
More split-brain weirdness on another cluster. Four-node cluster, no replicas on the indices. The cluster is in a red state because one node dropped out and, with no replicas, a couple of shards went missing. According to the cluster API, servers 1, 2, and 3 think they form a cluster, while server 4 thinks the cluster is formed by 1, 2, and 4. Restarting server 4 returned everything back to a green state.
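
A quick way to spot this disagreement is to compare each node's view of cluster health. A sketch that parses a sample response so it is self-contained; in practice you would run `curl "http://<node>:9200/_cluster/health"` on each of servers 1 through 4 and diff the answers:

```shell
# Parse status and node count from a /_cluster/health response. If nodes
# report different number_of_nodes values, their cluster views have diverged.
health='{"cluster_name":"ESCluster","status":"red","number_of_nodes":3}'
status=$(echo "$health" | sed -n 's/.*"status":"\([^"]*\)".*/\1/p')
nodes=$(echo "$health" | sed -n 's/.*"number_of_nodes":\([0-9]*\).*/\1/p')
echo "status=$status number_of_nodes=$nodes"
```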

Help on red-state clusters would be appreciated.

-- 
Ivan



Re: Nodes fail to join cluster - potential split brain scenario

Clinton Gormley-2
On Tue, 2013-02-26 at 14:59 -0800, Ivan Brusic wrote:
> More split-brain weirdness on another cluster. Four node cluster, no
> replicas on the indices. Cluster is in a red-state because one node
> dropped out and there are no replicas, so a couple of shards were
> missing. Using the cluster API, servers 1,2,3 think they form a
> cluster. Server 4 thinks the cluster is formed between 1,2,4.
> Restarting server 4 returned everything back to a green state.

How long are you waiting before asking server 4 for the cluster state?
It doesn't fail immediately, in case there is just a temporary network
outage which it can recover from, but after a while it should recognise
that it is no longer part of the cluster.
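
The window before a node notices it has lost the master is controlled by the zen fault-detection settings; a sketch with illustrative values (defaults differ by version, so treat these as assumptions to check against your release's docs):

```yaml
# elasticsearch.yml -- zen fault detection (values illustrative)
discovery.zen.fd.ping_interval: 1s   # how often a node pings the master
discovery.zen.fd.ping_timeout: 30s   # how long to wait for each ping
discovery.zen.fd.ping_retries: 3     # failed pings before giving up
```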

Try running all of the logs through es2unix using the "lifecycle" option
https://github.com/elasticsearch/es2unix

clint




Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
I waited an hour before asking for its state. In both cases, ElasticSearch fails to recognize the correct cluster. Nodes removing themselves is ultimately the bigger concern, but harder to debug.



Re: Nodes fail to join cluster - potential split brain scenario

Clinton Gormley-2
Hi Ivan

On Wed, 2013-02-27 at 07:38 -0800, Ivan Brusic wrote:
> Waited an hour before asking for its state. In both cases,
> ElasticSearch fails to recognize the correct cluster. Nodes removing
> themselves is ultimately the bigger concern, but harder to debug.

We really need more information here. What version of ES are you using?
Did you manage to run the logs through 'lifecycle' on es2unix?

clint


Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
The version is the same as in the first post: 0.20RC1. I have not seen any relevant commits since then. I am far more interested in the failure from the first post than the second.

I will use es2unix tomorrow. In the meantime, here are some logs about the second failure. In the last cluster state, you can see that the master is wrong.

Cheers,

Ivan



Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
In reply to this post by Clinton Gormley-2
Here is the output from the second cluster. Note that the numbering is slightly different (+2).

search3 ~]$ es lifecycle /var/elasticsearch/logs/ESCluster.log.2013-02-26
2013-02-26 13:32:34,869 search3 REMOVE search5
2013-02-26 14:52:57,046 search3 ADD    search6

search4 ~]$ es lifecycle /var/elasticsearch/logs/ESCluster.log.2013-02-26
2013-02-26 13:32:34,878 search4 REMOVE search5
2013-02-26 14:52:57,043 search4 ADD    search6

search5 ~]$ es lifecycle /var/elasticsearch/logs/ESCluster.log.2013-02-26
2013-02-26 14:52:57,041 search5 ADD    search6

search6 ~]$ es lifecycle /var/elasticsearch/logs/ESCluster.log.2013-02-26
2013-02-26 13:32:34,889 search6 REMOVE search5
2013-02-26 14:52:09,542 search6 STOP
2013-02-26 14:52:48,637 search6 INIT   0.20.0.RC1
2013-02-26 14:52:53,754 search6 BIND   <ipaddress>:9300
2013-02-26 14:52:57,083 search6 MASTER search5
2013-02-26 14:52:57,199 search6 START




Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic
Output for the original post, first cluster: https://gist.github.com/brusic/a80e414efec0bee2e3c7



Re: Nodes fail to join cluster - potential split brain scenario

Drew Raines-2
In reply to this post by Ivan Brusic
Ivan Brusic wrote:

> Here is the output from the second cluster. Note that the numbering is
> slightly different (+2)

Just FYI, the purpose of the lifecycle command is to interleave logs
to replay the timeline of nodes coming and going.  If you supply the
logfiles as arguments to a single command, you should see something
like:

  % es lifecycle \
  <(ssh search3 cat /var/elasticsearch/logs/ESCluster.log.2013-02-26) \
  <(ssh search4 cat /var/elasticsearch/logs/ESCluster.log.2013-02-26) \
  <(ssh search5 cat /var/elasticsearch/logs/ESCluster.log.2013-02-26) \
  <(ssh search6 cat /var/elasticsearch/logs/ESCluster.log.2013-02-26)
  2013-02-26 13:32:34,869 search3 REMOVE search5
  2013-02-26 13:32:34,878 search4 REMOVE search5
  2013-02-26 13:32:34,889 search6 REMOVE search5
  2013-02-26 14:52:09,542 search6 STOP
  2013-02-26 14:52:48,637 search6 INIT   0.20.0.RC1
  2013-02-26 14:52:53,754 search6 BIND   <ipaddress>:9300
  2013-02-26 14:52:57,041 search5 ADD    search6
  2013-02-26 14:52:57,043 search4 ADD    search6
  2013-02-26 14:52:57,046 search3 ADD    search6
  2013-02-26 14:52:57,083 search6 MASTER search5
  2013-02-26 14:52:57,199 search6 START

Preferably you'd do it over a few days' worth of logs to get a real
picture of when things occurred.  Perhaps not in this case, but for a
long and complex history this is much easier to parse.

-Drew


Re: Nodes fail to join cluster - potential split brain scenario

Clinton Gormley-2
In reply to this post by Ivan Brusic
Hiya

On Wed, 2013-02-27 at 23:47 -0800, Ivan Brusic wrote:
> The version is the same as the first post: 0.20RC1.

Ah, missed that.

> I have not seen any relevant commits since then. I am far more
> interested in the failure of the first post and not the second.

What about this one?
https://github.com/elasticsearch/elasticsearch/issues/2592

I suggest upgrading to 0.20.5

clint




Re: Nodes fail to join cluster - potential split brain scenario

Ivan Brusic

On Fri, Mar 1, 2013 at 5:17 AM, Clinton Gormley <[hidden email]> wrote:

Hey,

Thanks, Drew. I haven't had a chance to explore the tool yet. What I would love to see is master election events.

>> I have not seen any relevant commits since then. I am far more
>> interested in the failure of the first post and not the second.

> What about this one?
> https://github.com/elasticsearch/elasticsearch/issues/2592

That issue does not address the split brain scenario or why a node disconnected in the first place. These two issues are relevant:
 

> I suggest upgrading to 0.20.5

The issues above have not been addressed, and I do not see anything else related to cluster management in the commits/issues. A rolling upgrade is possible, so I will probably upgrade anyway, but I fear that the issues will remain.

Cheers,

Ivan 
