Split brain due to 'on the fence' network partition

Mark Tinsley
Hi all,

I have been seeing some strange behaviour while running Elasticsearch on AWS.

The setup is three nodes, each with the following settings:
cluster.name: <clustername>
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled : false
discovery.type : ec2
discovery.ec2.ping_timeout : 30s
discovery.ec2.groups: <group>
cloud.aws.region : <region>
action.disable_delete_all_indices : true
discovery.zen.minimum_master_nodes : 2

I have witnessed two occurrences of the following, given three nodes A, B and C, all in the same availability zone:
  1. To start with, all nodes are connected in the cluster. A is the master.
  2. For some reason, nodes A and B can no longer talk to each other, but both can still talk to C, and C can still talk to both A and B, i.e. an 'on the fence' network partition, since C can still see every node:
    A:
    [2013-11-17 20:23:28,257][INFO ][cluster.service          ] [A] removed {[B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]],}, reason: zen-disco-node_failed([B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]]), reason failed to ping, tried [3] times, each with maximum [30s] timeout
    B:
    [2013-11-17 20:25:27,543][INFO ][discovery.ec2            ] [B] master_left [[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]], reason [failed to ping, tried [3] times, each with  maximum [30s] timeout]
    [2013-11-17 20:25:27,547][INFO ][cluster.service          ] [B] master {new [B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]], previous [A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]}, removed {[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]],}, reason: zen-disco-master_failed ([A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]])
    C:
    [2013-11-17 20:23:28,256][INFO ][cluster.service          ] [C] removed {[B][sUv4amcFSdmaDAVDa7bUVg][inet[/<ipaddress>:9300]],}, reason: zen-disco-receive(from master [[A][O25rauSQR7utohD0jg4RQw][inet[/<ipaddress>:9300]]])

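As you can see, B has elected itself as a new master, but A has not stepped down as master: A can still see C, so from A's point of view the minimum_master_nodes criterion of 2 is still satisfied. B can also still see C, so it satisfies the same criterion.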
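
The reasoning above can be illustrated with a small model. This is only a sketch and not Elasticsearch's zen discovery code: the names `reachable` and `has_quorum` are made up for illustration, and the model assumes each node simply counts the master-eligible nodes it can reach (itself included) and compares that count with minimum_master_nodes.

```python
# Simplified model of the partial ("on the fence") partition described above.
# All names here are hypothetical; this is not Elasticsearch code.

NODES = {"A", "B", "C"}
MINIMUM_MASTER_NODES = 2

# Links that are down: A and B cannot talk to each other.
BROKEN_LINKS = {frozenset({"A", "B"})}


def reachable(node):
    """Return the nodes that `node` can see, including itself."""
    return {
        other
        for other in NODES
        if other == node or frozenset({node, other}) not in BROKEN_LINKS
    }


def has_quorum(node, minimum=MINIMUM_MASTER_NODES):
    """True if `node` sees at least `minimum` master-eligible nodes."""
    return len(reachable(node)) >= minimum


for node in sorted(NODES):
    print(node, "sees", sorted(reachable(node)), "quorum:", has_quorum(node))
```

In this model both A and B count two visible nodes, so a purely local count cannot tell either of them to step down.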

When I ask B for its state, it responds that it is the master, with C in its cluster.

When I ask A for its state, it responds that it is the master, with C in its cluster.

When I ask C for its state, it responds with the same cluster state as A.
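
A check like this could be scripted. The following is a minimal sketch under a few assumptions: that each node answers on the default HTTP port 9200, that the cluster state API accepts a `local=true` parameter so the node answers from its own view, and that the response carries a top-level `master_node` field holding the master's node ID. The function names are hypothetical, and the sample data uses the node IDs from the log lines above.

```python
import json
from urllib.request import urlopen


def reported_master(host, port=9200, timeout=5):
    """Ask one node which node ID it currently considers the master.

    Assumes GET /_cluster/state?local=true returns JSON with a
    top-level "master_node" key.
    """
    url = "http://%s:%d/_cluster/state?local=true" % (host, port)
    with urlopen(url, timeout=timeout) as response:
        state = json.loads(response.read().decode("utf-8"))
    return state["master_node"]


def distinct_masters(masters_by_node):
    """Return the set of distinct master IDs reported across the nodes."""
    return set(masters_by_node.values())


def is_split_brain(masters_by_node):
    """More than one distinct master means the nodes disagree."""
    return len(distinct_masters(masters_by_node)) > 1


# Views as described in this post: A and C follow A, B follows itself.
observed = {
    "A": "O25rauSQR7utohD0jg4RQw",
    "B": "sUv4amcFSdmaDAVDa7bUVg",
    "C": "O25rauSQR7utohD0jg4RQw",
}
print(is_split_brain(observed))
```

Against a live cluster, `observed` would be built by calling `reported_master` once per node address instead of being hard-coded.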

This can be reproduced by setting up three nodes with the settings above and then, once a master has been elected, dropping the connection between it and the node you expect to become the next master (usually the next node in the list after the master). I used the following commands:

On the master node (A): iptables -A INPUT -s <node B ip address> -j DROP

On the next node (B): iptables -A INPUT -s <node A ip address> -j DROP

This should put you in the same state that I have witnessed on AWS. Once two masters are established, remove the iptables entries (by running iptables -F on A and B). From what I understand, node discovery only happens when a node is starting up or does not belong to a cluster, so because both nodes still believe they belong to a cluster, they never discover each other again.

I have tried this against Elasticsearch versions 0.90.0, 0.90.4, 0.90.7 and 1.0.0.Beta1, and the problem occurs in all of them. I was using the elasticsearch-cloud-aws plugin version 1.11.0 with Elasticsearch 0.90.0, and version 1.15.0 with Elasticsearch 0.90.4, 0.90.7 and 1.0.0.Beta1.

I do not want to set minimum_master_nodes to 3, because for this use case I value availability.
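
The availability trade-off can be made concrete with a little arithmetic. This is a sketch under the assumption that a node can only become or remain master while it sees at least minimum_master_nodes master-eligible nodes; the helper names are hypothetical.

```python
def majority(master_eligible_nodes):
    """Smallest number of nodes that forms a strict majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes, minimum_master_nodes):
    """How many nodes can be lost while a master can still be elected."""
    return max(master_eligible_nodes - minimum_master_nodes, 0)


for minimum in (2, 3):
    print(
        "minimum_master_nodes =", minimum,
        "-> tolerated node failures:", tolerated_failures(3, minimum),
    )
```

With three nodes, a setting of 2 is the majority and tolerates the loss of one node, while a setting of 3 tolerates none: in the partition described above, A and B would each see only two nodes and neither could act as master.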

Any help would be greatly appreciated.


Kind Regards,

Mark Tinsley


--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Split brain due to 'on the fence' network partition

dadoonet
I think you should open an issue in the elasticsearch project with that excellent description you wrote.
I don't know how it could be fixed, though.

-- 
David Pilato | Technical Advocate | Elasticsearch.com


On 20 November 2013 at 10:52:11, Mark Tinsley ([hidden email]) wrote:

Re: Split brain due to 'on the fence' network partition

Leonardo Menezes
This issue is already reported here: https://github.com/elasticsearch/elasticsearch/issues/2488

No solution yet, though.

On Wed, Nov 20, 2013 at 4:36 PM, David Pilato <[hidden email]> wrote:

Re: Split brain due to 'on the fence' network partition

dadoonet
Ha thanks Leonardo!

-- 
David Pilato | Technical Advocate | Elasticsearch.com


On 20 November 2013 at 16:50:50, Leonardo Menezes ([hidden email]) wrote:

Re: Split brain due to 'on the fence' network partition

Mark Tinsley
In reply to this post by Mark Tinsley
Thanks for the replies. I'll take a look at the elasticsearch-zookeeper solution.

Cheers,

On Wednesday, November 20, 2013 9:52:07 AM UTC, Mark Tinsley wrote: