Read/Write consistency

Read/Write consistency

Mohit Anchlia
Trying to understand the following consistency scenarios in Elasticsearch:

1) sync replication - How does Elasticsearch deal with consistency issues that may arise when one node momentarily goes down and misses writes? When the node comes back up, could reads going to the non-primary shards return inconsistent data?
2) async replication - What happens if replication is slow for some reason? Could users see inconsistent data?
3) sync/async replication - How does Elasticsearch keep data in sync for writes that never reached the non-primary shard because of network or node failures?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAOT3TWrpUEpVU2yg3km_v%3DtuA0duSiFV5HYnPyeCztdmrTcMsA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Re: Read/Write consistency

Mohit Anchlia
Could somebody share some insight on this topic?

On Mon, Apr 28, 2014 at 4:57 PM, Mohit Anchlia <[hidden email]> wrote:
[...]

Re: Read/Write consistency

Radu Gheorghe-2
Hi Mohit,

I'll answer inline.

On Mon, Apr 28, 2014 at 4:57 PM, Mohit Anchlia <[hidden email]> wrote:
> Trying to understand the following consistency scenarios in Elasticsearch:
>
> 1) sync replication - How does Elasticsearch deal with consistency issues that may arise when one node momentarily goes down and misses writes?

This depends on the write consistency setting. By default, the operation only succeeds if a quorum of the shard's copies (the primary plus its replicas) is available to index the document.
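
As a rough sketch of that quorum rule: the formula below is an assumption based on how the documentation of that era describes it (a majority of primary plus configured replicas, with the check relaxed when there are fewer than two replicas). The function name `write_quorum` is made up for illustration and is not part of any Elasticsearch API.

```python
def write_quorum(number_of_replicas: int) -> int:
    """Shard copies that must be available before a write goes ahead.

    Assumed rule, not taken from the Elasticsearch source code.
    """
    copies = 1 + number_of_replicas  # the primary plus its configured replicas
    if number_of_replicas <= 1:
        # Assumption: with zero or one replica the check relaxes to one copy,
        # so a small cluster can keep accepting writes.
        return 1
    # A strict majority of all configured copies.
    return copies // 2 + 1


for replicas in range(5):
    print(replicas, "replica(s) ->", write_quorum(replicas), "copies needed")
```

So with two replicas (three copies in total), losing one node still leaves the two copies the write needs.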
 
> When the node comes back up, could reads going to the non-primary shards return inconsistent data?

No, when the node comes back up it will sync the operations it missed from the other nodes.

> 2) async replication - What happens if replication is slow for some reason? Could users see inconsistent data?

Yes, if you hit a shard that didn't get the latest operation, you could see an "old" version of the data. You can use the "preference" search parameter to try to hit the primary shard all the time, but then your replicas will just be sitting there for redundancy.
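
For illustration, a search request with that parameter could be built like this. The host and index name are placeholders, and the helper `search_url` is hypothetical. The `_primary` value is the one accepted by releases of that era; as far as I know it was removed in later major versions, so check the documentation for the version you run.

```python
from urllib.parse import urlencode


def search_url(base_url, index, preference=None):
    """Build a _search URL, optionally pinning the shard copies to query."""
    url = f"{base_url.rstrip('/')}/{index}/_search"
    if preference is not None:
        url += "?" + urlencode({"preference": preference})
    return url


# Placeholder host and index name.
print(search_url("http://localhost:9200", "myindex", preference="_primary"))
```

The trade-off is the one described above: every read lands on the primary, so the replicas add redundancy but no read capacity.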
 
> 3) sync/async replication - How does Elasticsearch keep data in sync for writes that never reached the non-primary shard because of network or node failures?

It either replays the transaction log or transfers the whole shard to that node.
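
To make those two options concrete, here is a toy model of the decision. It is not how Elasticsearch tracks recovery internally: the `applied` counter, the `ShardCopy` class and the `catch_up` function are all hypothetical, and a real shard is a set of index files rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCopy:
    docs: dict = field(default_factory=dict)  # doc id -> document source
    applied: int = 0  # hypothetical count of operations this copy has applied


def catch_up(replica, primary, retained_ops, first_retained):
    """Bring a stale replica in line with the primary (toy model).

    retained_ops holds the (doc_id, source) operations the primary still has
    in its log; first_retained is the position of the oldest one.
    """
    if replica.applied >= first_retained:
        # Everything the replica missed is still in the log: replay only that.
        for doc_id, source in retained_ops[replica.applied - first_retained:]:
            replica.docs[doc_id] = source
        outcome = "replayed log"
    else:
        # Some missed operations are gone from the log: copy the whole shard.
        replica.docs = dict(primary.docs)
        outcome = "copied shard"
    replica.applied = primary.applied
    return outcome


primary = ShardCopy({"a": 9, "b": 2, "c": 3}, applied=4)
log = [("c", 3), ("a", 9)]  # operations 2 and 3 are still retained
print(catch_up(ShardCopy({"a": 1, "b": 2}, applied=2), primary, log, 2))
print(catch_up(ShardCopy({"a": 1}, applied=1), primary, log, 2))
```

The first replica only missed operations that are still in the log, so it replays them; the second one fell too far behind and gets a full copy.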

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

Re: Read/Write consistency

Mohit Anchlia
What's not clear is how Elasticsearch identifies which pieces of data are missing between the primary and the replica.

On Wed, Apr 30, 2014 at 3:27 AM, Radu Gheorghe <[hidden email]> wrote:
[...]
Re: Read/Write consistency

Radu Gheorghe-2
Hi Mohit,

I think the transaction log takes care of that, because every instance of the same shard keeps a copy of it, and those instances need to be in sync.

Best regards,
Radu


On Thu, May 1, 2014 at 9:57 PM, Mohit Anchlia <[hidden email]> wrote:
[...]
Re: Read/Write consistency

Mohit Anchlia
Is there any documentation on that? From what I've read, the transaction log is local to the node.

On Thu, May 1, 2014 at 11:57 PM, Radu Gheorghe <[hidden email]> wrote:
[...]