inconsistent paging

inconsistent paging

Ron Sher
Hi,

We've noticed strange behavior in Elasticsearch during paging.

In one case we use a page size of 60 and have 63 documents. The first page uses size 60 and offset 0; the second page uses size 60 and offset 60. The results are inconsistent: the second page sometimes contains results that already appeared on the first page.

The query sorts by a numeric field on which many documents share the same value (0).
It looks like the relative order of documents that share that value isn't consistent.
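
To make the symptom concrete, here is a minimal pure-Python sketch. It does not call Elasticsearch. It assumes two copies of the same 63 documents that break ties in a different order, and reads page 1 from one copy and page 2 from the other. The document names, the `make_copy` and `page` helpers, and the reversed ordering are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical data: 63 documents that all share the same sort value (0).
DOCS: List[str] = [f"doc-{i}" for i in range(63)]


def make_copy(order: List[str]) -> Dict[str, int]:
    """Map each document to the tie-breaking position it has in one copy."""
    return {doc: position for position, doc in enumerate(order)}


# Assumption: the two copies hold the same documents but order ties differently.
copy_a = make_copy(DOCS)
copy_b = make_copy(list(reversed(DOCS)))


def page(copy: Dict[str, int], offset: int, size: int) -> List[str]:
    """Sort by (sort value, tie-breaker) and return one page of results."""
    ranked = sorted(copy, key=lambda doc: (0, copy[doc]))  # sort value is always 0
    return ranked[offset:offset + size]


page1 = page(copy_a, 0, 60)    # first request answered by copy A
page2 = page(copy_b, 60, 60)   # second request answered by copy B

# Documents that show up on both pages.
repeated = sorted(set(page1) & set(page2))
print(repeated)
```

In this sketch every document on the second page was already on the first page, while reading both pages from the same copy gives no overlap.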

Has anyone encountered this behavior? Any suggestions for resolving it?

We're using version 1.3.1. 

Thanks, 
Ron

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKHuyJpcYKepYzh%2BBU2MSD2RQ19zjHYiXgf3anWBL9esq9fkGQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: inconsistent paging

dadoonet
You need to use scroll if you have that requirement.
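
As a rough illustration of what a scroll-based flow could look like, the sketch below only builds the two request URLs and sends nothing. The host, the index name `myindex`, the `1m` keep-alive and both helper functions are assumptions made for the example, so check the scroll documentation for your version before relying on the exact parameters.

```python
from urllib.parse import urlencode


def first_scroll_url(base_url: str, index: str, keep_alive: str = "1m", size: int = 60) -> str:
    """URL of the initial search that opens a scroll context (hypothetical helper)."""
    query = urlencode({"scroll": keep_alive, "size": size})
    return f"{base_url}/{index}/_search?{query}"


def next_scroll_url(base_url: str, scroll_id: str, keep_alive: str = "1m") -> str:
    """URL that fetches the next page using the scroll id from the previous response."""
    query = urlencode({"scroll": keep_alive, "scroll_id": scroll_id})
    return f"{base_url}/_search/scroll?{query}"


# Assumed host and index; "abc123" stands in for a scroll id returned by the server.
print(first_scroll_url("http://localhost:9200", "myindex"))
print(next_scroll_url("http://localhost:9200", "abc123"))
```

The sort clause would go in the body of the first request only; later pages are fetched with the scroll id alone.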


Re: inconsistent paging

Adrien Grand-2
In reply to this post by Ron Sher
Hi Ron,

The cause of this issue is that Elasticsearch uses Lucene's internal doc IDs as tie-breakers. Internal doc IDs might be completely different across replicas of the same data, so this explains why documents that have the same sort values are not consistently ordered.

There are 2 potential ways to fix that problem:
 1. Use scroll, as David mentioned. It creates a context around your request and makes sure that the same shards are used for all pages. However, it also gives another guarantee, namely that the same point-in-time view of the index is used for each page, and this is expensive to maintain.
 2. Use a custom string value as a preference in order to always hit the same shards for a given session[1]. Like option 1, this makes every page hit the same shards, but without the additional cost of a scroll.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-preference.html
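
For option 2, a minimal sketch of how the paged requests might be built so that every page of a session carries the same preference string. Nothing is sent over the network. The host, the index name `myindex`, the session id `session-42` and the `search_url` helper are assumptions for illustration only.

```python
from urllib.parse import urlencode


def search_url(base_url: str, index: str, session_id: str, offset: int, size: int) -> str:
    """Build a search URL whose preference is the caller's session id (hypothetical helper)."""
    query = urlencode({"preference": session_id, "from": offset, "size": size})
    return f"{base_url}/{index}/_search?{query}"


# Both pages reuse the same preference value, so they should be routed
# to the same shard copies for the duration of the session.
first_page = search_url("http://localhost:9200", "myindex", "session-42", 0, 60)
second_page = search_url("http://localhost:9200", "myindex", "session-42", 60, 60)
print(first_page)
print(second_page)
```

Any stable per-user value, such as a session id, can serve as the preference string; the sort itself stays in the request body as before.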



--
Adrien Grand

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j7FJofXSpDjHnpMVs1poHFREbrQ9DPnPX4YnjFjUKg_ng%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: inconsistent paging

Ron Sher
Thanks for the answer, and sorry for the duplicate (posted from a different source by mistake).
