How to fix primary-replica inconsistency?


How to fix primary-replica inconsistency?

arta
Hi,
I have the same problem as described here:
http://elasticsearch-users.115913.n3.nabble.com/BUG-Alternating-result-set-across-every-query-tt4021027.html

The same search query returns one document, then next time it returns none, and alternates.
If I add preference=_primary_first then I get one document every time.
So I think the primary shard has the document but the replica does not. (I have 1 replica)
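For illustration (index, field and value here are just placeholders for my real ones), the two requests look roughly like this; the first alternates between 1 and 0 hits, the second always returns the document:
curl 'localhost:9200/myindex/_search?q=somefield:somevalue&pretty=true'
curl 'localhost:9200/myindex/_search?q=somefield:somevalue&preference=_primary_first&pretty=true'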

My question here is how to fix this problem.
The discussion in the link above does not have a solution.

In addition, is there any config parameter that specifies how often, or in what situations, primary shards' contents are reflected to their replicas?

Thanks for your help.

Re: How to fix primary-replica inconsistency?

qjh
I have been seeing this same issue on a few indices.  The primary and replica are divergent and nothing seems to resolve it (they have been refreshed and optimized).  I've since worked around this by recreating the index.

Does anyone have a good cluster state dump that could be used to open an issue?  There doesn't appear to be one for this yet:  https://github.com/elasticsearch/elasticsearch/issues
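For example, the output of the cluster state and index status APIs, captured while an index is in the divergent state:
curl 'localhost:9200/_cluster/state?pretty=true'
curl 'localhost:9200/_status?pretty=true'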


Re: How to fix primary-replica inconsistency?

Kurt Harriger
Same issue here and same resolution at the moment.  I just haven't had the time yet to dig deep into this issue, but it's something we will need to dig into soon.  At the moment we're limping along with ?preference=_primary_first to prevent users from seeing the inconsistencies, so our replica shards basically only act as a hot standby in case the master fails.  We already had code that writes to multiple indices simultaneously, which we used to migrate from Solr and/or make schema changes, so we just use this code to keep two indices up to date.  This way, when we discover an issue with one index, we switch to the other, drop the broken one, and rebuild it without affecting our users.

We currently use the _status endpoint to identify whether the replicas are out of sync.  I tried to write a script to do this and force an index failover and reindex automatically, but I wasn't sure how to determine whether the num_docs differed because the replica shard hadn't yet applied all the change sets or because it was actually out of sync.
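Roughly, that check is just pulling the index status and comparing docs.num_docs between the two copies of each shard in the response (index name below is a placeholder):
curl 'localhost:9200/myindex/_status?pretty=true'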



Re: How to fix primary-replica inconsistency?

arta
Thanks for the reply, qjh, Kurt.
Seems like there is no way other than reindexing to fix the problem.

Kurt mentioned num_docs; I suppose that means we can use curl with _count to determine whether there is an inconsistency between shards.
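For example (hypothetical index name), comparing hits.total between a default round-robin request and a primary-only one while nothing is being indexed:
curl 'localhost:9200/myindex/_search?size=0&pretty=true'
curl 'localhost:9200/myindex/_search?size=0&preference=_primary&pretty=true'
If the totals keep differing, the shard copies are presumably out of sync.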

I have millions of documents.
If there is a way to find out which documents are only in the primary or only in the replica, that information would make the reindexing a lot more efficient.

Re: How to fix primary-replica inconsistency?

arta
Kurt,
Can you please elaborate on "We currently use the _status endpoint to identify whether the replicas are out of sync" a little more?
So you compare each shard's num_docs?

As you mentioned, while indexing is running, it will be difficult to distinguish the cause of the number difference, i.e. whether it is inconsistency or replication delay.
What is your strategy to automatically discover the inconsistency?

Thanks again for your help!

Re: How to fix primary-replica inconsistency?

Kurt Harriger
Yep, I basically just look to see how different the num_docs is.  There is also a translog id; I would assume that if the master and replica shard have the same translog id then they should have the same num_docs.  I was thinking of writing a script that looks for any shards that have the same translog id but different num_docs, but for now I just check it manually.  In any event, here is a typical response I get back that appears out of sync to me.

{
  "ok": true,
  "_shards": { "total": 20, "successful": 20, "failed": 0 },
  "indices": {
    "default2": {
      "index": { "primary_size": "3.5gb", "primary_size_in_bytes": 3836722124, "size": "7gb", "size_in_bytes": 7606933703 },
      "translog": { "operations": 1621 },
      "docs": { "num_docs": 3065297, "max_doc": 3462692, "deleted_docs": 397395 },
      "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 43004, "total_time": "3h", "total_time_in_millis": 10957007, "total_docs": 71491119, "total_size": "79.9gb", "total_size_in_bytes": 85824354912 },
      "refresh": { "total": 226327, "total_time": "1.1h", "total_time_in_millis": 4198337 },
      "flush": { "total": 3828, "total_time": "59.9m", "total_time_in_millis": 3596409 },
      "shards": {
        "0": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 0, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "373.5mb", "size_in_bytes": 391702523 },
          "translog": { "id": 1347901831292, "operations": 74 },
          "docs": { "num_docs": 307291, "max_doc": 353087, "deleted_docs": 45796 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2143, "total_time": "12.8m", "total_time_in_millis": 770853, "total_docs": 3179382, "total_size": "3.5gb", "total_size_in_bytes": 3843614285 },
          "refresh": { "total": 11204, "total_time": "5.2m", "total_time_in_millis": 317926 },
          "flush": { "total": 192, "total_time": "4.7m", "total_time_in_millis": 283969 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 0, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "362.3mb", "size_in_bytes": 379899395 },
          "translog": { "id": 1347901831292, "operations": 74 },
          "docs": { "num_docs": 303379, "max_doc": 343611, "deleted_docs": 40232 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2173, "total_time": "5.4m", "total_time_in_millis": 329683, "total_docs": 3825448, "total_size": "4.2gb", "total_size_in_bytes": 4573205157 },
          "refresh": { "total": 11583, "total_time": "1.7m", "total_time_in_millis": 107880 },
          "flush": { "total": 192, "total_time": "1.7m", "total_time_in_millis": 107054 }
        } ],
        "1": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 1, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "363.6mb", "size_in_bytes": 381298886 },
          "translog": { "id": 1347901831415, "operations": 96 },
          "docs": { "num_docs": 306295, "max_doc": 344589, "deleted_docs": 38294 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2148, "total_time": "13.5m", "total_time_in_millis": 815476, "total_docs": 3661059, "total_size": "4.1gb", "total_size_in_bytes": 4447223751 },
          "refresh": { "total": 11265, "total_time": "5.2m", "total_time_in_millis": 312026 },
          "flush": { "total": 191, "total_time": "4.3m", "total_time_in_millis": 258892 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 1, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "378.3mb", "size_in_bytes": 396779935 },
          "translog": { "id": 1347901831415, "operations": 96 },
          "docs": { "num_docs": 302098, "max_doc": 356028, "deleted_docs": 53930 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2168, "total_time": "4.9m", "total_time_in_millis": 297319, "total_docs": 3164854, "total_size": "3.5gb", "total_size_in_bytes": 3803854038 },
          "refresh": { "total": 11656, "total_time": "1.7m", "total_time_in_millis": 107995 },
          "flush": { "total": 191, "total_time": "1.8m", "total_time_in_millis": 109000 }
        } ],
        "2": [ {
          "routing": { "state": "STARTED", "primary": false, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 2, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "341.9mb", "size_in_bytes": 358525806 },
          "translog": { "id": 1347901772812, "operations": 90 },
          "docs": { "num_docs": 305291, "max_doc": 328203, "deleted_docs": 22912 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2138, "total_time": "12.3m", "total_time_in_millis": 741442, "total_docs": 3817398, "total_size": "4.2gb", "total_size_in_bytes": 4560351643 },
          "refresh": { "total": 11297, "total_time": "5.1m", "total_time_in_millis": 311758 },
          "flush": { "total": 191, "total_time": "3.4m", "total_time_in_millis": 208984 }
        }, {
          "routing": { "state": "STARTED", "primary": true, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 2, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "355.3mb", "size_in_bytes": 372645152 },
          "translog": { "id": 1347901772813, "operations": 90 },
          "docs": { "num_docs": 306395, "max_doc": 339658, "deleted_docs": 33263 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2186, "total_time": "5.1m", "total_time_in_millis": 310026, "total_docs": 3408416, "total_size": "3.8gb", "total_size_in_bytes": 4086555557 },
          "refresh": { "total": 11587, "total_time": "1.7m", "total_time_in_millis": 102844 },
          "flush": { "total": 192, "total_time": "1.6m", "total_time_in_millis": 96582 }
        } ],
        "3": [ {
          "routing": { "state": "STARTED", "primary": false, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 3, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "401.3mb", "size_in_bytes": 420864740 },
          "translog": { "id": 1347901772798, "operations": 67 },
          "docs": { "num_docs": 305251, "max_doc": 375754, "deleted_docs": 70503 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2124, "total_time": "12.4m", "total_time_in_millis": 747821, "total_docs": 3515161, "total_size": "3.9gb", "total_size_in_bytes": 4238118231 },
          "refresh": { "total": 11146, "total_time": "5.2m", "total_time_in_millis": 317066 },
          "flush": { "total": 191, "total_time": "3.9m", "total_time_in_millis": 235077 }
        }, {
          "routing": { "state": "STARTED", "primary": true, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 3, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "404.1mb", "size_in_bytes": 423736704 },
          "translog": { "id": 1347901772799, "operations": 69 },
          "docs": { "num_docs": 306550, "max_doc": 377493, "deleted_docs": 70943 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2175, "total_time": "5.2m", "total_time_in_millis": 316727, "total_docs": 3449317, "total_size": "3.8gb", "total_size_in_bytes": 4152370061 },
          "refresh": { "total": 11438, "total_time": "1.8m", "total_time_in_millis": 109234 },
          "flush": { "total": 192, "total_time": "2.1m", "total_time_in_millis": 126846 }
        } ],
        "4": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 4, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "381.6mb", "size_in_bytes": 400202939 },
          "translog": { "id": 1347901831289, "operations": 98 },
          "docs": { "num_docs": 305897, "max_doc": 359662, "deleted_docs": 53765 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2138, "total_time": "12.4m", "total_time_in_millis": 748422, "total_docs": 3747681, "total_size": "4.1gb", "total_size_in_bytes": 4482284498 },
          "refresh": { "total": 11228, "total_time": "5.3m", "total_time_in_millis": 319209 },
          "flush": { "total": 190, "total_time": "4m", "total_time_in_millis": 243675 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 4, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "350.3mb", "size_in_bytes": 367367654 },
          "translog": { "id": 1347901831290, "operations": 97 },
          "docs": { "num_docs": 302085, "max_doc": 331367, "deleted_docs": 29282 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2160, "total_time": "5.4m", "total_time_in_millis": 325400, "total_docs": 3663609, "total_size": "4gb", "total_size_in_bytes": 4395019498 },
          "refresh": { "total": 11551, "total_time": "1.8m", "total_time_in_millis": 108807 },
          "flush": { "total": 191, "total_time": "2.1m", "total_time_in_millis": 128884 }
        } ],
        "5": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 5, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "362.2mb", "size_in_bytes": 379848618 },
          "translog": { "id": 1347901831423, "operations": 82 },
          "docs": { "num_docs": 306784, "max_doc": 340679, "deleted_docs": 33895 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2125, "total_time": "13.2m", "total_time_in_millis": 795503, "total_docs": 3530900, "total_size": "3.9gb", "total_size_in_bytes": 4257253724 },
          "refresh": { "total": 11088, "total_time": "5m", "total_time_in_millis": 302650 },
          "flush": { "total": 192, "total_time": "4.6m", "total_time_in_millis": 277611 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 5, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "344.1mb", "size_in_bytes": 360850766 },
          "translog": { "id": 1347901831423, "operations": 80 },
          "docs": { "num_docs": 303711, "max_doc": 325543, "deleted_docs": 21832 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2154, "total_time": "5.6m", "total_time_in_millis": 337964, "total_docs": 3825832, "total_size": "4.2gb", "total_size_in_bytes": 4591861598 },
          "refresh": { "total": 11433, "total_time": "1.7m", "total_time_in_millis": 104688 },
          "flush": { "total": 192, "total_time": "1.5m", "total_time_in_millis": 91405 }
        } ],
        "6": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 6, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "361.9mb", "size_in_bytes": 379558406 },
          "translog": { "id": 1347901831378, "operations": 54 },
          "docs": { "num_docs": 306499, "max_doc": 343743, "deleted_docs": 37244 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2128, "total_time": "13.2m", "total_time_in_millis": 796938, "total_docs": 3601219, "total_size": "4gb", "total_size_in_bytes": 4305769584 },
          "refresh": { "total": 11016, "total_time": "5m", "total_time_in_millis": 303480 },
          "flush": { "total": 192, "total_time": "4.7m", "total_time_in_millis": 286890 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 6, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "391.3mb", "size_in_bytes": 410343610 },
          "translog": { "id": 1347901831378, "operations": 54 },
          "docs": { "num_docs": 302809, "max_doc": 366089, "deleted_docs": 63280 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2144, "total_time": "5.4m", "total_time_in_millis": 324305, "total_docs": 3584721, "total_size": "4gb", "total_size_in_bytes": 4300272618 },
          "refresh": { "total": 11354, "total_time": "1.8m", "total_time_in_millis": 112237 },
          "flush": { "total": 192, "total_time": "1.9m", "total_time_in_millis": 115561 }
        } ],
        "7": [ {
          "routing": { "state": "STARTED", "primary": true, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 7, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "358.4mb", "size_in_bytes": 375896103 },
          "translog": { "id": 1347901831301, "operations": 74 },
          "docs": { "num_docs": 306144, "max_doc": 337143, "deleted_docs": 30999 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2125, "total_time": "13.2m", "total_time_in_millis": 796752, "total_docs": 3564015, "total_size": "3.9gb", "total_size_in_bytes": 4286994130 },
          "refresh": { "total": 11048, "total_time": "5.2m", "total_time_in_millis": 316874 },
          "flush": { "total": 191, "total_time": "3.9m", "total_time_in_millis": 236884 }
        }, {
          "routing": { "state": "STARTED", "primary": false, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 7, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "344.2mb", "size_in_bytes": 360935272 },
          "translog": { "id": 1347901831301, "operations": 73 },
          "docs": { "num_docs": 302416, "max_doc": 327545, "deleted_docs": 25129 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2163, "total_time": "5.6m", "total_time_in_millis": 336263, "total_docs": 3810258, "total_size": "4.2gb", "total_size_in_bytes": 4565132999 },
          "refresh": { "total": 11416, "total_time": "1.7m", "total_time_in_millis": 105342 },
          "flush": { "total": 191, "total_time": "1.3m", "total_time_in_millis": 80964 }
        } ],
        "8": [ {
          "routing": { "state": "STARTED", "primary": false, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 8, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "330.7mb", "size_in_bytes": 346790325 },
          "translog": { "id": 1347901772804, "operations": 66 },
          "docs": { "num_docs": 305785, "max_doc": 315564, "deleted_docs": 9779 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2116, "total_time": "13.2m", "total_time_in_millis": 796557, "total_docs": 3818340, "total_size": "4.2gb", "total_size_in_bytes": 4556889217 },
          "refresh": { "total": 11042, "total_time": "5m", "total_time_in_millis": 305252 },
          "flush": { "total": 191, "total_time": "3.5m", "total_time_in_millis": 215837 }
        }, {
          "routing": { "state": "STARTED", "primary": true, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 8, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "349.5mb", "size_in_bytes": 366535393 },
          "translog": { "id": 1347901772805, "operations": 64 },
          "docs": { "num_docs": 306854, "max_doc": 334557, "deleted_docs": 27703 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2175, "total_time": "4.9m", "total_time_in_millis": 299631, "total_docs": 3358427, "total_size": "3.7gb", "total_size_in_bytes": 4019011835 },
          "refresh": { "total": 11362, "total_time": "1.7m", "total_time_in_millis": 102257 },
          "flush": { "total": 192, "total_time": "1.8m", "total_time_in_millis": 109720 }
        } ],
        "9": [ {
          "routing": { "state": "STARTED", "primary": false, "node": "95gp_xKVRra472UxiDygiA", "relocating_node": null, "shard": 9, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "350.8mb", "size_in_bytes": 367854076 },
          "translog": { "id": 1347901772820, "operations": 111 },
          "docs": { "num_docs": 305293, "max_doc": 333894, "deleted_docs": 28601 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2130, "total_time": "12.3m", "total_time_in_millis": 740824, "total_docs": 3148931, "total_size": "3.5gb", "total_size_in_bytes": 3773382780 },
          "refresh": { "total": 11139, "total_time": "5.5m", "total_time_in_millis": 330083 },
          "flush": { "total": 191, "total_time": "4.1m", "total_time_in_millis": 247113 }
        }, {
          "routing": { "state": "STARTED", "primary": true, "node": "tdhHCSEBSaKLmsaE_E4Gzw", "relocating_node": null, "shard": 9, "index": "default2" },
          "state": "STARTED",
          "index": { "size": "348.3mb", "size_in_bytes": 365297400 },
          "translog": { "id": 1347901772820, "operations": 112 },
          "docs": { "num_docs": 306588, "max_doc": 332081, "deleted_docs": 25493 },
          "merges": { "current": 0, "current_docs": 0, "current_size": "0b", "current_size_in_bytes": 0, "total": 2191, "total_time": "5.4m", "total_time_in_millis": 329101, "total_docs": 3816151, "total_size": "4.2gb", "total_size_in_bytes": 4585189708 },
          "refresh": { "total": 11474, "total_time": "1.6m", "total_time_in_millis": 100729 },
          "flush": { "total": 191, "total_time": "2.2m", "total_time_in_millis": 135461 }
        } ]
      }
    }
  }
}


Re: How to fix primary-replica inconsistency?

arta
Thanks, again, Kurt,
I looked at the _status output, and I think what I found is that the inconsistency disappears over time.
I wrote a simple Ruby script to process the _status?pretty=true output (see below).
I took a couple of samples and saw that many shards have different translog ids, but the num_docs seem OK.
What does the translog id actually mean?
What is your concern if replicas have different translog ids?

One more question for anybody who knows ES well:
Some documents seemed to take hours until they appeared in the replica (or maybe vice versa).
What causes this time lag?

----------------------------------------- here's my script --------------
if (ARGV.length < 1)
        puts $PROGRAM_NAME + ' <input file>'
        exit
end

# Simple state machine over the pretty-printed _status output:
# for each index/shard/copy, remember the node, primary flag, translog id and num_docs.
@mode = :waiting_index
@info = {}

open(ARGV[0], 'r') {|f|
        f.each_line {|line|
                case @mode
                when :waiting_index
                        if (line =~ /"(i[0-9]+)" : /)
                                @index = $1
                                @info[@index] = {}
                                @mode = :waiting_shard_no
                        end
                when :waiting_shard_no
                        if (line =~ /"([0-9]+)" : \[/)
                                @shard = $1
                                @info[@index][@shard] = {}
                                @mode = :handling_shard
                                @subshard = 0
                                @info[@index][@shard][@subshard] = {}
                        end
                when :handling_shard
                        if (line =~ /"primary" : (\w+),/)
                                @info[@index][@shard][@subshard][:primary] = $1
                        elsif (line =~ /"node" : "(\S+)",/)
                                @info[@index][@shard][@subshard][:node] = $1
                        elsif (line =~ /"id" : ([0-9]+)/)
                                @info[@index][@shard][@subshard][:translog_id] = $1
                        elsif (line =~ /"num_docs" : ([0-9]+)/)
                                @info[@index][@shard][@subshard][:num_docs] = $1
                        elsif (line =~ /^\s+\}, \{\s?$/)
                                @subshard += 1
                                @info[@index][@shard][@subshard] = {}
                        elsif (line =~ /^\s+\} \],\s?$/)
                                @mode = :waiting_shard_no
                        elsif (line =~ /^\s+\} \]\s?$/)
                                @mode = :waiting_index
                        end
                end
        }
}

def dump_info(info)
        "translog=#{info[:translog_id]} num_docs=#{info[:num_docs]} node=#{info[:node]}" +
        (info[:primary] == "true" ? " primary" : "        ")
end

def dump_diff(idx, shard, i, info, ref_i, ref_info)
        idxShard = "#{idx}-#{shard}"
        "#{idxShard} [#{ref_i}]: #{dump_info(ref_info)}\n" +
        " " * idxShard.length + " [#{i}]: #{dump_info(info)}"
end

def sort_keys(col)
        col.keys.sort {|a,b|
                if (a.class != String || a.length == b.length)
                        a <=> b
                else
                        a.length <=> b.length
                end
        }
end

# Report shard copies whose num_docs or translog id differ from the first copy seen.
sort_keys(@info).each {|idx| idx_info = @info[idx]
        sort_keys(idx_info).each {|shard| shard_info = idx_info[shard]
                translogs = []
                sort_keys(shard_info).each {|i| info = shard_info[i]
                        if (translogs.empty?)
                                translogs << { i => info }
                        else
                                if (info[:num_docs] != translogs[0].values[0][:num_docs])
                                        puts dump_diff(idx, shard, i, info, translogs[0].keys[0], translogs[0].values[0]) + " DIFFERENT COUNT"
                                elsif (info[:translog_id] != translogs[0].values[0][:translog_id])
                                        puts dump_diff(idx, shard, i, info, translogs[0].keys[0], translogs[0].values[0])
                                end
                        end
                }
        }
}
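Usage, assuming the script is saved as check_status.rb:
curl 'localhost:9200/_status?pretty=true' > status.json
ruby check_status.rb status.json
It prints a line for each shard whose copies disagree, marking differing doc counts with "DIFFERENT COUNT".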

Re: How to fix primary-replica inconsistency?

Kurt Harriger
I assume that the translog/id represents the position in the transaction log.  If the replica does not have the same id, then I would assume it is still catching up to the master, and one would expect that the num_docs may differ as the replica is still replaying changes.  However, if they are at the same position in the transaction log and the num_docs is different, then I would assume that the replica should have the same num_docs as the master, and if not, then why not… that is the question.  I still haven't had any time to dig into the Elasticsearch source code, so those are just my assumptions; perhaps a committer could clarify.


Re: How to fix primary-replica inconsistency?

es_learner
In reply to this post by arta
May I know which version of ES you are using?  I'm on 0.19.2 and have been hitting the primary only.  But lately, because of increased traffic causing high CPU spikes, I am planning to load-balance reads across my 3-server cluster with 5 replicas.  Reading this thread gives me pause.  Any new info will help me greatly.

Thanks.

curl localhost:9200/?version

Re: How to fix primary-replica inconsistency?

kimchy
Administrator
In reply to this post by Kurt Harriger
The translog id does not indicate "consistency" between shards; it's internal to each shard. Which version are you using? Also, anything interesting in the logs (failures)?


Re: How to fix primary-replica inconsistency?

arta
In my case it is 0.19.3.
I found the discrepancy disappeared after a while.
In some cases it took days, but in most cases it was within an hour.
I don't see any related entries in the logs.


Re: How to fix primary-replica inconsistency?

Kurt Harriger
Using version 0.19.8 here. I haven't looked much deeper into the issue since adding preference=_primary_first.  Without _primary_first the hit count would change on nearly every query.  Given that the same transaction id does not indicate that the replica is caught up with the master, it's possible that the issue might resolve itself given enough time; however, when we first encountered the hit counts alternating between values, the issue persisted for two days on test accounts, so I don't think it is likely that time alone would have resolved it.

How does one determine if the shards are inconsistent or just behind?  I don't know, but it would be nice if there were a way to get a more definitive answer to this question.


Re: How to fix primary-replica inconsistency?

Clinton Gormley-2

> How does one determine if the shards are inconsistent or just behind?
> I don't know, but it would be nice if there was a way to get a more
> definitive answer to this question.

The shards should be neither inconsistent nor behind.  The one exception
might be where you use 'async' indexing.

You can use the cluster_health API (with e.g. level=shards) to get a view
of your cluster.
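For example:
curl 'localhost:9200/_cluster/health?level=shards&pretty=true'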

clint


Re: How to fix primary-replica inconsistency?

kimchy
Administrator
One more thing: make sure not to confuse primary-replica inconsistency with "counts" being different. Let me explain, but first, note that 0.19.5 and later 0.19.7 fixed bugs that might result in inconsistencies (though only under quite extreme cases; I only managed to recreate it in a test that ran for 5 days with nodes constantly being killed).

Back to the "inconsistency" part, if you keep on indexing into an index, obviously, you will see some "inconsistencies" between calls as data keeps being added to the index. Also, those changes will be visible as shards will be refreshed (by default, 1s by default).

Also, when executing a search with sorting based on _score (the default), some docs will have the same _score, and there isn't a consistent order among them (unless you use an additional sort field as a tie-breaker). So, executing the search once and then again might hit different shard copies, and the ordering of docs with the same _score value can differ.
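A sketch of such a tie-breaking sort (index name and the second sort field are hypothetical examples):
curl 'localhost:9200/myindex/_search?pretty=true' -d '{
  "query": { "query_string": { "query": "foo" } },
  "sort": [ "_score", { "created_at": "desc" } ]
}'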

Note that with the above problems, even though both shard copies have the same data, it might seem like they are giving different results.

How do you solve it? The simplest way that I personally like is to use the preference option, but using a dynamic value. If you do preference=[user_id] (for example), then for the same user id, the same shard copies will be hit.
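For example, with a session or user identifier as the preference value (index name and query are placeholders):
curl 'localhost:9200/myindex/_search?preference=user_42&q=foo&pretty=true'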


Re: How to fix primary-replica inconsistency?

Filirom1
I can confirm the issue on ElasticSearch 0.19.11.

With the bulk API, I reindexed all my data. The indexing has now finished, but the total number of hits is different on the primary and the replica:

?preference=_primary_first
hits.total: 12209124

?preference=_replica_first
hits.total: 12209202

I think this problem is the same as this one:

When I try to index the same bulk of documents (on an empty index), sometimes I get an error and sometimes not.

I think this is the cause of the inconsistency between primary and replica.

Cheers
Romain



Re: How to fix primary-replica inconsistency?

joergprante@gmail.com
Hi Filirom1,

If you see mapping exceptions, something in your JSON data style is inconsistent; see my comment:
https://github.com/elasticsearch/elasticsearch/issues/2354#issuecomment-10453428

Jörg


Re: How to fix primary-replica inconsistency?

Filirom1
Yes, I know, but I can't change what the users inject into ElasticSearch.

The point is that sometimes inconsistent JSON is accepted by ES.



Re: How to fix primary-replica inconsistency?

Tal Shemesh
Hi,

We are facing the same issue with 0.90.11.
We have a shard whose primary size is 1.9gb while the replica is 1gb.
Did you manage to solve the problem?
If so, how can we fix it?



Re: How to fix primary-replica inconsistency?

Adrien Grand-2
Hi,

A difference in disk size doesn't mean that they don't have the same content, since one of the replicas might just have run a large merge that saved disk space. Nevertheless, you can force shards to be re-replicated by using the update settings API[1] to temporarily set the number of replicas to 0 (this will deallocate replicas) and then back to the original value (which will cause replicas to be bulk-copied from the primaries).

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html#indices-update-settings
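For example, for a hypothetical index that normally has one replica:
curl -XPUT 'localhost:9200/myindex/_settings' -d '{ "index": { "number_of_replicas": 0 } }'
# wait for the replicas to be deallocated, then restore the original value
curl -XPUT 'localhost:9200/myindex/_settings' -d '{ "index": { "number_of_replicas": 1 } }'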

