how to avoid/lighten shard recovery after restart?

how to avoid/lighten shard recovery after restart?

R. Toma
Hi group,

Restarting an ES cluster triggers recovery, which is long-lasting and load-intensive. I am looking for a way to reduce the runtime and load of a restart. I read that someone runs daily rolling restarts of his large ES cluster to ensure the primary and replica shards are 100% identical, so that they recover quickly. But that sounds like a hack and not something you should be happy with as an SRE. And its impact on ES performance may be acceptable on a large cluster, but not on our 3-node cluster.

How I believe shard recovery works: if ES spots differences between a primary and its replica shard(s), it rebuilds the replica shard(s) as an exact copy of the primary shard. Rebuilding results in lots of network traffic and disk I/O.

We have a 3-node ES 1.0.1 cluster with 3k primary shards and 3k replica shards. During a recent restart (to reduce the heap size to 31G and get CompressedOops back), recovery after the 1st node took the longest (~6 hours), the 2nd took less (~2 hours), and the 3rd was quick (<1 hour). I believe recovery gets faster with each node because each recovery leaves more replica shards as exact copies of their primaries.

I tried force-merging with an expensive max_num_segments=1, but the metrics segments.count + segments.memory of the same shards still differ between pri + rep. No luck. For the curious few, I have included the before + after results below.

Any ideas?

Regards,
Renzo


BEFORE:
idx                            shard prirep docs  store segments.count segments.memory 
logstash-pro-oracle-2014.04.24 0     p      1072 485592              8           14615 
logstash-pro-oracle-2014.04.24 0     r      1072 449022              1           11958 
logstash-pro-oracle-2014.04.24 1     p      1095 493774              7           14336 
logstash-pro-oracle-2014.04.24 1     r      1095 459966              1           11988 
logstash-pro-oracle-2014.04.24 2     p      1039 452078              5           13158 
logstash-pro-oracle-2014.04.24 2     r      1039 458513              6           13480 
logstash-pro-oracle-2014.04.24 3     p      1094 492753              8           14574 
logstash-pro-oracle-2014.04.24 3     r      1094 483347              6           13850 
logstash-pro-oracle-2014.04.24 4     p      1099 494740              8           14645 
logstash-pro-oracle-2014.04.24 4     r      1099 488953              7           14251 

AFTER:
idx                            shard prirep docs  store segments.count segments.memory 
logstash-pro-oracle-2014.04.24 0     p      1072 449358              1           11958 
logstash-pro-oracle-2014.04.24 0     r      1072 448884              1           11958 
logstash-pro-oracle-2014.04.24 1     p      1095 460391              1           11980 
logstash-pro-oracle-2014.04.24 1     r      1095 459918              1           11988 <-- rep is 8 bigger than its pri
logstash-pro-oracle-2014.04.24 2     p      1039 431341              1           11580 
logstash-pro-oracle-2014.04.24 2     r      1039 431695              1           11572 <-- rep is 8 smaller than its pri
logstash-pro-oracle-2014.04.24 3     p      1094 457135              1           11907 
logstash-pro-oracle-2014.04.24 3     r      1094 457970              1           11907 
logstash-pro-oracle-2014.04.24 4     p      1099 457640              1           11957 
logstash-pro-oracle-2014.04.24 4     r      1099 457165              1           11957 
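
The primary/replica comparison in the tables above can be done mechanically. The following is a minimal Python sketch, assuming the text has exactly the seven columns shown (idx, shard, prirep, docs, store, segments.count, segments.memory) and one replica per shard. The function name `differing_columns` is made up for illustration, and it only parses pasted text; it does not talk to the cluster.

```python
# Columns that follow idx / shard / prirep in the tables above.
COLUMNS = ("docs", "store", "segments.count", "segments.memory")


def differing_columns(cat_output):
    """Map (index, shard) -> names of columns where primary and replica differ.

    Assumes one replica per shard; with more replicas only the last
    replica row seen for a shard is compared.
    """
    copies = {}
    for line in cat_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a p/r data row.
        if len(fields) < 7 or fields[2] not in ("p", "r"):
            continue
        key = (fields[0], fields[1])
        # Trailing annotations such as "<-- ..." are ignored by the slice.
        copies.setdefault(key, {})[fields[2]] = [int(v) for v in fields[3:7]]

    result = {}
    for key, by_role in copies.items():
        if "p" in by_role and "r" in by_role:
            diff = [name
                    for name, p, r in zip(COLUMNS, by_role["p"], by_role["r"])
                    if p != r]
            if diff:
                result[key] = diff
    return result


# First two shards of the AFTER table, pasted as-is.
SAMPLE = """
idx                            shard prirep docs  store segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 449358              1           11958
logstash-pro-oracle-2014.04.24 0     r      1072 448884              1           11958
logstash-pro-oracle-2014.04.24 1     p      1095 460391              1           11980
logstash-pro-oracle-2014.04.24 1     r      1095 459918              1           11988 <-- rep is 8 bigger than its pri
"""

for (index, shard), names in sorted(differing_columns(SAMPLE).items()):
    print(index, shard, "differs in:", ", ".join(names))
```

On the AFTER rows, store differs for every shard even where segments.count and segments.memory match, so the two copies are still not byte-identical after the merge.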

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/20ca0eb4-1f62-4465-a289-2ecd740c9c2e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: how to avoid/lighten shard recovery after restart?

Binh Ly-2
Although you cannot completely eliminate these recoveries/comparisons at the moment, there are some things you can do today that may help. This presentation covers some settings that control when ES should start recovery after a full restart:

http://www.elasticsearch.org/webinars/elasticsearch-pre-flight-checklist/
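
For reference, the settings that delay recovery after a full restart in ES 1.x are, to my knowledge, the `gateway.*` settings in elasticsearch.yml. The Python sketch below only prints candidate lines for a cluster of a given size. The helper name `gateway_settings`, the majority rule for `gateway.recover_after_nodes`, and the 5m wait are illustrative assumptions, not values taken from the presentation.

```python
def gateway_settings(total_nodes, wait="5m"):
    """Return illustrative gateway.* settings for a full-cluster restart.

    Assumption: start recovery once a majority of nodes has joined and
    `wait` has elapsed, or immediately once all expected nodes are present.
    """
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    return {
        "gateway.recover_after_nodes": total_nodes // 2 + 1,
        "gateway.expected_nodes": total_nodes,
        "gateway.recover_after_time": wait,
    }


# Print lines suitable for pasting into elasticsearch.yml on each node.
for name, value in gateway_settings(3).items():
    print(f"{name}: {value}")
```

With these in place, ES should hold off on allocating shards until enough nodes are back, instead of starting to rebuild replicas as soon as the first node comes up.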

It is also possible to disable allocation before shutdown and then re-enable it once you are fully back up.
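
A minimal Python sketch of the disable/re-enable step, assuming the `cluster.routing.allocation.enable` setting of ES 1.x applied through the cluster update settings API (`PUT /_cluster/settings`). The helper name `allocation_request` and the `http://localhost:9200` address are placeholders, and the setting name should be checked against the documentation for the exact version in use. The helper only builds the request; nothing is sent until `urlopen` is called.

```python
import json
import urllib.request

# Values accepted by cluster.routing.allocation.enable in ES 1.x (assumption:
# verify against the docs for your version).
VALID = ("all", "primaries", "new_primaries", "none")


def allocation_request(enable, host="http://localhost:9200"):
    """Build (but do not send) a PUT /_cluster/settings request.

    `host` is a placeholder address. A transient setting is used so that
    it does not survive a full cluster restart.
    """
    if enable not in VALID:
        raise ValueError(f"enable must be one of {VALID}")
    body = json.dumps(
        {"transient": {"cluster.routing.allocation.enable": enable}}
    ).encode("utf-8")
    return urllib.request.Request(
        host + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Before shutting a node down:
#   urllib.request.urlopen(allocation_request("none"))
# After the node has rejoined the cluster:
#   urllib.request.urlopen(allocation_request("all"))
```

Because the setting is transient, it would have to be applied again after a full-cluster restart if allocation should stay disabled while the nodes come back.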

Work is also under way to make this recovery process more efficient in the future (see the short description of Sequence Numbers):

http://www.elasticsearch.org/blog/resiliency-elasticsearch/
