apparent infinite loop in node recovery

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

apparent infinite loop in node recovery

Grant Rodgers
We had an incident yesterday where a node ran out of memory while
indexing and the elasticsearch process was killed. Now i'm trying to
recover today and noticing what seems to be an infinite loop in the
recovery process log.

Here's what happens in the log after a restart (filtered to only show
one shard):

[22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]

and many more lines like that, for each shard. ~30000 debug lines from
index.shard.service in less than 3 minutes (before i stopped the
process).

Is this normal? is it a corruption error? it seems to happen
regardless of whether there is a work directory or not.
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
Thats does not indicate an infinte loop. The recovery process (in this case, recovery of a replica shard from the primary shard) is throttled on the node level. Basically, in order to not tax the machine the node is running on, recoveries are throttled (both on the number of concurrent recoveries allowed, and on the number of concurrent transfers allowed).

A recovery started my moving the shard to RECOVERING state, and then checks if it is allowed to perform a recovery. If not, either because the it is being throttled on the current node, or the node it recovers from, it restores the previous non recovering state, and retries after an interval (by default, 100ms).

That process can create a lot of DEBUG level logs (I need to rethink the logging aspect here), but should finish successfully. It should not indicate a problem.

-shay.banon

On Thu, Aug 12, 2010 at 2:10 AM, Grant Rodgers <[hidden email]> wrote:
We had an incident yesterday where a node ran out of memory while
indexing and the elasticsearch process was killed. Now i'm trying to
recover today and noticing what seems to be an infinite loop in the
recovery process log.

Here's what happens in the log after a restart (filtered to only show
one shard):

[22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]
[22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [RECOVERING]->[CREATED], restored after recovery
[22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
state: [CREATED]->[RECOVERING]

and many more lines like that, for each shard. ~30000 debug lines from
index.shard.service in less than 3 minutes (before i stopped the
process).

Is this normal? is it a corruption error? it seems to happen
regardless of whether there is a work directory or not.

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
Ok, good to know. I will let it run and see if it finishes
successfully. thanks!

On Aug 11, 4:19 pm, Shay Banon <[hidden email]> wrote:

> Thats does not indicate an infinte loop. The recovery process (in this case,
> recovery of a replica shard from the primary shard) is throttled on the node
> level. Basically, in order to not tax the machine the node is running on,
> recoveries are throttled (both on the number of concurrent recoveries
> allowed, and on the number of concurrent transfers allowed).
>
> A recovery started my moving the shard to RECOVERING state, and then checks
> if it is allowed to perform a recovery. If not, either because the it is
> being throttled on the current node, or the node it recovers from, it
> restores the previous non recovering state, and retries after an interval
> (by default, 100ms).
>
> That process can create a lot of DEBUG level logs (I need to rethink the
> logging aspect here), but should finish successfully. It should not indicate
> a problem.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 2:10 AM, Grant Rodgers <[hidden email]> wrote:
> > We had an incident yesterday where a node ran out of memory while
> > indexing and the elasticsearch process was killed. Now i'm trying to
> > recover today and noticing what seems to be an infinite loop in the
> > recovery process log.
>
> > Here's what happens in the log after a restart (filtered to only show
> > one shard):
>
> > [22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
>
> > and many more lines like that, for each shard. ~30000 debug lines from
> > index.shard.service in less than 3 minutes (before i stopped the
> > process).
>
> > Is this normal? is it a corruption error? it seems to happen
> > regardless of whether there is a work directory or not.
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
I've pushed a fix for the excessive logging problem.

-shay.banon

On Thu, Aug 12, 2010 at 2:24 AM, Grant Rodgers <[hidden email]> wrote:
Ok, good to know. I will let it run and see if it finishes
successfully. thanks!

On Aug 11, 4:19 pm, Shay Banon <[hidden email]> wrote:
> Thats does not indicate an infinte loop. The recovery process (in this case,
> recovery of a replica shard from the primary shard) is throttled on the node
> level. Basically, in order to not tax the machine the node is running on,
> recoveries are throttled (both on the number of concurrent recoveries
> allowed, and on the number of concurrent transfers allowed).
>
> A recovery started my moving the shard to RECOVERING state, and then checks
> if it is allowed to perform a recovery. If not, either because the it is
> being throttled on the current node, or the node it recovers from, it
> restores the previous non recovering state, and retries after an interval
> (by default, 100ms).
>
> That process can create a lot of DEBUG level logs (I need to rethink the
> logging aspect here), but should finish successfully. It should not indicate
> a problem.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 2:10 AM, Grant Rodgers <[hidden email]> wrote:
> > We had an incident yesterday where a node ran out of memory while
> > indexing and the elasticsearch process was killed. Now i'm trying to
> > recover today and noticing what seems to be an infinite loop in the
> > recovery process log.
>
> > Here's what happens in the log after a restart (filtered to only show
> > one shard):
>
> > [22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
> > [22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [RECOVERING]->[CREATED], restored after recovery
> > [22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > state: [CREATED]->[RECOVERING]
>
> > and many more lines like that, for each shard. ~30000 debug lines from
> > index.shard.service in less than 3 minutes (before i stopped the
> > process).
>
> > Is this normal? is it a corruption error? it seems to happen
> > regardless of whether there is a work directory or not.

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
Fyi, 19 hours later, there is no change in the state of the cluster,
and the logs are still bouncing between created and recovering. We
will delete and recreate the missing indices, but it seems like
recovery should have happened by now. What is the expected recovery
time for a shard with a few hundred thousand documents?

Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
b6457780f145 node had been restarted a few times, and maybe that
caused some shards to get stuck in the relocating state? Still, I'd
expect that at least one shard would have recovered by now.

"routing_table" : {
    "indices" : {
      "versions" : {
        "shards" : {
          "0" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "versions"
          } ],
          "1" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "versions"
          } ],
          "2" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 2,
            "index" : "versions"
          } ],
          "3" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 3,
            "index" : "versions"
          } ],
          "4" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 4,
            "index" : "versions"
          } ],
          "5" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 5,
            "index" : "versions"
          } ]
        }
      },
      "stats" : {
        "shards" : {
          "0" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          } ],
          "1" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          } ],
          "2" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          } ],
          "3" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          } ],
          "4" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          } ],
          "5" : [ {
            "state" : "STARTED",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          }, {
            "state" : "STARTED",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          }, {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          } ]
        }
      },
      "revisions" : {
        "shards" : {
          "0" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "revisions"
          } ],
          "1" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "revisions"
          } ],
          "2" : [ {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "revisions"
          }, {
            "state" : "RELOCATING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
            "shard" : 2,
            "index" : "revisions"
          } ],
          "3" : [ {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "revisions"
          }, {
            "state" : "RELOCATING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
            "shard" : 3,
            "index" : "revisions"
          } ],
          "4" : [ {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "revisions"
          }, {
            "state" : "RELOCATING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
            "shard" : 4,
            "index" : "revisions"
          } ],
          "5" : [ {
            "state" : "INITIALIZING",
            "primary" : false,
            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "revisions"
          }, {
            "state" : "RELOCATING",
            "primary" : true,
            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
            "shard" : 5,
            "index" : "revisions"
          } ]
        }
      }
    }
  },
  "routing_nodes" : {
    "unassigned" : [ {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 2,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 3,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 4,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 5,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "revisions"
    } ],
    "nodes" : {
      "b411c443-ebdd-416a-ac35-cce5af38fa3f" : [ {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "revisions"
      } ],
      "a4eada23-ab66-4ab5-bf20-8de09e25a6a3" : [ {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "versions"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "stats"
      }, {
        "state" : "STARTED",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "revisions"
      }, {
        "state" : "RELOCATING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "shard" : 2,
        "index" : "revisions"
      }, {
        "state" : "RELOCATING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "shard" : 3,
        "index" : "revisions"
      }, {
        "state" : "RELOCATING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "shard" : 4,
        "index" : "revisions"
      }, {
        "state" : "RELOCATING",
        "primary" : true,
        "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "shard" : 5,
        "index" : "revisions"
      } ],
      "dacbe3f4-ef13-43ba-aa13-b6457780f145" : [ {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : false,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "shard" : 2,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "shard" : 3,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "shard" : 4,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
        "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
        "shard" : 5,
        "index" : "revisions"
      } ]
    }


On Aug 12, 3:14 am, Shay Banon <[hidden email]> wrote:

> I've pushed a fix for the excessive logging problem.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 2:24 AM, Grant Rodgers <[hidden email]> wrote:
> > Ok, good to know. I will let it run and see if it finishes
> > successfully. thanks!
>
> > On Aug 11, 4:19 pm, Shay Banon <[hidden email]> wrote:
> > > Thats does not indicate an infinte loop. The recovery process (in this
> > case,
> > > recovery of a replica shard from the primary shard) is throttled on the
> > node
> > > level. Basically, in order to not tax the machine the node is running on,
> > > recoveries are throttled (both on the number of concurrent recoveries
> > > allowed, and on the number of concurrent transfers allowed).
>
> > > A recovery started my moving the shard to RECOVERING state, and then
> > checks
> > > if it is allowed to perform a recovery. If not, either because the it is
> > > being throttled on the current node, or the node it recovers from, it
> > > restores the previous non recovering state, and retries after an interval
> > > (by default, 100ms).
>
> > > That process can create a lot of DEBUG level logs (I need to rethink the
> > > logging aspect here), but should finish successfully. It should not
> > indicate
> > > a problem.
>
> > > -shay.banon
>
> > > On Thu, Aug 12, 2010 at 2:10 AM, Grant Rodgers <[hidden email]> wrote:
> > > > We had an incident yesterday where a node ran out of memory while
> > > > indexing and the elasticsearch process was killed. Now i'm trying to
> > > > recover today and noticing what seems to be an infinite loop in the
> > > > recovery process log.
>
> > > > Here's what happens in the log after a restart (filtered to only show
> > > > one shard):
>
> > > > [22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
>
> > > > and many more lines like that, for each shard. ~30000 debug lines from
> > > > index.shard.service in less than 3 minutes (before i stopped the
> > > > process).
>
> > > > Is this normal? is it a corruption error? it seems to happen
> > > > regardless of whether there is a work directory or not.
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
If I remember correctly, you use the s3 gateway, right? It seems like some primary shards are stuck initializing (basically, recovering from the gateway).  I forgot to ask, which version are you using?

-shay.banon

On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
Fyi, 19 hours later, there is no change in the state of the cluster,
and the logs are still bouncing between created and recovering. We
will delete and recreate the missing indices, but it seems like
recovery should have happened by now. What is the expected recovery
time for a shard with a few hundred thousand documents?

Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
b6457780f145 node had been restarted a few times, and maybe that
caused some shards to get stuck in the relocating state? Still, I'd
expect that at least one shard would have recovered by now.

"routing_table" : {
   "indices" : {
     "versions" : {
       "shards" : {
         "0" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "versions"
         } ],
         "1" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "versions"
         } ],
         "2" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 2,
           "index" : "versions"
         } ],
         "3" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 3,
           "index" : "versions"
         } ],
         "4" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 4,
           "index" : "versions"
         } ],
         "5" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 5,
           "index" : "versions"
         } ]
       }
     },
     "stats" : {
       "shards" : {
         "0" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         } ],
         "1" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         } ],
         "2" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         } ],
         "3" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         } ],
         "4" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         } ],
         "5" : [ {
           "state" : "STARTED",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         }, {
           "state" : "STARTED",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         }, {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         } ]
       }
     },
     "revisions" : {
       "shards" : {
         "0" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "revisions"
         } ],
         "1" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "revisions"
         } ],
         "2" : [ {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "revisions"
         }, {
           "state" : "RELOCATING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
           "shard" : 2,
           "index" : "revisions"
         } ],
         "3" : [ {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "revisions"
         }, {
           "state" : "RELOCATING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
           "shard" : 3,
           "index" : "revisions"
         } ],
         "4" : [ {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "revisions"
         }, {
           "state" : "RELOCATING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
           "shard" : 4,
           "index" : "revisions"
         } ],
         "5" : [ {
           "state" : "INITIALIZING",
           "primary" : false,
           "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "revisions"
         }, {
           "state" : "RELOCATING",
           "primary" : true,
           "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
           "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
b6457780f145",
           "shard" : 5,
           "index" : "revisions"
         } ]
       }
     }
   }
 },
 "routing_nodes" : {
   "unassigned" : [ {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 2,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 3,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 4,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 5,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "revisions"
   } ],
   "nodes" : {
     "b411c443-ebdd-416a-ac35-cce5af38fa3f" : [ {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "revisions"
     } ],
     "a4eada23-ab66-4ab5-bf20-8de09e25a6a3" : [ {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "versions"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "stats"
     }, {
       "state" : "STARTED",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "revisions"
     }, {
       "state" : "RELOCATING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "shard" : 2,
       "index" : "revisions"
     }, {
       "state" : "RELOCATING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "shard" : 3,
       "index" : "revisions"
     }, {
       "state" : "RELOCATING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "shard" : 4,
       "index" : "revisions"
     }, {
       "state" : "RELOCATING",
       "primary" : true,
       "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "relocating_node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "shard" : 5,
       "index" : "revisions"
     } ],
     "dacbe3f4-ef13-43ba-aa13-b6457780f145" : [ {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : false,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "shard" : 2,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "shard" : 3,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "shard" : 4,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
       "relocating_node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
       "shard" : 5,
       "index" : "revisions"
     } ]
   }


On Aug 12, 3:14 am, Shay Banon <[hidden email]> wrote:
> I've pushed a fix for the excessive logging problem.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 2:24 AM, Grant Rodgers <[hidden email]> wrote:
> > Ok, good to know. I will let it run and see if it finishes
> > successfully. thanks!
>
> > On Aug 11, 4:19 pm, Shay Banon <[hidden email]> wrote:
> > > Thats does not indicate an infinte loop. The recovery process (in this
> > case,
> > > recovery of a replica shard from the primary shard) is throttled on the
> > node
> > > level. Basically, in order to not tax the machine the node is running on,
> > > recoveries are throttled (both on the number of concurrent recoveries
> > > allowed, and on the number of concurrent transfers allowed).
>
> > > A recovery started my moving the shard to RECOVERING state, and then
> > checks
> > > if it is allowed to perform a recovery. If not, either because the it is
> > > being throttled on the current node, or the node it recovers from, it
> > > restores the previous non recovering state, and retries after an interval
> > > (by default, 100ms).
>
> > > That process can create a lot of DEBUG level logs (I need to rethink the
> > > logging aspect here), but should finish successfully. It should not
> > indicate
> > > a problem.
>
> > > -shay.banon
>
> > > On Thu, Aug 12, 2010 at 2:10 AM, Grant Rodgers <[hidden email]> wrote:
> > > > We had an incident yesterday where a node ran out of memory while
> > > > indexing and the elasticsearch process was killed. Now i'm trying to
> > > > recover today and noticing what seems to be an infinite loop in the
> > > > recovery process log.
>
> > > > Here's what happens in the log after a restart (filtered to only show
> > > > one shard):
>
> > > > [22:59:07,944][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,046][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,047][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,156][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,266][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,374][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,476][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,586][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,696][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,806][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:08,916][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,026][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,136][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,246][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
> > > > [22:59:09,247][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [RECOVERING]->[CREATED], restored after recovery
> > > > [22:59:09,356][DEBUG][index.shard.service      ] [Joystick] [stats][4]
> > > > state: [CREATED]->[RECOVERING]
>
> > > > and many more lines like that, for each shard. ~30000 debug lines from
> > > > index.shard.service in less than 3 minutes (before i stopped the
> > > > process).
>
> > > > Is this normal? is it a corruption error? it seems to happen
> > > > regardless of whether there is a work directory or not.

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
Yes, we are using the s3 gateway and version 0.9.0.

Would we get better gateway performance by using the fs gateway and a
network file system, like gluster, drbd, or ceph?

On Aug 12, 11:34 am, Shay Banon <[hidden email]> wrote:

> If I remember correctly, you use the s3 gateway, right? It seems like some
> primary shards are stuck initializing (basically, recovering from the
> gateway).  I forgot to ask, which version are you using?
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
> > Fyi, 19 hours later, there is no change in the state of the cluster,
> > and the logs are still bouncing between created and recovering. We
> > will delete and recreate the missing indices, but it seems like
> > recovery should have happened by now. What is the expected recovery
> > time for a shard with a few hundred thousand documents?
>
> > Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
> > b6457780f145 node had been restarted a few times, and maybe that
> > caused some shards to get stuck in the relocating state? Still, I'd
> > expect that at least one shard would have recovered by now.
>
> > "routing_table" : {
> >    "indices" : {
> >      "versions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          } ]
> >        }
> >      },
> >      "stats" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          } ],
> >          "1" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          } ],
> >          "2" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          } ],
> >          "3" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          } ],
> >          "4" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          } ],
> >          "5" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          } ]
> >        }
> >      },
> >      "revisions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "RELOCATING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
> > b6457780f145",
> >            "shard" : 2,
> >            "index" : "revisions"
> >          } ],
> >          "3" : [ {
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
It depends on the actual implementation. In general, the time it takes to do recovery is how fast we can get data from s3 when a full recovery from it is required (elasticsearch tries to minimize that by reusing the work dir, though you should set the recover_after_nodes setting).

My concern is that maybe the recovery process is stuck somewhere. Not sure where though... . Is there a chance for a thread dump on one of the nodes that have a primary shard in INITIALIZING state? You can do that with visualvm.

-shay.banon

On Thu, Aug 12, 2010 at 9:50 PM, Grant Rodgers <[hidden email]> wrote:
Yes, we are using the s3 gateway and version 0.9.0.

Would we get better gateway performance by using the fs gateway and a
network file system, like gluster, drbd, or ceph?

On Aug 12, 11:34 am, Shay Banon <[hidden email]> wrote:
> If I remember correctly, you use the s3 gateway, right? It seems like some
> primary shards are stuck initializing (basically, recovering from the
> gateway).  I forgot to ask, which version are you using?
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
> > Fyi, 19 hours later, there is no change in the state of the cluster,
> > and the logs are still bouncing between created and recovering. We
> > will delete and recreate the missing indices, but it seems like
> > recovery should have happened by now. What is the expected recovery
> > time for a shard with a few hundred thousand documents?
>
> > Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
> > b6457780f145 node had been restarted a few times, and maybe that
> > caused some shards to get stuck in the relocating state? Still, I'd
> > expect that at least one shard would have recovered by now.
>
> > "routing_table" : {
> >    "indices" : {
> >      "versions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          } ]
> >        }
> >      },
> >      "stats" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          } ],
> >          "1" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          } ],
> >          "2" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          } ],
> >          "3" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          } ],
> >          "4" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          } ],
> >          "5" : [ {
> >            "state" : "STARTED",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >            "state" : "STARTED",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          } ]
> >        }
> >      },
> >      "revisions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : false,
> >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "RELOCATING",
> >            "primary" : true,
> >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> >            "relocating_node" : "dacbe3f4-ef13-43ba-aa13-
> > b6457780f145",
> >            "shard" : 2,
> >            "index" : "revisions"
> >          } ],
> >          "3" : [ {
>
> ...
>
> read more »

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
In reply to this post by Grant Rodgers
Now I've done a full cluster restart, deleting the work directories of
the corrupted indices so they would restore only from gateway again.
It seems to be working, but there are a few issues:

Some nodes appear to join the cluster, but sending a cluster health
request to it times out or returns MasterDisconnectedException (or
something like it, I don't have the exact exception name anymore). The
master definitely knows about these nodes though, because when they
stop it logs that they disconnected. Later, after the cluster has been
running for 10-15 minutes, the other nodes are able to connect and
respond to requests. In this case, I stopped them for a while and
restarted them later. I don't know if they would have become
operational if they'd stayed up.

It appears that local shards are not recovered before gateway shards.
We have one index with all shards present and intact in the work
directory, but those shards have not recovered yet after about 30
minutes, and other gateway shards are being recovered. Are shards
recovered sequentially? Do local shards get priority?

Thanks,
Grant

On Aug 12, 11:50 am, Grant Rodgers <[hidden email]> wrote:

> Yes, we are using the s3 gateway and version 0.9.0.
>
> Would we get better gateway performance by using the fs gateway and a
> network file system, like gluster, drbd, or ceph?
>
> On Aug 12, 11:34 am, Shay Banon <[hidden email]> wrote:
>
> > If I remember correctly, you use the s3 gateway, right? It seems like some
> > primary shards are stuck initializing (basically, recovering from the
> > gateway).  I forgot to ask, which version are you using?
>
> > -shay.banon
>
> > On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
> > > Fyi, 19 hours later, there is no change in the state of the cluster,
> > > and the logs are still bouncing between created and recovering. We
> > > will delete and recreate the missing indices, but it seems like
> > > recovery should have happened by now. What is the expected recovery
> > > time for a shard with a few hundred thousand documents?
>
> > > Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
> > > b6457780f145 node had been restarted a few times, and maybe that
> > > caused some shards to get stuck in the relocating state? Still, I'd
> > > expect that at least one shard would have recovered by now.
>
> > > "routing_table" : {
> > >    "indices" : {
> > >      "versions" : {
> > >        "shards" : {
> > >          "0" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 0,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 0,
> > >            "index" : "versions"
> > >          } ],
> > >          "1" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 1,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 1,
> > >            "index" : "versions"
> > >          } ],
> > >          "2" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 2,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 2,
> > >            "index" : "versions"
> > >          } ],
> > >          "3" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 3,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 3,
> > >            "index" : "versions"
> > >          } ],
> > >          "4" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 4,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 4,
> > >            "index" : "versions"
> > >          } ],
> > >          "5" : [ {
> > >            "state" : "INITIALIZING",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 5,
> > >            "index" : "versions"
> > >          }, {
> > >            "state" : "UNASSIGNED",
> > >            "primary" : false,
> > >            "node" : null,
> > >            "relocating_node" : null,
> > >            "shard" : 5,
> > >            "index" : "versions"
> > >          } ]
> > >        }
> > >      },
> > >      "stats" : {
> > >        "shards" : {
> > >          "0" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >            "shard" : 0,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "STARTED",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 0,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "INITIALIZING",
> > >            "primary" : false,
> > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > >            "relocating_node" : null,
> > >            "shard" : 0,
> > >            "index" : "stats"
> > >          } ],
> > >          "1" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >            "shard" : 1,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "STARTED",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 1,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "INITIALIZING",
> > >            "primary" : false,
> > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > >            "relocating_node" : null,
> > >            "shard" : 1,
> > >            "index" : "stats"
> > >          } ],
> > >          "2" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >            "shard" : 2,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "STARTED",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 2,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "INITIALIZING",
> > >            "primary" : false,
> > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > >            "relocating_node" : null,
> > >            "shard" : 2,
> > >            "index" : "stats"
> > >          } ],
> > >          "3" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >            "shard" : 3,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "STARTED",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 3,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "INITIALIZING",
> > >            "primary" : false,
> > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > >            "relocating_node" : null,
> > >            "shard" : 3,
> > >            "index" : "stats"
> > >          } ],
> > >          "4" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >            "shard" : 4,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "STARTED",
> > >            "primary" : true,
> > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > >            "relocating_node" : null,
> > >            "shard" : 4,
> > >            "index" : "stats"
> > >          }, {
> > >            "state" : "INITIALIZING",
> > >            "primary" : false,
> > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > >            "relocating_node" : null,
> > >            "shard" : 4,
> > >            "index" : "stats"
> > >          } ],
> > >          "5" : [ {
> > >            "state" : "STARTED",
> > >            "primary" : false,
> > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > >            "relocating_node" : null,
> > >    
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
In reply to this post by kimchy
Unfortunately the nodes mentioned originally have been stopped. But it
seems the new cluster is also having issues recovering. The work
directories have not changed in size for about 10 minutes.

Here is the routing table and nodes (thread dumps follow)

 "routing_table" : {
    "indices" : {
      "versions" : {
        "shards" : {
          "0" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "versions"
          } ],
          "1" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "versions"
          } ],
          "2" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 2,
            "index" : "versions"
          } ],
          "3" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 3,
            "index" : "versions"
          } ],
          "4" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 4,
            "index" : "versions"
          } ],
          "5" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "versions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 5,
            "index" : "versions"
          } ]
        }
      },
      "stats" : {
        "shards" : {
          "0" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "stats"
          } ],
          "1" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "stats"
          } ],
          "2" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 2,
            "index" : "stats"
          } ],
          "3" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 3,
            "index" : "stats"
          } ],
          "4" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 4,
            "index" : "stats"
          } ],
          "5" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 5,
            "index" : "stats"
          } ]
        }
      },
      "revisions" : {
        "shards" : {
          "0" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 0,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 0,
            "index" : "revisions"
          } ],
          "1" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 1,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 1,
            "index" : "revisions"
          } ],
          "2" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 2,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 2,
            "index" : "revisions"
          } ],
          "3" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 3,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 3,
            "index" : "revisions"
          } ],
          "4" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
            "relocating_node" : null,
            "shard" : 4,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 4,
            "index" : "revisions"
          } ],
          "5" : [ {
            "state" : "INITIALIZING",
            "primary" : true,
            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
            "relocating_node" : null,
            "shard" : 5,
            "index" : "revisions"
          }, {
            "state" : "UNASSIGNED",
            "primary" : false,
            "node" : null,
            "relocating_node" : null,
            "shard" : 5,
            "index" : "revisions"
          } ]
        }
      }
    }
  },
  "routing_nodes" : {
    "unassigned" : [ {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 2,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 3,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 4,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 5,
      "index" : "versions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 2,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 2,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 3,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 3,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 4,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 4,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 5,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 5,
      "index" : "stats"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 0,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 1,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 2,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 3,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 4,
      "index" : "revisions"
    }, {
      "state" : "UNASSIGNED",
      "primary" : false,
      "node" : null,
      "relocating_node" : null,
      "shard" : 5,
      "index" : "revisions"
    } ],
    "nodes" : {
      "d9bd8b92-90d5-4892-bea8-46896b467974" : [ {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "stats"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 0,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 2,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "revisions"
      } ],
      "eeb2af20-13f7-41e3-9dca-30b768785b4f" : [ {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 5,
        "index" : "versions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 1,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 3,
        "index" : "revisions"
      }, {
        "state" : "INITIALIZING",
        "primary" : true,
        "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
        "relocating_node" : null,
        "shard" : 4,
        "index" : "revisions"
      } ]
    }


Here is a thread dump from node eeb2af20-13f7-41e3-9dca-30b768785b4f

2010-08-12_19:35:09.31077    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31079       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31080       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31080       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31081       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31081       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31082       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31083       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31084       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31084       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31085       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31085       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31086
2010-08-12_19:35:09.31086 "elasticsearch[Thumb, Tom][sc]-pool-2-
thread-2" daemon prio=10 tid=0x0000000001f8e800 nid=0x653b waiting on
condition [0x00007fda1441f000]
2010-08-12_19:35:09.31087    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31088       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31088       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31089       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31089       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31091       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31091       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31092       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31092       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31093       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31093       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31094       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31094
2010-08-12_19:35:09.31095 "elasticsearch[Thumb, Tom][sc]-pool-2-
thread-1" daemon prio=10 tid=0x00007fda101c6000 nid=0x653a waiting on
condition [0x00007fda14af4000]
2010-08-12_19:35:09.31095    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31096       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31096       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31097       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31097       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31099       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31099       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31100       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31100       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31101       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31102       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31103       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31103
2010-08-12_19:35:09.31104 "Low Memory Detector" daemon prio=10
tid=0x0000000001664000 nid=0x6532 runnable [0x0000000000000000]
2010-08-12_19:35:09.31104    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31105
2010-08-12_19:35:09.31105 "CompilerThread1" daemon prio=10
tid=0x0000000001661800 nid=0x6531 waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31106    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31106
2010-08-12_19:35:09.31107 "CompilerThread0" daemon prio=10
tid=0x000000000165b000 nid=0x6530 waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31108    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31109
2010-08-12_19:35:09.31109 "Signal Dispatcher" daemon prio=10
tid=0x0000000001659000 nid=0x652f waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31110    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31110
2010-08-12_19:35:09.31110 "Surrogate Locker Thread (CMS)" daemon
prio=10 tid=0x0000000001657800 nid=0x652e waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31111    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31111
2010-08-12_19:35:09.31112 "Finalizer" daemon prio=10
tid=0x0000000001635800 nid=0x652d in Object.wait()
[0x00007fda152c4000]
2010-08-12_19:35:09.31112    java.lang.Thread.State: WAITING (on
object monitor)
2010-08-12_19:35:09.31113       at java.lang.Object.wait(Native
Method)
2010-08-12_19:35:09.31113       - waiting on <0x000000075bd900c0> (a
java.lang.ref.ReferenceQueue$Lock)
2010-08-12_19:35:09.31114       at
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
2010-08-12_19:35:09.31114       - locked <0x000000075bd900c0> (a
java.lang.ref.ReferenceQueue$Lock)
2010-08-12_19:35:09.31115       at
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
2010-08-12_19:35:09.31115       at java.lang.ref.Finalizer
$FinalizerThread.run(Finalizer.java:177)
2010-08-12_19:35:09.31116
2010-08-12_19:35:09.31116 "Reference Handler" daemon prio=10
tid=0x0000000001633800 nid=0x652c in Object.wait()
[0x00007fda153c5000]
2010-08-12_19:35:09.31118    java.lang.Thread.State: WAITING (on
object monitor)
2010-08-12_19:35:09.31118       at java.lang.Object.wait(Native
Method)
2010-08-12_19:35:09.31119       - waiting on <0x000000075bd900f0> (a
java.lang.ref.Reference$Lock)
2010-08-12_19:35:09.31119       at java.lang.Object.wait(Object.java:
502)
2010-08-12_19:35:09.31120       at java.lang.ref.Reference
$ReferenceHandler.run(Reference.java:133)
2010-08-12_19:35:09.31120       - locked <0x000000075bd900f0> (a
java.lang.ref.Reference$Lock)
2010-08-12_19:35:09.31121
2010-08-12_19:35:09.31121 "VM Thread" prio=10 tid=0x000000000162f000
nid=0x652b runnable
2010-08-12_19:35:09.31122
2010-08-12_19:35:09.31122 "Gang worker#0 (Parallel GC Threads)"
prio=10 tid=0x000000000156b800 nid=0x6528 runnable
2010-08-12_19:35:09.31123
2010-08-12_19:35:09.31123 "Gang worker#1 (Parallel GC Threads)"
prio=10 tid=0x000000000156d000 nid=0x6529 runnable
2010-08-12_19:35:09.31124
2010-08-12_19:35:09.31124 "Concurrent Mark-Sweep GC Thread" prio=10
tid=0x00000000015c2800 nid=0x652a runnable
2010-08-12_19:35:09.31125 "VM Periodic Task Thread" prio=10
tid=0x0000000001667000 nid=0x6533 waiting on condition
2010-08-12_19:35:09.31125
2010-08-12_19:35:09.31125 JNI global references: 767
2010-08-12_19:35:09.31126
2010-08-12_19:35:09.31126 Heap
2010-08-12_19:35:09.31127  par new generation   total 19136K, used
12115K [0x0000000759400000, 0x000000075a8c0000, 0x000000075bd90000)
2010-08-12_19:35:09.31127   eden space 17024K,  68% used
[0x0000000759400000, 0x0000000759f6c698, 0x000000075a4a0000)
2010-08-12_19:35:09.31129   from space 2112K,  19% used
[0x000000075a6b0000, 0x000000075a718820, 0x000000075a8c0000)
2010-08-12_19:35:09.31129   to   space 2112K,   0% used
[0x000000075a4a0000, 0x000000075a4a0000, 0x000000075a6b0000)
2010-08-12_19:35:09.31130  concurrent mark-sweep generation total
109824K, used 34250K [0x000000075bd90000, 0x00000007628d0000,
0x00000007f5800000)
2010-08-12_19:35:09.31131  concurrent-mark-sweep perm gen total
35396K, used 30694K [0x00000007f5800000, 0x00000007f7a91000,
0x0000000800000000)
2010-08-12_19:35:09.31132


On Aug 12, 12:03 pm, Shay Banon <[hidden email]> wrote:

> It depends on the actual implementation. In general, the time it takes to do
> recovery is how fast we can get data from s3 when a full recovery from it is
> required (elasticsearch tries to minimize that by reusing the work dir,
> though you should set the recover_after_nodes setting).
>
> My concern is that maybe the recovery process is stuck somewhere. Not sure
> where though... . Is there a chance for a thread dump on one of the nodes
> that have a primary shard in INITIALIZING state? You can do that with
> visualvm.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 9:50 PM, Grant Rodgers <[hidden email]> wrote:
> > Yes, we are using the s3 gateway and version 0.9.0.
>
> > Would we get better gateway performance by using the fs gateway and a
> > network file system, like gluster, drbd, or ceph?
>
> > On Aug 12, 11:34 am, Shay Banon <[hidden email]> wrote:
> > > If I remember correctly, you use the s3 gateway, right? It seems like
> > some
> > > primary shards are stuck initializing (basically, recovering from the
> > > gateway).  I forgot to ask, which version are you using?
>
> > > -shay.banon
>
> > > On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
> > > > Fyi, 19 hours later, there is no change in the state of the cluster,
> > > > and the logs are still bouncing between created and recovering. We
> > > > will delete and recreate the missing indices, but it seems like
> > > > recovery should have happened by now. What is the expected recovery
> > > > time for a shard with a few hundred thousand documents?
>
> > > > Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
> > > > b6457780f145 node had been restarted a few times, and maybe that
> > > > caused some shards to get stuck in the relocating state? Still, I'd
> > > > expect that at least one shard would have recovered by now.
>
> > > > "routing_table" : {
> > > >    "indices" : {
> > > >      "versions" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "1" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "2" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "3" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "4" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "5" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          } ]
> > > >        }
> > > >      },
> > > >      "stats" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "1" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "2" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "3" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "4" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "5" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" :
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
Hi,

  Let me try and explain a bit how gateway recovery works. An index can have several shards each with 0 or more replicas. When you start your nodes fresh (you should use recover_after_nodes setting for optimal allocation), then first the primary shards are allocated to the nodes. They move to a cluster wide state of INITIALIZING and start recover from the gateway. In order no to create very high load on a node, only X number of shard recoveries are allowed per node (defaults to the number of processors / cores). Also, the number of open streams (a stream is, for example, fetching a specific file) is throttled and also defaults to the number of processors / cores.

  The gateway recovery process itself first fetches all the index files from the gateway, and then fetches the transaction log and applies all the operations within it to the index.

   It seems like all the shards on the primary shards are in INITIALIZING state on the nodes and being recovered from the gateway. The work directory should definitely grow over time, unless it got into the applying the translog operations to the index, where there is in memory buffering done during indexing time until it is flushed to disk.

   Are the shards still not recovered? Its really annoying that you can't really tell the recovery progress through an API. I will add it for 0.9.1. This should simplify tracking down whats going on.

-shay.banon

On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]> wrote:
Unfortunately the nodes mentioned originally have been stopped. But it
seems the new cluster is also having issues recovering. The work
directories have not changed in size for about 10 minutes.

Here is the routing table and nodes (thread dumps follow)

 "routing_table" : {
   "indices" : {
     "versions" : {
       "shards" : {
         "0" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "versions"
         } ],
         "1" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "versions"
         } ],
         "2" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 2,
           "index" : "versions"
         } ],
         "3" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 3,
           "index" : "versions"
         } ],
         "4" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 4,
           "index" : "versions"
         } ],
         "5" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "versions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 5,
           "index" : "versions"
         } ]
       }
     },
     "stats" : {
       "shards" : {
         "0" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "stats"
         } ],
         "1" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "stats"
         } ],
         "2" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 2,
           "index" : "stats"
         } ],
         "3" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 3,
           "index" : "stats"
         } ],
         "4" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 4,
           "index" : "stats"
         } ],
         "5" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 5,
           "index" : "stats"
         } ]
       }
     },
     "revisions" : {
       "shards" : {
         "0" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 0,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 0,
           "index" : "revisions"
         } ],
         "1" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 1,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 1,
           "index" : "revisions"
         } ],
         "2" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 2,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 2,
           "index" : "revisions"
         } ],
         "3" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 3,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 3,
           "index" : "revisions"
         } ],
         "4" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
           "relocating_node" : null,
           "shard" : 4,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 4,
           "index" : "revisions"
         } ],
         "5" : [ {
           "state" : "INITIALIZING",
           "primary" : true,
           "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
           "relocating_node" : null,
           "shard" : 5,
           "index" : "revisions"
         }, {
           "state" : "UNASSIGNED",
           "primary" : false,
           "node" : null,
           "relocating_node" : null,
           "shard" : 5,
           "index" : "revisions"
         } ]
       }
     }
   }
 },
 "routing_nodes" : {
   "unassigned" : [ {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 2,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 3,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 4,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 5,
     "index" : "versions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 2,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 2,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 3,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 3,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 4,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 4,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 5,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 5,
     "index" : "stats"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 0,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 1,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 2,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 3,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 4,
     "index" : "revisions"
   }, {
     "state" : "UNASSIGNED",
     "primary" : false,
     "node" : null,
     "relocating_node" : null,
     "shard" : 5,
     "index" : "revisions"
   } ],
   "nodes" : {
     "d9bd8b92-90d5-4892-bea8-46896b467974" : [ {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "stats"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 0,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 2,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "revisions"
     } ],
     "eeb2af20-13f7-41e3-9dca-30b768785b4f" : [ {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 5,
       "index" : "versions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 1,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 3,
       "index" : "revisions"
     }, {
       "state" : "INITIALIZING",
       "primary" : true,
       "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
       "relocating_node" : null,
       "shard" : 4,
       "index" : "revisions"
     } ]
   }


Here is a thread dump from node eeb2af20-13f7-41e3-9dca-30b768785b4f

2010-08-12_19:35:09.31077    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31079       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31080       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31080       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31081       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31081       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31082       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31083       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31084       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31084       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31085       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31085       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31086
2010-08-12_19:35:09.31086 "elasticsearch[Thumb, Tom][sc]-pool-2-
thread-2" daemon prio=10 tid=0x0000000001f8e800 nid=0x653b waiting on
condition [0x00007fda1441f000]
2010-08-12_19:35:09.31087    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31088       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31088       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31089       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31089       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31091       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31091       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31092       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31092       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31093       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31093       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31094       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31094
2010-08-12_19:35:09.31095 "elasticsearch[Thumb, Tom][sc]-pool-2-
thread-1" daemon prio=10 tid=0x00007fda101c6000 nid=0x653a waiting on
condition [0x00007fda14af4000]
2010-08-12_19:35:09.31095    java.lang.Thread.State: TIMED_WAITING
(parking)
2010-08-12_19:35:09.31096       at sun.misc.Unsafe.park(Native Method)
2010-08-12_19:35:09.31096       - parking to wait for
<0x000000075bf09e48> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
2010-08-12_19:35:09.31097       at
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
2010-08-12_19:35:09.31097       at
java.util.concurrent.locks.AbstractQueuedSynchronizer
$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2081)
2010-08-12_19:35:09.31099       at
java.util.concurrent.DelayQueue.take(DelayQueue.java:193)
2010-08-12_19:35:09.31099       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:688)
2010-08-12_19:35:09.31100       at
java.util.concurrent.ScheduledThreadPoolExecutor
$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:681)
2010-08-12_19:35:09.31100       at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:
1043)
2010-08-12_19:35:09.31101       at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:
1103)
2010-08-12_19:35:09.31102       at
java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:603)
2010-08-12_19:35:09.31103       at java.lang.Thread.run(Thread.java:
636)
2010-08-12_19:35:09.31103
2010-08-12_19:35:09.31104 "Low Memory Detector" daemon prio=10
tid=0x0000000001664000 nid=0x6532 runnable [0x0000000000000000]
2010-08-12_19:35:09.31104    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31105
2010-08-12_19:35:09.31105 "CompilerThread1" daemon prio=10
tid=0x0000000001661800 nid=0x6531 waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31106    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31106
2010-08-12_19:35:09.31107 "CompilerThread0" daemon prio=10
tid=0x000000000165b000 nid=0x6530 waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31108    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31109
2010-08-12_19:35:09.31109 "Signal Dispatcher" daemon prio=10
tid=0x0000000001659000 nid=0x652f waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31110    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31110
2010-08-12_19:35:09.31110 "Surrogate Locker Thread (CMS)" daemon
prio=10 tid=0x0000000001657800 nid=0x652e waiting on condition
[0x0000000000000000]
2010-08-12_19:35:09.31111    java.lang.Thread.State: RUNNABLE
2010-08-12_19:35:09.31111
2010-08-12_19:35:09.31112 "Finalizer" daemon prio=10
tid=0x0000000001635800 nid=0x652d in Object.wait()
[0x00007fda152c4000]
2010-08-12_19:35:09.31112    java.lang.Thread.State: WAITING (on
object monitor)
2010-08-12_19:35:09.31113       at java.lang.Object.wait(Native
Method)
2010-08-12_19:35:09.31113       - waiting on <0x000000075bd900c0> (a
java.lang.ref.ReferenceQueue$Lock)
2010-08-12_19:35:09.31114       at
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
2010-08-12_19:35:09.31114       - locked <0x000000075bd900c0> (a
java.lang.ref.ReferenceQueue$Lock)
2010-08-12_19:35:09.31115       at
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
2010-08-12_19:35:09.31115       at java.lang.ref.Finalizer
$FinalizerThread.run(Finalizer.java:177)
2010-08-12_19:35:09.31116
2010-08-12_19:35:09.31116 "Reference Handler" daemon prio=10
tid=0x0000000001633800 nid=0x652c in Object.wait()
[0x00007fda153c5000]
2010-08-12_19:35:09.31118    java.lang.Thread.State: WAITING (on
object monitor)
2010-08-12_19:35:09.31118       at java.lang.Object.wait(Native
Method)
2010-08-12_19:35:09.31119       - waiting on <0x000000075bd900f0> (a
java.lang.ref.Reference$Lock)
2010-08-12_19:35:09.31119       at java.lang.Object.wait(Object.java:
502)
2010-08-12_19:35:09.31120       at java.lang.ref.Reference
$ReferenceHandler.run(Reference.java:133)
2010-08-12_19:35:09.31120       - locked <0x000000075bd900f0> (a
java.lang.ref.Reference$Lock)
2010-08-12_19:35:09.31121
2010-08-12_19:35:09.31121 "VM Thread" prio=10 tid=0x000000000162f000
nid=0x652b runnable
2010-08-12_19:35:09.31122
2010-08-12_19:35:09.31122 "Gang worker#0 (Parallel GC Threads)"
prio=10 tid=0x000000000156b800 nid=0x6528 runnable
2010-08-12_19:35:09.31123
2010-08-12_19:35:09.31123 "Gang worker#1 (Parallel GC Threads)"
prio=10 tid=0x000000000156d000 nid=0x6529 runnable
2010-08-12_19:35:09.31124
2010-08-12_19:35:09.31124 "Concurrent Mark-Sweep GC Thread" prio=10
tid=0x00000000015c2800 nid=0x652a runnable
2010-08-12_19:35:09.31125 "VM Periodic Task Thread" prio=10
tid=0x0000000001667000 nid=0x6533 waiting on condition
2010-08-12_19:35:09.31125
2010-08-12_19:35:09.31125 JNI global references: 767
2010-08-12_19:35:09.31126
2010-08-12_19:35:09.31126 Heap
2010-08-12_19:35:09.31127  par new generation   total 19136K, used
12115K [0x0000000759400000, 0x000000075a8c0000, 0x000000075bd90000)
2010-08-12_19:35:09.31127   eden space 17024K,  68% used
[0x0000000759400000, 0x0000000759f6c698, 0x000000075a4a0000)
2010-08-12_19:35:09.31129   from space 2112K,  19% used
[0x000000075a6b0000, 0x000000075a718820, 0x000000075a8c0000)
2010-08-12_19:35:09.31129   to   space 2112K,   0% used
[0x000000075a4a0000, 0x000000075a4a0000, 0x000000075a6b0000)
2010-08-12_19:35:09.31130  concurrent mark-sweep generation total
109824K, used 34250K [0x000000075bd90000, 0x00000007628d0000,
0x00000007f5800000)
2010-08-12_19:35:09.31131  concurrent-mark-sweep perm gen total
35396K, used 30694K [0x00000007f5800000, 0x00000007f7a91000,
0x0000000800000000)
2010-08-12_19:35:09.31132


On Aug 12, 12:03 pm, Shay Banon <[hidden email]> wrote:
> It depends on the actual implementation. In general, the time it takes to do
> recovery is how fast we can get data from s3 when a full recovery from it is
> required (elasticsearch tries to minimize that by reusing the work dir,
> though you should set the recover_after_nodes setting).
>
> My concern is that maybe the recovery process is stuck somewhere. Not sure
> where though... . Is there a chance for a thread dump on one of the nodes
> that have a primary shard in INITIALIZING state? You can do that with
> visualvm.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 9:50 PM, Grant Rodgers <[hidden email]> wrote:
> > Yes, we are using the s3 gateway and version 0.9.0.
>
> > Would we get better gateway performance by using the fs gateway and a
> > network file system, like gluster, drbd, or ceph?
>
> > On Aug 12, 11:34 am, Shay Banon <[hidden email]> wrote:
> > > If I remember correctly, you use the s3 gateway, right? It seems like
> > some
> > > primary shards are stuck initializing (basically, recovering from the
> > > gateway).  I forgot to ask, which version are you using?
>
> > > -shay.banon
>
> > > On Thu, Aug 12, 2010 at 9:01 PM, Grant Rodgers <[hidden email]> wrote:
> > > > Fyi, 19 hours later, there is no change in the state of the cluster,
> > > > and the logs are still bouncing between created and recovering. We
> > > > will delete and recreate the missing indices, but it seems like
> > > > recovery should have happened by now. What is the expected recovery
> > > > time for a shard with a few hundred thousand documents?
>
> > > > Here is the cluster state for reference. The dacbe3f4-ef13-43ba-aa13-
> > > > b6457780f145 node had been restarted a few times, and maybe that
> > > > caused some shards to get stuck in the relocating state? Still, I'd
> > > > expect that at least one shard would have recovered by now.
>
> > > > "routing_table" : {
> > > >    "indices" : {
> > > >      "versions" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "1" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "2" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "3" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "4" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "5" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          } ]
> > > >        }
> > > >      },
> > > >      "stats" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "1" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "2" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "3" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "4" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : false,
> > > >            "node" : "dacbe3f4-ef13-43ba-aa13-b6457780f145",
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "stats"
> > > >          } ],
> > > >          "5" : [ {
> > > >            "state" : "STARTED",
> > > >            "primary" : false,
> > > >            "node" : "b411c443-ebdd-416a-ac35-cce5af38fa3f",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" : "STARTED",
> > > >            "primary" : true,
> > > >            "node" : "a4eada23-ab66-4ab5-bf20-8de09e25a6a3",
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "stats"
> > > >          }, {
> > > >            "state" :
>
> ...
>
> read more »

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
Ok, thanks for the explanation. Good to know about the throttling
parameters. We are now using recover_after_time: 1m to stabilize shard
allocation.

Maybe it would be worth it to prioritize local shard recovery since (I
assume) that can happen quicker than gateway recovery? I could be
completely wrong about how it works though.

I let the recovery process run until about 15 minutes ago, so roughly
3.5 hours. There was still no change in the size of the work
directories, and no shards had changed state out of INITIALIZING. It
is curious, because I think at least some of the gateway data had been
transferred, so I think the recovery started ok, but maybe at some
point got stuck.

We decided to just delete the versions and revisions indices and
recreate them. Now I am watching the recovery process of the other
index, stats. it seems to be progressing smoothly. Maybe there was
something unusual about the other indexes that causes the recovery
process to get wedged.

An API to watch recovery status would be super useful, looking forward
to that.

Grant

On Aug 12, 3:14 pm, Shay Banon <[hidden email]> wrote:

> Hi,
>
>   Let me try and explain a bit how gateway recovery works. An index can have
> several shards each with 0 or more replicas. When you start your nodes fresh
> (you should use recover_after_nodes setting for optimal allocation), then
> first the primary shards are allocated to the nodes. They move to a cluster
> wide state of INITIALIZING and start recover from the gateway. In order no
> to create very high load on a node, only X number of shard recoveries are
> allowed per node (defaults to the number of processors / cores). Also, the
> number of open streams (a stream is, for example, fetching a specific file)
> is throttled and also defaults to the number of processors / cores.
>
>   The gateway recovery process itself first fetches all the index files from
> the gateway, and then fetches the transaction log and applies all the
> operations within it to the index.
>
>    It seems like all the shards on the primary shards are in INITIALIZING
> state on the nodes and being recovered from the gateway. The work directory
> should definitely grow over time, unless it got into the applying the
> translog operations to the index, where there is in memory buffering done
> during indexing time until it is flushed to disk.
>
>    Are the shards still not recovered? Its really annoying that you can't
> really tell the recovery progress through an API. I will add it for 0.9.1.
> This should simplify tracking down whats going on.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]> wrote:
> > Unfortunately the nodes mentioned originally have been stopped. But it
> > seems the new cluster is also having issues recovering. The work
> > directories have not changed in size for about 10 minutes.
>
> > Here is the routing table and nodes (thread dumps follow)
>
> >  "routing_table" : {
> >    "indices" : {
> >      "versions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          } ]
> >        }
> >      },
> >      "stats" : {
> >        "shards" : {
> >          "0" : [ {
> >             "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >             "index" : "stats"
> >           } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >             "index" : "stats"
> >           } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >             "index" : "stats"
> >           } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >             "index" : "stats"
> >           } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >             "index" : "stats"
> >           } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >             "index" : "stats"
> >          } ]
> >        }
> >      },
> >      "revisions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          } ],
> >          "2" : [ {
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
Hi,

   There isn't really any local shard recovery. The "reuse work location" feature works by allocating shards to nodes that have the smallest diff (diff between the local work storage for that shard before it was shutdown and the gateway). The recovery process for the shard, once its allocated to that node, only fetches the changes between the local storage and the gateway. This, when issuing complete cluster shutdown (and always make sure you do it using the shutdown API), and not a complete disaster where the disks are gone, can greatly reduce the time it takes to recover from the gateway.

   The only place that I can think where it might get stuck is in the API to s3. I am using the official AWS SDK for that, I will check where this might happen. I did not see it in the thread dump you sent, but I forgot to mention that it looks suspiciously small on threads... (funnily enough, the AWS S3 gateway is probably the only place in elasticsearch where thread dump might make sense, since they use blocking IO).

   I am sorry that this happened. I will work on the recovery status API and ping you once its done. This way it will be easier to tell at least where its stuck. I assume there were no errors in the logs, right?

-shay.banon

On Fri, Aug 13, 2010 at 2:05 AM, Grant Rodgers <[hidden email]> wrote:
Ok, thanks for the explanation. Good to know about the throttling
parameters. We are now using recover_after_time: 1m to stabilize shard
allocation.

Maybe it would be worth it to prioritize local shard recovery since (I
assume) that can happen quicker than gateway recovery? I could be
completely wrong about how it works though.

I let the recovery process run until about 15 minutes ago, so roughly
3.5 hours. There was still no change in the size of the work
directories, and no shards had changed state out of INITIALIZING. It
is curious, because I think at least some of the gateway data had been
transferred, so I think the recovery started ok, but maybe at some
point got stuck.

We decided to just delete the versions and revisions indices and
recreate them. Now I am watching the recovery process of the other
index, stats. it seems to be progressing smoothly. Maybe there was
something unusual about the other indexes that causes the recovery
process to get wedged.

An API to watch recovery status would be super useful, looking forward
to that.

Grant

On Aug 12, 3:14 pm, Shay Banon <[hidden email]> wrote:
> Hi,
>
>   Let me try and explain a bit how gateway recovery works. An index can have
> several shards each with 0 or more replicas. When you start your nodes fresh
> (you should use recover_after_nodes setting for optimal allocation), then
> first the primary shards are allocated to the nodes. They move to a cluster
> wide state of INITIALIZING and start recover from the gateway. In order no
> to create very high load on a node, only X number of shard recoveries are
> allowed per node (defaults to the number of processors / cores). Also, the
> number of open streams (a stream is, for example, fetching a specific file)
> is throttled and also defaults to the number of processors / cores.
>
>   The gateway recovery process itself first fetches all the index files from
> the gateway, and then fetches the transaction log and applies all the
> operations within it to the index.
>
>    It seems like all the shards on the primary shards are in INITIALIZING
> state on the nodes and being recovered from the gateway. The work directory
> should definitely grow over time, unless it got into the applying the
> translog operations to the index, where there is in memory buffering done
> during indexing time until it is flushed to disk.
>
>    Are the shards still not recovered? Its really annoying that you can't
> really tell the recovery progress through an API. I will add it for 0.9.1.
> This should simplify tracking down whats going on.
>
> -shay.banon
>
> On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]> wrote:
> > Unfortunately the nodes mentioned originally have been stopped. But it
> > seems the new cluster is also having issues recovering. The work
> > directories have not changed in size for about 10 minutes.
>
> > Here is the routing table and nodes (thread dumps follow)
>
> >  "routing_table" : {
> >    "indices" : {
> >      "versions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "versions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "versions"
> >          } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "versions"
> >          } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "versions"
> >          } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "versions"
> >          } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "versions"
> >          } ]
> >        }
> >      },
> >      "stats" : {
> >        "shards" : {
> >          "0" : [ {
> >             "state" : "INITIALIZING",
> >            "primary" : true,
> >            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >             "index" : "stats"
> >           } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >             "index" : "stats"
> >           } ],
> >          "2" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 2,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 2,
> >             "index" : "stats"
> >           } ],
> >          "3" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 3,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 3,
> >             "index" : "stats"
> >           } ],
> >          "4" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 4,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 4,
> >             "index" : "stats"
> >           } ],
> >          "5" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 5,
> >            "index" : "stats"
> >          }, {
> >             "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >             "index" : "stats"
> >           }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 5,
> >             "index" : "stats"
> >          } ]
> >        }
> >      },
> >      "revisions" : {
> >        "shards" : {
> >          "0" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> >             "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 0,
> >            "index" : "revisions"
> >          } ],
> >          "1" : [ {
> >            "state" : "INITIALIZING",
> >            "primary" : true,
> >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> >             "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          }, {
> >            "state" : "UNASSIGNED",
> >            "primary" : false,
> >            "node" : null,
> >            "relocating_node" : null,
> >            "shard" : 1,
> >            "index" : "revisions"
> >          } ],
> >          "2" : [ {
>
> ...
>
> read more »

Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
I haven't seen any errors in the log, but there have been a few
warnings.

[00:32:39,475][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
[0] no file [_91q0.cfs] to recover, even though it has md5, ignoring
it
[00:32:45,591][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
[0] no file [_5k99.cfs] to recover, even though it has md5, ignoring
it

Right now, for our stats index, shard 0 appears to be stuck in the
same way as previous attempts. In an attempt to determine whether it
was a corrupted shard or some transient error, I stopped that node and
let the shard recover on a different node. It is still initializing on
that node, and that's where I saw those warning messages.

Could it be an issue with the gateway data? Maybe something can happen
to a gateway snapshot that makes it unrecoverable.

Thanks for all the quick responses and advice!

On Aug 12, 4:13 pm, Shay Banon <[hidden email]> wrote:

> Hi,
>
>    There isn't really any local shard recovery. The "reuse work location"
> feature works by allocating shards to nodes that have the smallest diff
> (diff between the local work storage for that shard before it was shutdown
> and the gateway). The recovery process for the shard, once its allocated to
> that node, only fetches the changes between the local storage and the
> gateway. This, when issuing complete cluster shutdown (and always make sure
> you do it using the shutdown API), and not a complete disaster where the
> disks are gone, can greatly reduce the time it takes to recover from the
> gateway.
>
>    The only place that I can think where it might get stuck is in the API to
> s3. I am using the official AWS SDK for that, I will check where this might
> happen. I did not see it in the thread dump you sent, but I forgot to
> mention that it looks suspiciously small on threads... (funnily enough, the
> AWS S3 gateway is probably the only place in elasticsearch where thread dump
> might make sense, since they use blocking IO).
>
>    I am sorry that this happened. I will work on the recovery status API and
> ping you once its done. This way it will be easier to tell at least where
> its stuck. I assume there were no errors in the logs, right?
>
> -shay.banon
>
> On Fri, Aug 13, 2010 at 2:05 AM, Grant Rodgers <[hidden email]> wrote:
> > Ok, thanks for the explanation. Good to know about the throttling
> > parameters. We are now using recover_after_time: 1m to stabilize shard
> > allocation.
>
> > Maybe it would be worth it to prioritize local shard recovery since (I
> > assume) that can happen quicker than gateway recovery? I could be
> > completely wrong about how it works though.
>
> > I let the recovery process run until about 15 minutes ago, so roughly
> > 3.5 hours. There was still no change in the size of the work
> > directories, and no shards had changed state out of INITIALIZING. It
> > is curious, because I think at least some of the gateway data had been
> > transferred, so I think the recovery started ok, but maybe at some
> > point got stuck.
>
> > We decided to just delete the versions and revisions indices and
> > recreate them. Now I am watching the recovery process of the other
> > index, stats. it seems to be progressing smoothly. Maybe there was
> > something unusual about the other indexes that causes the recovery
> > process to get wedged.
>
> > An API to watch recovery status would be super useful, looking forward
> > to that.
>
> > Grant
>
> > On Aug 12, 3:14 pm, Shay Banon <[hidden email]> wrote:
> > > Hi,
>
> > >   Let me try and explain a bit how gateway recovery works. An index can
> > have
> > > several shards each with 0 or more replicas. When you start your nodes
> > fresh
> > > (you should use recover_after_nodes setting for optimal allocation), then
> > > first the primary shards are allocated to the nodes. They move to a
> > cluster
> > > wide state of INITIALIZING and start recover from the gateway. In order
> > no
> > > to create very high load on a node, only X number of shard recoveries are
> > > allowed per node (defaults to the number of processors / cores). Also,
> > the
> > > number of open streams (a stream is, for example, fetching a specific
> > file)
> > > is throttled and also defaults to the number of processors / cores.
>
> > >   The gateway recovery process itself first fetches all the index files
> > from
> > > the gateway, and then fetches the transaction log and applies all the
> > > operations within it to the index.
>
> > >    It seems like all the shards on the primary shards are in INITIALIZING
> > > state on the nodes and being recovered from the gateway. The work
> > directory
> > > should definitely grow over time, unless it got into the applying the
> > > translog operations to the index, where there is in memory buffering done
> > > during indexing time until it is flushed to disk.
>
> > >    Are the shards still not recovered? Its really annoying that you can't
> > > really tell the recovery progress through an API. I will add it for
> > 0.9.1.
> > > This should simplify tracking down whats going on.
>
> > > -shay.banon
>
> > > On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]>
> > wrote:
> > > > Unfortunately the nodes mentioned originally have been stopped. But it
> > > > seems the new cluster is also having issues recovering. The work
> > > > directories have not changed in size for about 10 minutes.
>
> > > > Here is the routing table and nodes (thread dumps follow)
>
> > > >  "routing_table" : {
> > > >    "indices" : {
> > > >      "versions" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "1" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > >             "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "2" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "3" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > >             "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 3,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "4" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 4,
> > > >            "index" : "versions"
> > > >          } ],
> > > >          "5" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > >             "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 5,
> > > >            "index" : "versions"
> > > >          } ]
> > > >        }
> > > >      },
> > > >      "stats" : {
> > > >        "shards" : {
> > > >          "0" : [ {
> > > >             "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >            "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 0,
> > > >            "index" : "stats"
> > > >          }, {
> > > >             "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >             "index" : "stats"
> > > >           }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 0,
> > > >             "index" : "stats"
> > > >           } ],
> > > >          "1" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 1,
> > > >            "index" : "stats"
> > > >          }, {
> > > >             "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >             "index" : "stats"
> > > >           }, {
> > > >            "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 1,
> > > >             "index" : "stats"
> > > >           } ],
> > > >          "2" : [ {
> > > >            "state" : "INITIALIZING",
> > > >            "primary" : true,
> > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > >             "relocating_node" : null,
> > > >            "shard" : 2,
> > > >            "index" : "stats"
> > > >          }, {
> > > >             "state" : "UNASSIGNED",
> > > >            "primary" : false,
> > > >            "node" : null,
> > > >            "relocating_node" : null,
> > > >            "shard" : 2,
> > > >             "index" : "stats"
> > > >           }, {
> > > >            "state" : "UNASSIGNED",
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

Grant Rodgers
This was fixed by http://github.com/elasticsearch/elasticsearch/issues/closed#issue/318.
Today our index recovered successfully from s3.

Thanks for the quick turnaround! One day from report to fix - not many
projects can do that.

On Aug 12, 5:56 pm, Grant Rodgers <[hidden email]> wrote:

> I haven't seen any errors in the log, but there have been a few
> warnings.
>
> [00:32:39,475][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
> [0] no file [_91q0.cfs] to recover, even though it has md5, ignoring
> it
> [00:32:45,591][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
> [0] no file [_5k99.cfs] to recover, even though it has md5, ignoring
> it
>
> Right now, for our stats index, shard 0 appears to be stuck in the
> same way as previous attempts. In an attempt to determine whether it
> was a corrupted shard or some transient error, I stopped that node and
> let the shard recover on a different node. It is still initializing on
> that node, and that's where I saw those warning messages.
>
> Could it be an issue with the gateway data? Maybe something can happen
> to a gateway snapshot that makes it unrecoverable.
>
> Thanks for all the quick responses and advice!
>
> On Aug 12, 4:13 pm, Shay Banon <[hidden email]> wrote:
>
> > Hi,
>
> >    There isn't really any local shard recovery. The "reuse work location"
> > feature works by allocating shards to nodes that have the smallest diff
> > (diff between the local work storage for that shard before it was shutdown
> > and the gateway). The recovery process for the shard, once its allocated to
> > that node, only fetches the changes between the local storage and the
> > gateway. This, when issuing complete cluster shutdown (and always make sure
> > you do it using the shutdown API), and not a complete disaster where the
> > disks are gone, can greatly reduce the time it takes to recover from the
> > gateway.
>
> >    The only place that I can think where it might get stuck is in the API to
> > s3. I am using the official AWS SDK for that, I will check where this might
> > happen. I did not see it in the thread dump you sent, but I forgot to
> > mention that it looks suspiciously small on threads... (funnily enough, the
> > AWS S3 gateway is probably the only place in elasticsearch where thread dump
> > might make sense, since they use blocking IO).
>
> >    I am sorry that this happened. I will work on the recovery status API and
> > ping you once its done. This way it will be easier to tell at least where
> > its stuck. I assume there were no errors in the logs, right?
>
> > -shay.banon
>
> > On Fri, Aug 13, 2010 at 2:05 AM, Grant Rodgers <[hidden email]> wrote:
> > > Ok, thanks for the explanation. Good to know about the throttling
> > > parameters. We are now using recover_after_time: 1m to stabilize shard
> > > allocation.
>
> > > Maybe it would be worth it to prioritize local shard recovery since (I
> > > assume) that can happen quicker than gateway recovery? I could be
> > > completely wrong about how it works though.
>
> > > I let the recovery process run until about 15 minutes ago, so roughly
> > > 3.5 hours. There was still no change in the size of the work
> > > directories, and no shards had changed state out of INITIALIZING. It
> > > is curious, because I think at least some of the gateway data had been
> > > transferred, so I think the recovery started ok, but maybe at some
> > > point got stuck.
>
> > > We decided to just delete the versions and revisions indices and
> > > recreate them. Now I am watching the recovery process of the other
> > > index, stats. it seems to be progressing smoothly. Maybe there was
> > > something unusual about the other indexes that causes the recovery
> > > process to get wedged.
>
> > > An API to watch recovery status would be super useful, looking forward
> > > to that.
>
> > > Grant
>
> > > On Aug 12, 3:14 pm, Shay Banon <[hidden email]> wrote:
> > > > Hi,
>
> > > >   Let me try and explain a bit how gateway recovery works. An index can
> > > have
> > > > several shards each with 0 or more replicas. When you start your nodes
> > > fresh
> > > > (you should use recover_after_nodes setting for optimal allocation), then
> > > > first the primary shards are allocated to the nodes. They move to a
> > > cluster
> > > > wide state of INITIALIZING and start recover from the gateway. In order
> > > no
> > > > to create very high load on a node, only X number of shard recoveries are
> > > > allowed per node (defaults to the number of processors / cores). Also,
> > > the
> > > > number of open streams (a stream is, for example, fetching a specific
> > > file)
> > > > is throttled and also defaults to the number of processors / cores.
>
> > > >   The gateway recovery process itself first fetches all the index files
> > > from
> > > > the gateway, and then fetches the transaction log and applies all the
> > > > operations within it to the index.
>
> > > >    It seems like all the shards on the primary shards are in INITIALIZING
> > > > state on the nodes and being recovered from the gateway. The work
> > > directory
> > > > should definitely grow over time, unless it got into the applying the
> > > > translog operations to the index, where there is in memory buffering done
> > > > during indexing time until it is flushed to disk.
>
> > > >    Are the shards still not recovered? Its really annoying that you can't
> > > > really tell the recovery progress through an API. I will add it for
> > > 0.9.1.
> > > > This should simplify tracking down whats going on.
>
> > > > -shay.banon
>
> > > > On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]>
> > > wrote:
> > > > > Unfortunately the nodes mentioned originally have been stopped. But it
> > > > > seems the new cluster is also having issues recovering. The work
> > > > > directories have not changed in size for about 10 minutes.
>
> > > > > Here is the routing table and nodes (thread dumps follow)
>
> > > > >  "routing_table" : {
> > > > >    "indices" : {
> > > > >      "versions" : {
> > > > >        "shards" : {
> > > > >          "0" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 0,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 0,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "1" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 1,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 1,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "2" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 2,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 2,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "3" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 3,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 3,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "4" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 4,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 4,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "5" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 5,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 5,
> > > > >            "index" : "versions"
> > > > >          } ]
> > > > >        }
> > > > >      },
> > > > >      "stats" : {
> > > > >        "shards" : {
> > > > >          "0" : [ {
> > > > >             "state" : "INITIALIZING",
>
> ...
>
> read more »
Reply | Threaded
Open this post in threaded view
|

Re: apparent infinite loop in node recovery

kimchy
Administrator
Great that its fixed! This bug will certainly push the timeline for 0.9.1 release to a closer date (probably sometime next week).

-shay.banon

On Sat, Aug 14, 2010 at 12:59 AM, Grant Rodgers <[hidden email]> wrote:
This was fixed by http://github.com/elasticsearch/elasticsearch/issues/closed#issue/318.
Today our index recovered successfully from s3.

Thanks for the quick turnaround! One day from report to fix - not many
projects can do that.

On Aug 12, 5:56 pm, Grant Rodgers <[hidden email]> wrote:
> I haven't seen any errors in the log, but there have been a few
> warnings.
>
> [00:32:39,475][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
> [0] no file [_91q0.cfs] to recover, even though it has md5, ignoring
> it
> [00:32:45,591][WARN ][index.gateway.s3         ] [Sunset Bain] [stats]
> [0] no file [_5k99.cfs] to recover, even though it has md5, ignoring
> it
>
> Right now, for our stats index, shard 0 appears to be stuck in the
> same way as previous attempts. In an attempt to determine whether it
> was a corrupted shard or some transient error, I stopped that node and
> let the shard recover on a different node. It is still initializing on
> that node, and that's where I saw those warning messages.
>
> Could it be an issue with the gateway data? Maybe something can happen
> to a gateway snapshot that makes it unrecoverable.
>
> Thanks for all the quick responses and advice!
>
> On Aug 12, 4:13 pm, Shay Banon <[hidden email]> wrote:
>
> > Hi,
>
> >    There isn't really any local shard recovery. The "reuse work location"
> > feature works by allocating shards to nodes that have the smallest diff
> > (diff between the local work storage for that shard before it was shutdown
> > and the gateway). The recovery process for the shard, once its allocated to
> > that node, only fetches the changes between the local storage and the
> > gateway. This, when issuing complete cluster shutdown (and always make sure
> > you do it using the shutdown API), and not a complete disaster where the
> > disks are gone, can greatly reduce the time it takes to recover from the
> > gateway.
>
> >    The only place that I can think where it might get stuck is in the API to
> > s3. I am using the official AWS SDK for that, I will check where this might
> > happen. I did not see it in the thread dump you sent, but I forgot to
> > mention that it looks suspiciously small on threads... (funnily enough, the
> > AWS S3 gateway is probably the only place in elasticsearch where thread dump
> > might make sense, since they use blocking IO).
>
> >    I am sorry that this happened. I will work on the recovery status API and
> > ping you once its done. This way it will be easier to tell at least where
> > its stuck. I assume there were no errors in the logs, right?
>
> > -shay.banon
>
> > On Fri, Aug 13, 2010 at 2:05 AM, Grant Rodgers <[hidden email]> wrote:
> > > Ok, thanks for the explanation. Good to know about the throttling
> > > parameters. We are now using recover_after_time: 1m to stabilize shard
> > > allocation.
>
> > > Maybe it would be worth it to prioritize local shard recovery since (I
> > > assume) that can happen quicker than gateway recovery? I could be
> > > completely wrong about how it works though.
>
> > > I let the recovery process run until about 15 minutes ago, so roughly
> > > 3.5 hours. There was still no change in the size of the work
> > > directories, and no shards had changed state out of INITIALIZING. It
> > > is curious, because I think at least some of the gateway data had been
> > > transferred, so I think the recovery started ok, but maybe at some
> > > point got stuck.
>
> > > We decided to just delete the versions and revisions indices and
> > > recreate them. Now I am watching the recovery process of the other
> > > index, stats. it seems to be progressing smoothly. Maybe there was
> > > something unusual about the other indexes that causes the recovery
> > > process to get wedged.
>
> > > An API to watch recovery status would be super useful, looking forward
> > > to that.
>
> > > Grant
>
> > > On Aug 12, 3:14 pm, Shay Banon <[hidden email]> wrote:
> > > > Hi,
>
> > > >   Let me try and explain a bit how gateway recovery works. An index can
> > > have
> > > > several shards each with 0 or more replicas. When you start your nodes
> > > fresh
> > > > (you should use recover_after_nodes setting for optimal allocation), then
> > > > first the primary shards are allocated to the nodes. They move to a
> > > cluster
> > > > wide state of INITIALIZING and start recover from the gateway. In order
> > > no
> > > > to create very high load on a node, only X number of shard recoveries are
> > > > allowed per node (defaults to the number of processors / cores). Also,
> > > the
> > > > number of open streams (a stream is, for example, fetching a specific
> > > file)
> > > > is throttled and also defaults to the number of processors / cores.
>
> > > >   The gateway recovery process itself first fetches all the index files
> > > from
> > > > the gateway, and then fetches the transaction log and applies all the
> > > > operations within it to the index.
>
> > > >    It seems like all the shards on the primary shards are in INITIALIZING
> > > > state on the nodes and being recovered from the gateway. The work
> > > directory
> > > > should definitely grow over time, unless it got into the applying the
> > > > translog operations to the index, where there is in memory buffering done
> > > > during indexing time until it is flushed to disk.
>
> > > >    Are the shards still not recovered? Its really annoying that you can't
> > > > really tell the recovery progress through an API. I will add it for
> > > 0.9.1.
> > > > This should simplify tracking down whats going on.
>
> > > > -shay.banon
>
> > > > On Thu, Aug 12, 2010 at 10:39 PM, Grant Rodgers <[hidden email]>
> > > wrote:
> > > > > Unfortunately the nodes mentioned originally have been stopped. But it
> > > > > seems the new cluster is also having issues recovering. The work
> > > > > directories have not changed in size for about 10 minutes.
>
> > > > > Here is the routing table and nodes (thread dumps follow)
>
> > > > >  "routing_table" : {
> > > > >    "indices" : {
> > > > >      "versions" : {
> > > > >        "shards" : {
> > > > >          "0" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 0,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 0,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "1" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 1,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 1,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "2" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 2,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 2,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "3" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 3,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 3,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "4" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "d9bd8b92-90d5-4892-bea8-46896b467974",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 4,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 4,
> > > > >            "index" : "versions"
> > > > >          } ],
> > > > >          "5" : [ {
> > > > >            "state" : "INITIALIZING",
> > > > >            "primary" : true,
> > > > >             "node" : "eeb2af20-13f7-41e3-9dca-30b768785b4f",
> > > > >             "relocating_node" : null,
> > > > >            "shard" : 5,
> > > > >            "index" : "versions"
> > > > >          }, {
> > > > >            "state" : "UNASSIGNED",
> > > > >            "primary" : false,
> > > > >            "node" : null,
> > > > >            "relocating_node" : null,
> > > > >            "shard" : 5,
> > > > >            "index" : "versions"
> > > > >          } ]
> > > > >        }
> > > > >      },
> > > > >      "stats" : {
> > > > >        "shards" : {
> > > > >          "0" : [ {
> > > > >             "state" : "INITIALIZING",
>
> ...
>
> read more »