Fault tolerant elasticsearch (JVM Heap OOM)


foufos
Dear elasticsearch users

I am running a PHP web application whose data layer is built on 3 elasticsearch nodes.

Once in a while an individual node may fail (e.g. one recently ran into a JVM heap OOM), but the cluster still becomes green again (2 nodes required), so I would like to make the application fault tolerant.

What is the best practice to avoid sending requests to the instance that fails?

Would you implement health checks at the application layer?

Any examples or advice would be much appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Re: Fault tolerant elasticsearch (JVM Heap OOM)

joergprante@gmail.com
What exactly are you doing? Are you indexing documents? Are you
searching for documents? Do you keep your queries in a logfile so you
can trace what is going on? Did you enable logging at GC level in
Elasticsearch? Do you have a strategy for sizing your application, that is,
have you calculated in advance how many resources you will need?

Jörg

On 02.04.13 18:03, foufos wrote:

> Dear elasticsearch users
>
> I am running a PHP web application whose data layer is built on 3
> elasticsearch nodes.
>
> Once in a while an individual node may fail (e.g. one recently ran
> into a JVM heap OOM), but the cluster still becomes green again
> (2 nodes required), so I would like to make the application fault
> tolerant.
>
> What is the best practice to avoid sending requests to the instance
> that fails?
>
> Would you implement healthchecks at the application layer?
>
> Any examples or advice would be much appreciated

Re: Fault tolerant elasticsearch (JVM Heap OOM)

q42jaap
We've considered using haproxy to load-balance (round-robin) the REST calls across the different nodes. haproxy can easily do a health check, either by pinging the machine or by sending HTTP requests and checking the response.
In the end we went with an off-the-shelf load balancer from our hosting company. This seems to work just fine, but we haven't tested that ourselves.
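
For readers without a load balancer, the same idea (round-robin plus a health check per node) can be sketched in application code. This is only a minimal illustration in Python, not what haproxy does internally: the class name RoundRobinPool, the probe function http_ok and the node URLs are made up for the example, and the probe simply assumes that a healthy node answers a plain GET on its HTTP port with status 200.

```python
import urllib.request
from typing import Callable, Iterable


def http_ok(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET on `url` answers with HTTP 200 within `timeout`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, timeouts and refused connections
        return False


class RoundRobinPool:
    """Hand out node URLs in rotation, skipping nodes that fail the probe."""

    def __init__(self, nodes: Iterable[str], probe: Callable[[str], bool] = http_ok):
        self._nodes = list(nodes)
        self._probe = probe
        self._next = 0  # index of the node to try first on the next pick

    def pick(self) -> str:
        # Try each node at most once per call, starting where we left off.
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if self._probe(node):
                return node
        raise RuntimeError("no healthy Elasticsearch node available")


if __name__ == "__main__":
    # Hypothetical node addresses; replace with your own.
    pool = RoundRobinPool(["http://es1:9200", "http://es2:9200", "http://es3:9200"])
    print(pool.pick())
```

Probing on every pick adds a round trip per request; a real setup would more likely cache the health state for a few seconds, which is roughly what a load balancer's periodic checks give you.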

Jaap

Jaap Taal
 
[ Q42 BV | tel 070 44523 42 | direct 070 44523 65 | http://q42.nl | Waldorpstraat 17F, Den Haag | Vijzelstraat 72 unit 4.23, Amsterdam | KvK 30164662 ]


On Tue, Apr 2, 2013 at 6:07 PM, Jörg Prante <[hidden email]> wrote:
What exactly are you doing? Are you indexing documents? Are you searching for documents? Do you keep your queries in a logfile so you can trace what is going on? Did you enable logging at GC level in Elasticsearch? Do you have a strategy for sizing your application, that is, have you calculated in advance how many resources you will need?


Re: Fault tolerant elasticsearch (JVM Heap OOM)

Alexander Reelsen-2
In reply to this post by foufos
Hey

Another solution might be to run a client node on your web application server. This is an elasticsearch node which does not hold any data and is not allowed to become master, but still knows the cluster's internal structure and which nodes can be queried (and a little bit more). There is a comment about that configuration in the default elasticsearch.yml configuration as well (which is obviously not the best place to put it).
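
As a rough illustration of what such a client node configuration looks like: in elasticsearch releases of that era the two relevant settings were, to my knowledge, node.master and node.data, both set to false. The small Python sketch below only renders those two lines so they can be pasted into elasticsearch.yml; the helper names are invented for the example, and newer releases may name these settings differently, so check the documentation for your version.

```python
def client_node_settings() -> dict:
    """Settings for a node that holds no data and cannot be elected master.

    Assumption: setting names as used by elasticsearch releases around 2013.
    """
    return {"node.master": False, "node.data": False}


def render_yaml(settings: dict) -> str:
    """Render flat key/value settings as simple YAML lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML booleans are lowercase
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_yaml(client_node_settings()), end="")
```

With such a node running next to the web application, the application would send its HTTP requests to the local node and leave routing to the rest of the cluster to it.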


--Alex



Re: Fault tolerant elasticsearch (JVM Heap OOM)

Mohammady Mahdy
@jorge What are the most common causes for OOM in ES?

@foufos I know that if you use the Java client and add all node addresses to the transport client, it will manage this for you. Otherwise, if you don't want a load balancer, you can just keep a list of servers to try your request against (I am assuming those would be the only options if you opt for using the REST API via HTTP).
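
The "list of servers to try" approach can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an official client: the function names and node URLs are invented, and any OSError (connection refused, timeout, but also an HTTP error status) makes the code move on to the next node.

```python
import urllib.request
from typing import Callable, Sequence


def http_get(url: str, timeout: float = 2.0) -> bytes:
    """Fetch `url` and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def get_with_failover(
    nodes: Sequence[str],
    path: str,
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try `path` on each node in order and return the first successful response."""
    last_error = None
    for base in nodes:
        try:
            return fetch(base.rstrip("/") + path)
        except OSError as exc:  # URLError, HTTPError and timeouts are OSErrors
            last_error = exc  # remember the failure and try the next node
    raise ConnectionError(f"all nodes failed for {path}") from last_error


if __name__ == "__main__":
    # Hypothetical node addresses; replace with your own.
    nodes = ["http://es1:9200", "http://es2:9200", "http://es3:9200"]
    print(get_with_failover(nodes, "/_cluster/health"))
```

Always starting with the first node means it takes most of the traffic while it is up; shuffling the list per request is a simple way to spread the load.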

On Thursday, April 4, 2013 10:33:33 AM UTC+4, Alexander Reelsen wrote:
Hey

Another solution might be to run a client node on your web application server. This is an elasticsearch node which does not hold any data and is not allowed to become master, but still knows the cluster's internal structure and which nodes can be queried (and a little bit more). There is a comment about that configuration in the default elasticsearch.yml configuration as well (which is obviously not the best place to put it).



Re: Fault tolerant elasticsearch (JVM Heap OOM)

joergprante@gmail.com
OOM happens when the heap size is not sufficient.

In ES you have to consider which workloads require heap space:

- for large segment merging. The bigger the index grows, the more heap
is required for segment merging
- for large documents and large bulks while indexing
- for large result sets
- and for field caching for filtering and faceting

Finding a reasonable heap size requires some testing under different
workloads. There is no general rule for a "correct" heap size.

You can tackle OOM by scaling out (adding more nodes), by scaling up
(adding more RAM per node), or by streamlining resource consumption during
the lifecycle of the ES process (smaller segments while merging, smaller
bulk indexing, smaller query results, avoiding "bad" queries with too
heavy resource consumption).
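
To make the "smaller bulk indexing" point concrete, here is a minimal Python sketch that splits a stream of documents into batches bounded by document count and by approximate payload size, so that no single bulk request grows without limit. The function name and the default limits (500 documents, 5 MB) are arbitrary assumptions for illustration, not recommended values; suitable limits have to be found by testing, as described above.

```python
import json
from typing import Iterable, Iterator, List


def chunk_bulk(
    docs: Iterable[dict],
    max_docs: int = 500,
    max_bytes: int = 5 * 1024 * 1024,
) -> Iterator[List[dict]]:
    """Yield lists of documents that stay within both limits.

    A single document larger than `max_bytes` is still yielded, alone.
    """
    batch: List[dict] = []
    size = 0
    for doc in docs:
        doc_bytes = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or size + doc_bytes > max_bytes):
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_bytes
    if batch:
        yield batch


if __name__ == "__main__":
    docs = ({"id": i, "title": f"doc {i}"} for i in range(1200))
    for batch in chunk_bulk(docs):
        print(len(batch))  # each batch would go into one bulk request
```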

Jörg

On 04.04.2013 11:37, Mo wrote:
> @jorge What are the most common causes for OOM in ES?


Re: Fault tolerant elasticsearch (JVM Heap OOM)

foufos
After investigation it turns out we had a lot of exceptions due to wrong mapping attributes.
After fixing this, we haven't experienced another similar issue.
Can this cause OOM?

@Jorg We have about 850K documents, fairly small in size, and we also have routing set up to reduce overhead.

So the problem is currently not being triggered, but we have to create a fallback in case another server becomes unresponsive and we get another split-brain scenario.

So now we are considering implementing a solution along the lines @alex suggested.
Do you think that by doing something like that you can avoid a split-brain scenario?
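
A note on the split-brain part of the question: in elasticsearch releases of that era the usual guard was, as far as I know, the discovery.zen.minimum_master_nodes setting, set to a majority of the master-eligible nodes. That matches the "2 nodes required" mentioned in the first message for a 3-node cluster. The Python sketch below only shows the arithmetic; whether this cluster actually uses that setting is an assumption, and the function name is made up for the example.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible nodes -> quorum of", minimum_master_nodes(n))
```

A client node that is not allowed to become master would not count toward this number, so adding one on the web server helps with request routing rather than with the quorum itself.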

thank you
foufos

On Thursday, 4 April 2013 15:28:03 UTC+3, Jörg Prante wrote:
OOM happens when the heap size is not sufficient.

In ES you have to consider which workloads require heap space:

- for large segment merging. The bigger the index grows, the more heap
is required for segment merging
- for large documents and large bulks while indexing
- for large result sets
- and for field caching for filtering and faceting

Finding a reasonable heap size requires some testing under different
workloads. There is no general rule for a "correct" heap size.

You can tackle OOM by scaling out (adding more nodes), by scaling up
(adding more RAM per node), or by streamlining resource consumption during
the lifecycle of the ES process (smaller segments while merging, smaller
bulk indexing, smaller query results, avoiding "bad" queries with too
heavy resource consumption).

Jörg
