ELS memory consumption

ELS memory consumption

mfeingold
How much memory should be available for the ELS process? I am running
ELS on a W2k8 server; the index holds around 100M documents and is
just under 50GB on disk. It looks like a heap size of 2GB is
sufficient, but mapped files take another 2.5GB, so the overall memory
allocated to the process is closer to 5GB.

My question is: how can I estimate the amount of memory needed for
mapped files based on the size of the indexes? Also, is there a way,
or a need, to control it?

Re: ELS memory consumption

kimchy
Administrator

On Windows, Lucene (and elasticsearch) defaults to memory-mapped files for better performance. You can disable that and use the simplefs index store type if you want. The mapped files will take the same amount of space as the actual index files end up occupying on disk. Regarding the actual heap, it's hard to answer: Lucene internally loads data into memory to be able to search faster (basically intervals of terms), and there is the field "cache", which is basically used for sorting (on something other than score) and for faceting; its size is exposed through the node stats and index stats APIs.
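
For example, the store type can be set per index when it is created. A minimal sketch, assuming a node listening on localhost:9200 and a hypothetical index name "myindex" (check the docs for your version, as store-type settings have changed over releases):

    curl -XPUT 'http://localhost:9200/myindex' -d '{
      "settings": {
        "index.store.type": "simplefs"
      }
    }'

Alternatively, setting index.store.type: simplefs in elasticsearch.yml would apply it to all indices created on that node.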
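To see how much memory the field cache is actually holding, you can poll the node stats API. A quick sketch, hedged because the endpoint has moved between versions (it was /_cluster/nodes/stats on the 0.18-era releases this thread dates from, and /_nodes/stats on later ones):

    curl 'http://localhost:9200/_cluster/nodes/stats?pretty=true'

The field cache and filter cache sizes show up under the "indices" section of the response, which gives a concrete number to watch while tuning the heap.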

Re: ELS memory consumption

mfeingold
Hmm... Are you saying that it will try to map the entire 50GB into mapped file(s)? During the index load I got away with a box with 8GB of RAM and no swap file. I've seen the mapped-file memory allocated to the process go all the way up to 5GB and then down to 2GB (on top of a 1.5GB heap) without any visible impact on document load speed, which stayed around 11 min per 1M documents.
The chart of memory allocated to the mapped files looks like a saw, as if some sort of GC is going on there.
Does this mean that adding more RAM can increase performance? I am not too concerned about load performance, since my data are pretty static, but search performance is important.
