I am tuning my Elasticsearch cluster, which consists of 3 dedicated servers, each with:
2 octo-core processors (16 cores total), with Hyper-Threading (32 threads)
32 GB RAM (DDR3)
We are running the latest beta2. We have allocated 16 GB of heap to Elasticsearch.
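For reference, this is roughly how we set the heap (a sketch; we use the standard ES_HEAP_SIZE variable, but where it gets set depends on your init/service scripts):

```shell
# Sketch: 16 GB heap via the standard ES_HEAP_SIZE environment variable.
# The exact location of this line depends on how the nodes are started.
export ES_HEAP_SIZE=16g
```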
We have a single index with 32 shards and 1 replica each, making 64 shards in the cluster.
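For context, the index was created with settings along these lines (a sketch; the index name `ourindex` is a placeholder, not our real name):

```json
PUT /ourindex
{
  "settings": {
    "number_of_shards": 32,
    "number_of_replicas": 1
  }
}
```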
We see a lot of update queries. In fact, we have only update queries, all with upserts. We use routing to keep similar data in the same shard. The update rate currently ranges from 20/sec to 60/sec, and it will increase to about 130-140/sec when we go live. Our search queries are mostly filtered queries with heavy use of term facets.
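To make the workload concrete, our requests look roughly like this (field names, IDs, and routing values below are illustrative, not our real schema; `ourindex`/`ourtype` are placeholders):

```json
POST /ourindex/ourtype/1234/_update?routing=user42
{
  "doc":    { "counter": 5 },
  "upsert": { "counter": 1 }
}

POST /ourindex/_search?routing=user42
{
  "query": {
    "filtered": {
      "query":  { "match_all": {} },
      "filter": { "term": { "status": "active" } }
    }
  },
  "facets": {
    "by_status": { "terms": { "field": "status" } }
  }
}
```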
I have allocated 40% of the heap to index_buffer_size (this was a fairly arbitrary choice), which comes out to around 6.4 GB. Our average document size is 900 bytes.
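The setting in question, as it currently stands in our elasticsearch.yml:

```yaml
# Current (fairly arbitrary) choice: 40% of a 16 GB heap ≈ 6.4 GB
indices.memory.index_buffer_size: 40%
```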
I have a feeling that the allocated buffer is going to waste, since our indexing/updating/upserting rates are not that high. If my reasoning is correct, with 60 upserts/sec of documents averaging 1 KB, an index_buffer_size of 60 x 1 KB x 32 (shards) x 3 (keeping room for 3x traffic) = 5760 KB (approx. 5.6 MB) should be more than enough. By allocating 40% of the JVM heap instead, I am wasting heap that I could probably use for the filter cache and field data cache.
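Here is that back-of-envelope arithmetic spelled out (a sketch of my own reasoning, not an official Elasticsearch sizing formula):

```python
# Back-of-envelope sizing of the index buffer (my own reasoning,
# NOT an official Elasticsearch sizing formula).

upserts_per_sec = 60   # current peak update rate
doc_size_kb = 1        # average doc is ~900 bytes, rounded up to 1 KB
shards = 32            # primary shards on the index
headroom = 3           # room for ~3x traffic growth after go-live

needed_kb = upserts_per_sec * doc_size_kb * shards * headroom
print(needed_kb)        # 5760 KB, i.e. about 5.6 MB

# Compare with what 40% of a 16 GB heap actually allocates, in KB:
allocated_kb = 0.40 * 16 * 1024 * 1024
print(round(allocated_kb / needed_kb))  # the allocation exceeds the estimate ~1000-fold
```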
Now I have the following questions:
Are my calculations correct?
Is there something else that I need to consider?
If the index_buffer_size is underutilized, will Elasticsearch be able to use a portion of it for something else when required?
Looking forward to some valuable feedback. Will be happy to share more details if required.