Adding millions of documents, performance decay.

Fabio Pezzoni
Hi,

I'm new to ElasticSearch and I'm trying to transfer a database of several million JSON documents to a Lucene index through ES. Currently we can use just a single node with 8 CPUs, and we use the Java API to add each document sequentially. We didn't change the default options, so our index has 5 shards. At the beginning the process was very fast! In a few hours we added about 50 million documents, then performance gradually fell, and currently it can take seconds to add a single document. Query performance is still very good!
Is there a way to overcome this situation? Maybe by changing the settings or the number of shards...

Thank you very much, Fabio.

Here are some node stats:

<raw>{"ok":true,"cluster_name":"twitter","nodes":{"j5RTlp7jTreoAlE6Tb3gwA":{"name":"xxx","transport_address":"inet[/xxx.xxx.xxx.xxx:9300]","hostname":"xxx","http_address":"inet[/xxx.xxx.xxx.xxx:9200]","settings":{"path.home":"/home/twitter/elasticsearch-0.19.11","foreground":"yes","logger.prefix":"","max-open-files":"true","node.name":"xxx","cluster.name":"twitter","name":"xxx","path.logs":"/home/twitter/elasticsearch-0.19.11/logs"},"os":{"refresh_interval":1000,"cpu":{"vendor":"Intel","model":"Xeon","mhz":3192,"total_cores":8,"total_sockets":8,"cores_per_socket":16,"cache_size":"8kb","cache_size_in_bytes":8192},"mem":{"total":"15.4gb","total_in_bytes":16543477760},"swap":{"total":"3.9gb","total_in_bytes":4294963200}},"process":{"refresh_interval":1000,"id":20301,"max_file_descriptors":65535},"jvm":{"pid":20301,"version":"1.6.0_34","vm_name":"Java HotSpot(TM) 64-Bit Server VM","vm_version":"20.9-b04","vm_vendor":"Sun Microsystems Inc.","start_time":1354178513836,"mem":{"heap_init":"256mb","heap_init_in_bytes":268435456,"heap_max":"1011.2mb","heap_max_in_bytes":1060372480,"non_heap_init":"23.1mb","non_heap_init_in_bytes":24313856,"non_heap_max":"130mb","non_heap_max_in_bytes":136314880,"direct_max":"1011.2mb","direct_max_in_bytes":1060372480}}}}}</raw>

Index stats:

<raw>{"ok":true,"_shards":{"total":10,"successful":5,"failed":0},"_all":{"primaries":{"docs":{"count":54298508,"deleted":24537},"store":{"size":"50.6gb","size_in_bytes":54435026631,"throttle_time":"0s","throttle_time_in_millis":0},"indexing":{"index_total":63351487,"index_time":"2d","index_time_in_millis":173455844,"index_current":0,"delete_total":0,"delete_time":"0s","delete_time_in_millis":0,"delete_current":0},"get":{"total":0,"time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"query_total":185,"query_time":"1.2m","query_time_in_millis":73521,"query_current":0,"fetch_total":71,"fetch_time":"4.6m","fetch_time_in_millis":279532,"fetch_current":0},"merges":{"current":6,"current_docs":60,"current_size":"462.5kb","current_size_in_bytes":473615,"total":761368,"total_time":"3.1d","total_time_in_millis":276398054,"total_docs":224483235,"total_size":"251.7gb","total_size_in_bytes":270307183589},"refresh":{"total":177694,"total_time":"1.3d","total_time_in_millis":116764453},"flush":{"total":7797,"total_time":"1.2d","total_time_in_millis":111453937}},"total":{"docs":{"count":54298508,"deleted":24537},"store":{"size":"50.6gb","size_in_bytes":54435026631,"throttle_time":"0s","throttle_time_in_millis":0},"indexing":{"index_total":63351487,"index_time":"2d","index_time_in_millis":173455844,"index_current":0,"delete_total":0,"delete_time":"0s","delete_time_in_millis":0,"delete_current":0},"get":{"total":0,"time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"query_total":185,"query_time":"1.2m","query_time_in_millis":73521,"query_current":0,"fetch_total":71,"fetch_time":"4.6m","fetch_time_in_millis":279532,"fetch_current":0},"merges":{"current":6,"current_docs":60,"current_size":"462.5kb","current_size_in_bytes":473615,"total
":761368,"total_time":"3.1d","total_time_in_millis":276398054,"total_docs":224483235,"total_size":"251.7gb","total_size_in_bytes":270307183589},"refresh":{"total":177694,"total_time":"1.3d","total_time_in_millis":116764453},"flush":{"total":7797,"total_time":"1.2d","total_time_in_millis":111453937}},"indices":{"twitter":{"primaries":{"docs":{"count":54298508,"deleted":24537},"store":{"size":"50.6gb","size_in_bytes":54435026631,"throttle_time":"0s","throttle_time_in_millis":0},"indexing":{"index_total":63351487,"index_time":"2d","index_time_in_millis":173455844,"index_current":0,"delete_total":0,"delete_time":"0s","delete_time_in_millis":0,"delete_current":0},"get":{"total":0,"time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"query_total":185,"query_time":"1.2m","query_time_in_millis":73521,"query_current":0,"fetch_total":71,"fetch_time":"4.6m","fetch_time_in_millis":279532,"fetch_current":0},"merges":{"current":6,"current_docs":60,"current_size":"462.5kb","current_size_in_bytes":473615,"total":761368,"total_time":"3.1d","total_time_in_millis":276398054,"total_docs":224483235,"total_size":"251.7gb","total_size_in_bytes":270307183589},"refresh":{"total":177694,"total_time":"1.3d","total_time_in_millis":116764453},"flush":{"total":7797,"total_time":"1.2d","total_time_in_millis":111453937}},"total":{"docs":{"count":54298508,"deleted":24537},"store":{"size":"50.6gb","size_in_bytes":54435026631,"throttle_time":"0s","throttle_time_in_millis":0},"indexing":{"index_total":63351487,"index_time":"2d","index_time_in_millis":173455844,"index_current":0,"delete_total":0,"delete_time":"0s","delete_time_in_millis":0,"delete_current":0},"get":{"total":0,"time":"0s","time_in_millis":0,"exists_total":0,"exists_time":"0s","exists_time_in_millis":0,"missing_total":0,"missing_time":"0s","missing_time_in_millis":0,"current":0},"search":{"query_total":185,"quer
y_time":"1.2m","query_time_in_millis":73521,"query_current":0,"fetch_total":71,"fetch_time":"4.6m","fetch_time_in_millis":279532,"fetch_current":0},"merges":{"current":6,"current_docs":60,"current_size":"462.5kb","current_size_in_bytes":473615,"total":761368,"total_time":"3.1d","total_time_in_millis":276398054,"total_docs":224483235,"total_size":"251.7gb","total_size_in_bytes":270307183589},"refresh":{"total":177694,"total_time":"1.3d","total_time_in_millis":116764453},"flush":{"total":7797,"total_time":"1.2d","total_time_in_millis":111453937}}}}}}</raw>

Re: Adding millions of documents, performance decay.

Igor Motov
If you haven't done this yet, set refresh interval to -1 by running:

curl -XPUT localhost:9200/twitter/_settings -d '{
    "index" : {
        "refresh_interval" : "-1"
    }
}'

When you are done with bulk reindexing, you can turn it back on by running:

curl -XPUT localhost:9200/twitter/_settings -d '{
    "index" : {
        "refresh_interval" : "1s"
    }
}'
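
As a rough check on why refreshes matter here, the index stats in the first post report 177694 refreshes taking 116764453 ms in total. A small shell calculation of the average, using only those two numbers (the variable names are just for illustration):

```shell
# Numbers copied from the "refresh" section of the index stats in the first post.
refresh_total=177694
refresh_time_ms=116764453

# Integer division: average milliseconds spent per refresh over the whole run.
avg_refresh_ms=$((refresh_time_ms / refresh_total))
echo "average refresh: ${avg_refresh_ms} ms"
```

This is an average over the whole run, not a measurement of any single refresh, so it is only a hint that refresh work adds up during a long bulk load.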


Re: Adding millions of documents, performance decay.

Michael Sick
Also, have you tried grouping your documents into bulk requests?
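
For reference, a minimal sketch of a bulk request over the REST API, in the same curl style used earlier in the thread. The index name "twitter" comes from the thread; the type "tweet", the ids, the document fields, the helper name make_bulk_body and the file name bulk_body.json are all made up for illustration. The Java API has its own bulk calls, which are not shown here.

```shell
#!/bin/sh
# Build a bulk request body: one action line followed by one source line
# per document, each terminated by a newline.
# "tweet", the ids and the document fields are hypothetical.
make_bulk_body() {
    id=1
    for doc in "$@"; do
        printf '{"index":{"_index":"twitter","_type":"tweet","_id":"%s"}}\n' "$id"
        printf '%s\n' "$doc"
        id=$((id + 1))
    done
}

make_bulk_body \
    '{"user":"alice","message":"first"}' \
    '{"user":"bob","message":"second"}' > bulk_body.json

# Send the whole batch in one request to a running node (not executed here):
# curl -XPOST localhost:9200/_bulk --data-binary @bulk_body.json
```

Using --data-binary rather than -d matters because the bulk format depends on the newlines being preserved. A good batch size depends on document size and hardware, so it is worth trying a few values.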


Re: Adding millions of documents, performance decay.

joergprante@gmail.com
In reply to this post by Fabio Pezzoni

You are experiencing massive GC because you are using the default out-of-the-box maximum JVM heap setting of 1 GB. You are lucky you could even add 50 million docs with such a small heap! But you have 16 GB of RAM. As a rule of thumb, assign around 50% of RAM (4-8 GB) to Elasticsearch's heap. Check bin/elasticsearch.in.sh for ES_MAX_MEM or, better, ES_HEAP_SIZE. Other advanced tuning is also available, but check bulk indexing first. Happy indexing!

Best regards,

Jörg
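
A minimal sketch of the heap change described above, assuming the node is started from the same shell so that bin/elasticsearch.in.sh picks up the environment variable. The value 8g is about half of the 15.4 GB reported in the node stats; the start command in the comment is an assumption about how the node is launched.

```shell
# Ask for an 8 GB JVM heap before starting the node.
# ES_HEAP_SIZE is the variable named in the reply above; 8g is roughly
# 50% of the machine's RAM as reported in the node stats.
export ES_HEAP_SIZE=8g

# Then start the node as usual from the Elasticsearch home directory,
# for example (assumed, not taken from the thread): bin/elasticsearch -f
```

After a restart, heap_max in the node stats should report a value close to the new setting instead of 1011.2mb.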


Re: Adding millions of documents, performance decay.

Fabio Pezzoni
Thank you very much for the advice! I was already using the Java API's
bulk requests. Now, with 8g of heap and refresh_interval=-1, it works far
better. It's still slower than at the beginning, but maybe that's normal
for a single-node cluster (it indexes in bursts). I hope to have more
nodes and power soon!

Fabio

Re: Adding millions of documents, performance decay.

Hadar Rottenberg
Hey Fabio,
Would it be possible for you to post some benchmarks and statistics about your data set? For example:
How big is the dataset?
What is the average document size?
How long did it take to index 50M documents?
Any querying benchmarks, such as queries per second?

Thanks
