Bulk insert vs Single insert

mike.giardinelli

Hi All,

The primary dev managing our ES cluster has stated that single-document writes to ES will only give us roughly 30 to 40 writes per second, whereas bulk operations will give us 1,000+. I realize that bulk is generally faster and that any process has hardware / environment constraints. However, with other technologies you do not pay such a heavy price for single insertions. I am obviously ignorant when it comes to ES, but why do you pay such a heavy price for single-document writes in ES? Or are we just not properly informed?

Environment:

  • Apache Storm writes to our ES cluster
  • Currently all of the writes are processed in bulk operations.

ES Configuration:

  • 11 data nodes
    • 2x AMD Opteron(TM) Processor 6272 (16 cores @ 2.1/3.0 GHz, 16 MB L3 cache)
    • 256 GB RAM
    • 12 TB (7200 RPM platter disks in LVM ext4 configuration)
  • ES configuration
    • two instances per node (16 cores per instance)
    • 30 GB RAM lock-in per instance (max recommended by ES)
    • 18 shards per index (empirically best combo of RAM vs. shard trade-off)

Any information / suggestions would be greatly appreciated.


Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8ad4c98d-34ca-4205-b763-88e1392cf57c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Bulk insert vs Single insert

Mark Walkom
From what I understand (which may not be 100% right), most of the overhead comes from generating and handling a separate HTTP request for every document, which is a heavy operation.
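
To make that concrete, here is a minimal sketch of how several documents can be packed into the body of one `_bulk` request instead of being sent one request each. It only builds the newline-delimited JSON payload and does not talk to a cluster. The index name `events`, the sample documents, and the helper name `build_bulk_body` are made up for illustration. Older ES versions also expect a type, either in the URL or as `_type` in the action line, so treat the action line here as an assumption to adjust for your version.

```python
import json


def build_bulk_body(index_name, docs):
    """Serialize docs into the newline-delimited JSON body used by the _bulk endpoint.

    index_name and docs are hypothetical inputs for this sketch.
    """
    lines = []
    for doc in docs:
        # Action line: says the following line should be indexed into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


# Three documents become one request body instead of three separate requests.
docs = [{"user": "a", "value": i} for i in range(3)]
body = build_bulk_body("events", docs)
print(body)
```

The resulting string would be sent in a single POST to the `_bulk` endpoint, so the per-request cost is paid once for the whole batch rather than once per document.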

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com

On 11 October 2014 03:35, <[hidden email]> wrote:

Hi All,

The primary dev managing our ES cluster has stated that single-document writes to ES will only give us roughly 30 to 40 writes per second, whereas bulk operations will give us 1,000+. I realize that bulk is generally faster and that any process has hardware / environment constraints. However, with other technologies you do not pay such a heavy price for single insertions. I am obviously ignorant when it comes to ES, but why do you pay such a heavy price for single-document writes in ES? Or are we just not properly informed?

Environment:

  • Apache Storm writes to our ES cluster
  • Currently all of the writes are processed in bulk operations.

ES Configuration:

  • 11 data nodes
    • 2x AMD Opteron(TM) Processor 6272 (16 cores @ 2.1/3.0 GHz, 16 MB L3 cache)
    • 256 GB RAM
    • 12 TB (7200 RPM platter disks in LVM ext4 configuration)
  • ES configuration
    • two instances per node (16 cores per instance)
    • 30 GB RAM lock-in per instance (max recommended by ES)
    • 18 shards per index (empirically best combo of RAM vs. shard trade-off)

Any information / suggestions would be greatly appreciated.


Thanks!
