Bulk operations

NevB
Hi,

Does ElasticSearch support bulk PUT requests?

By that I mean something like:

curl -XPUT http://localhost:9200/ -d \
'[{ "_index": "twitter", "_type": "user", "_id": "1", "_source": { "name": "Fred" }},
  { "_index": "twitter", "_type": "user", "_id": "2", "_source": { "name": "John" }}]'


This type of PUT is needed by AJAX apps which construct a list of
entries that are submitted as a single 'transaction', for example an
order document which is composed of multiple types.

Thanks for your thoughts,

Neville

Re: Bulk operations

kimchy
Administrator
ElasticSearch does not currently support bulk operations. There are several reasons to support bulk operations; let's analyze them:

1. Bulk operations might imply transactionality (or atomicity at the very least), meaning that either all operations succeed or all fail. I don't see this feature getting into ElasticSearch in the near future because of the complexity of implementing it in a distributed system. If the operations go to different shards, this might mean a two-phase commit process, and even if they all go to the same shard, it's not simple to implement on top of Lucene while still maintaining high throughput.

2. Bulk operations can be used to speed up processing, for example by sending a single message with 1000 operations instead of 1000 messages. This does make sense for ElasticSearch, and it is the reason bulking will be supported in the near future (with no atomicity across the operations, though, just a status for each one saying whether it failed or not). But you should know that ElasticSearch is highly optimized for concurrent usage and is built on a completely event-driven IO architecture. So, for now, if you have 1000 operations, simply fork 10 threads to do them, and you should get really good numbers (and I mean really good numbers :) ). Moreover, if you have a good async HTTP client library, you can simply send all the indexing requests from a single thread and register a listener for the results.

-shay.banon
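
Editor's sketch of the "fork 10 threads" suggestion in point 2: a minimal Python example, not code from the thread. It sends each document as its own PUT from a pool of 10 worker threads and returns one status per operation, with no atomicity across them. The function names (put_document, index_concurrently), the /index/type/id URL layout, and the 5-second timeout are assumptions made for illustration; the host and port come from the curl example in the question.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "http://localhost:9200"  # host and port from the question's curl example


def put_document(op):
    """Send one index request as an HTTP PUT and return the HTTP status code.

    Assumption: documents are addressed as /<index>/<type>/<id>.
    """
    url = "{}/{}/{}/{}".format(BASE_URL, op["_index"], op["_type"], op["_id"])
    body = json.dumps(op["_source"]).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="PUT")
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def index_concurrently(ops, send=put_document, workers=10):
    """Run every operation on a thread pool.

    Returns one (id, ok, detail) tuple per operation, in input order.
    There is no atomicity: a failed operation does not undo the others.
    """
    def run(op):
        try:
            return (op["_id"], True, send(op))
        except Exception as exc:  # record the failure, keep the batch going
            return (op["_id"], False, str(exc))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(run, ops))
```

Called with the two documents from the question, index_concurrently(ops) would return a list such as [("1", True, <status>), ("2", True, <status>)]. The send parameter exists so that the HTTP call can be swapped for a stub or for a different client library.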

(In reply to NevB's message of Wed, Feb 10, 2010 at 11:30 PM.)
