ElasticSearch _bulk calls resulting in "socket hang up" despite small size

inZania
I'm using elastical to connect to ElasticSearch via node.js.

In the process of profiling my app with Nodetime to attempt to improve performance, I noticed something odd. My ElasticSearch "PUT" requests to _bulk index are frequently resulting in a "socket hang up". Furthermore, these calls are taking huge amounts of CPU time.

I'm capping each _bulk index request at 10 items, and as you can see, the content-length of the requests does not even reach 50 KB, so it is hard to imagine that size is the issue. Yet the response time is > 60 seconds and the CPU time is > 10 seconds! Yikes!!
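
For reference, the batching described above can be sketched roughly as follows. This is only an illustration of the newline-delimited body that the _bulk endpoint accepts, not the code the poster or elastical actually uses; the index name `cache`, the type name `item`, the document shape, and the helper names `buildBulkBody` and `chunk` are all assumptions.

```javascript
// Hypothetical helper: builds the newline-delimited body for a _bulk request.
// Index name, type name, and document shape are assumptions for illustration.
function buildBulkBody(indexName, typeName, docs) {
  const lines = [];
  for (const doc of docs) {
    // Action line: says what to do with the document on the next line.
    lines.push(JSON.stringify({ index: { _index: indexName, _type: typeName, _id: doc.id } }));
    // Source line: the document itself.
    lines.push(JSON.stringify(doc));
  }
  // A bulk body has to end with a newline.
  return lines.join('\n') + '\n';
}

// Split a list into batches of at most `size` items (10 in the post above).
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Made-up sample data: 25 small documents.
const docs = Array.from({ length: 25 }, (_, i) => ({ id: String(i), title: 'doc ' + i }));
const batches = chunk(docs, 10);
const bodies = batches.map((batch) => buildBulkBody('cache', 'item', batch));

bodies.forEach((body, i) => {
  // Buffer.byteLength is the number that would be sent as Content-Length.
  console.log('batch %d: %d docs, %d bytes', i, batches[i].length, Buffer.byteLength(body, 'utf8'));
});
```

Logging the byte length per batch makes it easy to confirm that the payloads really are small before looking for the cause elsewhere.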

<http://elasticsearch-users.115913.n3.nabble.com/file/n4030677/EoIWe.png>

In attempts to debug, I started running ElasticSearch in the foreground. I noticed this strange error:

    [2013-02-27 11:42:39,188][WARN ][index.gateway.s3         ] [Lady Mandarin] [network][1] failed to read commit point [commit-f34]
    java.io.IOException: Failed to get [commit-f34]
    	at org.elasticsearch.common.blobstore.support.AbstractBlobContainer.readBlobFully(AbstractBlobContainer.java:83)
    	at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.buildCommitPoints(BlobStoreIndexShardGateway.java:847)
    	at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.doSnapshot(BlobStoreIndexShardGateway.java:188)
    	at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.snapshot(BlobStoreIndexShardGateway.java:160)
    	at org.elasticsearch.index.gateway.IndexShardGatewayService$2.snapshot(IndexShardGatewayService.java:271)
    	at org.elasticsearch.index.gateway.IndexShardGatewayService$2.snapshot(IndexShardGatewayService.java:265)
    	at org.elasticsearch.index.engine.robin.RobinEngine.snapshot(RobinEngine.java:1090)
    	at org.elasticsearch.index.shard.service.InternalIndexShard.snapshot(InternalIndexShard.java:496)
    	at org.elasticsearch.index.gateway.IndexShardGatewayService.snapshot(IndexShardGatewayService.java:265)
    	at org.elasticsearch.index.gateway.IndexShardGatewayService$SnapshotRunnable.run(IndexShardGatewayService.java:366)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    	at java.lang.Thread.run(Thread.java:619)

    Caused by: Status Code: 404, AWS Service: Amazon S3, AWS Request ID: ..., AWS Error Code: NoSuchKey, AWS Error Message: The specified key does not exist., S3 Extended Request ID: ....
    	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:548)
    	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:288)
    	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:170)
    	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:2632)
    	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:811)
    	at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:717)
    	at org.elasticsearch.cloud.aws.blobstore.AbstractS3BlobContainer$1.run(AbstractS3BlobContainer.java:73)


I'm aware that I'm using a deprecated gateway (the S3 bucket gateway). However, given that I have multiple servers running on the Amazon Cloud which need to share data (I use ElasticSearch for caching), I don't see any alternative until the ElasticSearch team releases a replacement for the S3 Bucket Gateway...

Other than this problem with the _bulk calls, I'm not seeing any problems. Searches etc. all return quickly and effectively.

Re: ElasticSearch _bulk calls resulting in "socket hang up" despite small size

kimchy
Administrator
Which error are you seeing? You don't need the S3 gateway to run a cluster of nodes on AWS. The S3 gateway simply copies the data to S3 continuously, while the local gateway knows how to restore its state from each instance's local storage (EBS or ephemeral).
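
The reply does not spell out the mechanism, but index replicas are one way ElasticSearch nodes hold copies of the same data without a shared store. A minimal sketch of an index-creation settings body follows; the index name `cache`, the shard and replica counts, and the helper names are assumptions, not values from this thread.

```javascript
// Hypothetical helper: settings body for creating an index with replicas.
// With at least one replica, each shard has a copy on another node.
function indexSettings(shards, replicas) {
  return {
    settings: {
      index: {
        number_of_shards: shards,
        number_of_replicas: replicas,
      },
    },
  };
}

// Total shard copies in the cluster: each primary plus its replicas.
function totalShardCopies(shards, replicas) {
  return shards * (1 + replicas);
}

// Example values only: 5 primaries with 1 replica each.
const body = JSON.stringify(indexSettings(5, 1));

// This body would be sent when creating the index, e.g. PUT /cache.
console.log('PUT /cache', body);
console.log('shard copies:', totalShardCopies(5, 1));
```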

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Re: ElasticSearch _bulk calls resulting in "socket hang up" despite small size

joergprante@gmail.com
In reply to this post by inZania
It looks like you are running into internal shard bulk response timeouts,
which are 60 seconds by default. There should be something in the logs.
Check whether you can successfully index documents to all the cluster nodes.
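
A rough way to run that check from Node.js is to index one throwaway document through each node and record how long it takes. This is a sketch using only the built-in `http` module; the node addresses, the `probe` index, the `doc` type, and the function names are hypothetical, and running it would create that index on the cluster.

```javascript
const http = require('http');

// Hypothetical node addresses; replace with the real cluster members.
const nodes = [
  { host: '10.0.0.1', port: 9200 },
  { host: '10.0.0.2', port: 9200 },
];

// Request options for indexing one throwaway document through a given node.
function probeOptions(node) {
  return {
    host: node.host,
    port: node.port,
    method: 'PUT',
    path: '/probe/doc/' + encodeURIComponent(node.host),
    headers: { 'Content-Type': 'application/json' },
  };
}

// Index a test document via one node; report status, error, and elapsed time.
function probeNode(node, timeoutMs, done) {
  const started = Date.now();
  let finished = false;
  const finish = (err, status) => {
    if (finished) return; // report each probe only once
    finished = true;
    done(err, { node: node, status: status, ms: Date.now() - started });
  };
  const req = http.request(probeOptions(node), (res) => {
    res.resume(); // drain the response body, it is not needed here
    res.on('end', () => finish(null, res.statusCode));
  });
  // Give up client-side well before any 60 second server-side limit.
  req.setTimeout(timeoutMs, () => {
    req.destroy(new Error('timed out after ' + timeoutMs + ' ms'));
  });
  req.on('error', (err) => finish(err));
  req.end(JSON.stringify({ probedAt: new Date().toISOString() }));
}

function probeAll(nodeList, timeoutMs) {
  nodeList.forEach((node) => {
    probeNode(node, timeoutMs, (err, result) => {
      const outcome = err ? 'FAILED: ' + err.message : 'HTTP ' + result.status;
      console.log('%s:%d %s (%d ms)', node.host, node.port, outcome, result.ms);
    });
  });
}

// Uncomment to run against a live cluster:
// probeAll(nodes, 5000);
```

A node that fails or takes far longer than the others is the one whose logs are worth reading first.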

Best regards,

Jörg

On 27.02.13 21:27, inZania wrote:
> Yet, the response time is > 60
> seconds and the CPU time is >10+ seconds!

Re: ElasticSearch _bulk calls resulting in "socket hang up" despite small size

inZania
In reply to this post by kimchy
kimchy,

Data persistence across the instances is very important to me. E.g., I might have one server that indexes a new document, and 30 seconds later I'll need to search for that document from the other server... and then one of the servers might go down (it is an auto-scaling array in the cloud, so servers go up and down all the time), and I still need the data to be available. I thought this wasn't possible without S3 as a shared data store -- am I wrong? If so, where can I find a tutorial on how to do this?