One primary shard is "lost" permanently when updating data


Jingzhao Ou
Hi all,

I have an urgent case and would appreciate any help. I have an index with 5 shards and no replicas, which has been running fine on a single AWS EC2 c3.large box for months. I now need to update around 1 million entries in this index. I use the bulk API, applying the updates in batches of 50K. After about 300K updates, my bulk operations started to time out and fail. I then checked the index status and got:

  "_shards" : {
    "total" : 5,
    "successful" : 4,
    "failed" : 0
  },
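For reference, the batched bulk updates described above look roughly like the sketch below. It builds the newline-delimited bodies that Elasticsearch's `_bulk` endpoint expects, split into batches; the index name, `_type`, and document shape are illustrative assumptions, not taken from the original post (ES 1.x bulk actions require a `_type` field).

```python
import json

def bulk_update_bodies(docs, index, doc_type="doc", batch_size=50000):
    """Yield one newline-delimited _bulk request body per batch.

    `docs` is a list of (doc_id, partial_document) pairs; each partial
    document is merged into the existing document via an update action.
    All names here are illustrative, not from the original post.
    """
    for start in range(0, len(docs), batch_size):
        lines = []
        for doc_id, partial in docs[start:start + batch_size]:
            # Action metadata line, then the partial document to merge in.
            lines.append(json.dumps(
                {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}))
            lines.append(json.dumps({"doc": partial}))
        # _bulk bodies must end with a trailing newline.
        yield "\n".join(lines) + "\n"
```

Each yielded body would then be POSTed to `/_bulk`; pausing between batches (as tried below) just means sleeping between POSTs.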

One shard is lost permanently. I can still query for data, but any indexing operation afterwards times out every few tries. The only way to fix this is to wipe out the whole index and restore it from snapshots. I tested Elasticsearch 1.3.5 and 1.4.1; both show this symptom. I tried pausing for 10 seconds between bulk updates and setting refresh_interval to -1. Neither helped.
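For anyone reproducing this, the shard check and the refresh tweak mentioned above can be done with curl along these lines (index name `myindex` and the default localhost:9200 endpoint are assumptions):

```shell
# Show per-shard state for the index to see which shard is unassigned.
curl -s 'http://localhost:9200/_cluster/health/myindex?level=shards&pretty'

# Disable periodic refresh during the bulk load (the setting is
# index.refresh_interval).
curl -XPUT 'http://localhost:9200/myindex/_settings' \
  -d '{"index": {"refresh_interval": "-1"}}'

# Restore the default refresh interval once the bulk update finishes.
curl -XPUT 'http://localhost:9200/myindex/_settings' \
  -d '{"index": {"refresh_interval": "1s"}}'
```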

Strangely, I ran the same operation on my Windows 8 machine and it worked just fine there. I am not sure why it fails so badly on AWS. My data is stored on a 100 GB EBS volume. Can anyone give me some help? I am really worried about data loss at this point.

Thanks a lot!
Jingzhao

To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/4b0b35d3-6dfb-41fd-a887-80cddd01cb7b%40googlegroups.com.