Reindex into another Elasticsearch


Reindex into another Elasticsearch

Frederico Ferreira
This is my first e-mail to the list, so I apologize if this problem has already been explained somewhere; I couldn't find it.
I'm out of ideas. This is my question:
I had an Elasticsearch cluster up and running with 5 shards and 1 replica per index, 1 master node (data: false) and 10 data nodes, with one index per day (created by Logstash). We then changed to hourly indices. Two weeks later, after a necessary maintenance reboot, Elasticsearch wasn't able to start properly: it began assigning the unassigned shards and we got a lot of timeouts.

After 5 days of trying to recover, we decided to rebuild the cluster from scratch, without any of the old indices, as 1 master node (data: false) and 10 data nodes, with 1 shard and 2 replicas per index.
My task now is to reindex those lost indices. This is my problem:
I have 10 backup files (up to 400 GB each) and I'm looking for a way to reindex those indices, little by little.

  • Should I copy those index folders into the new cluster's data folder?
    • I don't need to change to a daily shard; I just need Elasticsearch to assign those indices.
  • Is there any way to tell replica folders apart from primary shard folders?

We're using Elasticsearch 1.4.4, and each Elasticsearch node runs on a dedicated machine with 8 cores and 16 GB of RAM.
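
A rough back-of-the-envelope count shows how the switch to hourly indices changes the number of shard copies the cluster has to assign after a restart. The sketch below uses only the figures given in this post (5 shards, 1 replica, two weeks). The function name `shard_copies` is made up for illustration, and the last case assumes, hypothetically, that hourly indices are kept with the new 1 shard / 2 replica layout.

```python
def shard_copies(indices_per_day, primaries, replicas, days):
    """Total shard copies (primaries plus replicas) created over `days`."""
    return indices_per_day * primaries * (1 + replicas) * days


# Figures from the post; 14 days stands for the "2 weeks" mentioned.
daily_5x1 = shard_copies(indices_per_day=1, primaries=5, replicas=1, days=14)
hourly_5x1 = shard_copies(indices_per_day=24, primaries=5, replicas=1, days=14)

# Hypothetical: hourly indices kept, but with 1 shard and 2 replicas each.
hourly_1x2 = shard_copies(indices_per_day=24, primaries=1, replicas=2, days=14)

print("daily, 5 shards, 1 replica:  ", daily_5x1)
print("hourly, 5 shards, 1 replica: ", hourly_5x1)
print("hourly, 1 shard, 2 replicas: ", hourly_1x2)
```

Under these assumptions the hourly layout produces 24 times as many shard copies as the daily one over the same period.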

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAM0Xh3hG7BfiTwDgc0cCseTg4dVNFvav6LWvOmHS_-0Q3Ey0Tw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Reindex into another Elasticsearch

Mark Walkom-2
One shard per index doesn't make a lot of sense unless you have very small amounts of data. You'd be better off going back to the default, as you are solving the wrong problem there.

What are these backup files you mention, and how did you get them out of ES?

On 27 April 2015 at 21:50, Frederico Ferreira <[hidden email]> wrote:

Re: Reindex into another Elasticsearch

Frederico Ferreira
I'm sorry for the long delay in answering.
Every index is a folder inside the data folder. I simply compressed those folders and sent them to S3.
But now we have found an "answer":
  • We built another ES cluster (in another DC) and put those folders inside its data directory.
  • This is the part I didn't take part in:
    • We had a Logstash instance querying that ES cluster and outputting to our ES cluster.
  • That's the answer to what we were looking for.
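
The post does not show the Logstash configuration, so the following is only a sketch of the idea it describes: read documents back from the temporary cluster and write them to the target cluster. It turns hits, in the shape returned by a search or scroll request, into a newline-delimited body for the `_bulk` API. The function name `hits_to_bulk` and the sample index name and document are assumptions made for illustration; actually sending the requests is left out.

```python
import json


def hits_to_bulk(hits, target_index=None):
    """Turn search/scroll hits into a newline-delimited _bulk request body.

    Each hit becomes two lines: an "index" action line and the document source.
    If target_index is None, the hit's original index name is reused.
    """
    lines = []
    for hit in hits:
        action = {
            "index": {
                "_index": target_index or hit["_index"],
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(hit["_source"], sort_keys=True))
    if not lines:
        return ""
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical hit, shaped like one element of response["hits"]["hits"].
sample_hits = [
    {
        "_index": "logstash-2015.04.27.10",
        "_type": "logs",
        "_id": "1",
        "_source": {"message": "example log line"},
    }
]

bulk_body = hits_to_bulk(sample_hits)
print(bulk_body, end="")
```

Copying in batches like this also fits the "little by little" requirement, since each scroll page can be written to the target before the next one is fetched.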


Att
Frederico Ferreira
(21) 98714-1445

2015-04-27 18:28 GMT-03:00 Mark Walkom <[hidden email]>:
--
Please update your bookmarks! We have moved to https://discuss.elastic.co/

Re: Reindex into another Elasticsearch

Mark Walkom-2
You're better off using the snapshot and restore functionality than using your method.

I'm not sure what you are trying to do, though.
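
As a sketch of what the snapshot and restore route involves, the snippet below builds the three REST requests in order: register a repository, take a snapshot, and restore it. It only constructs and prints the requests. The function name, repository name, location, snapshot name, and index name are hypothetical, and a shared-filesystem (`fs`) repository is assumed rather than the S3 storage mentioned in this thread.

```python
import json


def snapshot_requests(repo, location, snapshot, indices):
    """Return (method, path, body) tuples for one snapshot-and-restore cycle."""
    index_list = ",".join(indices)
    register = (
        "PUT",
        "/_snapshot/%s" % repo,
        {"type": "fs", "settings": {"location": location}},
    )
    create = (
        "PUT",
        "/_snapshot/%s/%s?wait_for_completion=true" % (repo, snapshot),
        {"indices": index_list},
    )
    restore = (
        "POST",
        "/_snapshot/%s/%s/_restore" % (repo, snapshot),
        {"indices": index_list},
    )
    return [register, create, restore]


# Hypothetical names, for illustration only.
requests_to_send = snapshot_requests(
    repo="my_backup",
    location="/mnt/es_backups",
    snapshot="snapshot_1",
    indices=["logstash-2015.04.27"],
)

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, sort_keys=True))
```

The restore request would be sent to the new cluster, after the same repository has been registered there.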

On 15 May 2015 at 09:49, Frederico Ferreira <[hidden email]> wrote:

Re: Reindex into another Elasticsearch

bitsofinfo.g
In reply to this post by Frederico Ferreira
Also try this tool, which makes it easier to aggregate FS repository snapshots across a cluster so they can be restored on a different cluster. I had to write it for a similar scenario, so it might help in your situation too: https://github.com/bitsofinfo/elasticsearch-snapshot-manager

On Thursday, May 14, 2015 at 5:50:33 PM UTC-6, Frederico Barnard wrote: