Moving from fs gateway type to cluster using S3/Cloudfiles

darron
We currently have a single Elasticsearch box running 0.17.6 - it's using the fs gateway type.

We're really liking the tool and it's working great for our uses - now I am stepping in to deploy a cluster of boxes using 0.18.4. My goal is to:

1. Have a cluster of nodes running for speed and availability.
2. Use the S3/Cloudfiles gateway to keep the gateway data persistent and more durable across VPS failures.

I have been reading the mailing list/guides/tutorials/etc. and have come up with this plan, but I have a couple of questions.

First the proposed plan:

1. Flush and shut down the currently running 0.17.6 node.
2. Update the binaries to 0.18.4.
3. Start the new binaries and make sure they work.
4. Add another node to the cluster.
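
As a rough illustration of step 1, the flush could be issued over the REST API before the node is stopped. This is only a sketch: it assumes the node answers HTTP on `localhost:9200` and that `POST /_flush` flushes all indices; the helper names `build_flush_request` and `flush` are made up for this example.

```python
import urllib.request


def build_flush_request(base_url="http://localhost:9200"):
    """Build (but do not send) a POST request to the flush endpoint.

    base_url is an assumption; point it at whatever host/port the node uses.
    """
    url = base_url.rstrip("/") + "/_flush"
    # An empty body plus method="POST" makes urllib issue a POST.
    return urllib.request.Request(url, data=b"", method="POST")


def flush(base_url="http://localhost:9200", timeout=30):
    """Send the flush request and return the raw response body as text."""
    request = build_flush_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `flush()` against the running node and checking the response before shutting it down would cover the first step; the upgrade and restart themselves are outside what this sketch shows.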

Here are my questions:

1. Because the current box uses the fs gateway, the other nodes will not be able to recover from failure properly unless they all have access to that directory of files. We should have used the local gateway for that single box, right?

2. If I add the new nodes with the S3/Cloudfiles gateway, will they replicate all of the indexes properly to S3/Cloudfiles? Or do they only add the items when they're indexed? I have read a ton about people trying to do that, and it looks like they need to re-index so that the offsite gateways get populated - is that correct?

The clustering seems to be pretty magical as far as discovery goes - this is the config that worked great with Rackspace Cloud boxes: https://gist.github.com/1390228

3. Do the boxes have to be on the same subnet to find each other?

I couldn't find any details about that and it all worked flawlessly, but I'm wondering how I'll add boxes in the future when they run out of IP addresses that are "nearby", or whether that even matters.

Sorry if I've missed something, I have read everything I can find and I'm just trying to make sure on my last few questions. I found a lot of information that appeared to apply to old releases and just wanted to clarify.

We're really loving the product and I spent much of this weekend adding and removing nodes to my cluster, dropping and adding all sorts of indexes and watching it rebalance - very nice work so far.

Re: Moving from fs gateway type to cluster using S3/Cloudfiles

kimchy
Administrator
To be honest, I am lost. Here are some points:

1. You say you use Rackspace, and it works, but the config points to an AWS configuration?
2. There is no Rackspace Cloud Files support to act as a gateway, only S3.
3. In any case, regarding 2, you should start with the local gateway; it's perfectly fine to use with one node or with many. I don't understand why you used the fs gateway in your one-node scenario.
4. Changing the gateway implementation requires reindexing.
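
For point 4, one possible shape of a reindex is to read documents out of the old node and replay them into the new cluster through the bulk API. The sketch below covers only the formatting step and makes several assumptions: the documents arrive as search-style hits with `_type`, `_id`, and `_source` fields, the bulk body is newline-delimited JSON with an `index` action line before each document, and `to_bulk_body` is a hypothetical helper name.

```python
import json


def to_bulk_body(hits, target_index):
    """Turn search-style hits into a newline-delimited bulk request body.

    Each hit is assumed to be a dict with "_type", "_id" and "_source" keys.
    """
    lines = []
    for hit in hits:
        # Action line: index this document under its original type and id.
        action = {
            "index": {
                "_index": target_index,
                "_type": hit["_type"],
                "_id": hit["_id"],
            }
        }
        lines.append(json.dumps(action))
        # Source line: the document itself, unchanged.
        lines.append(json.dumps(hit["_source"]))
    if not lines:
        return ""
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"
```

The resulting string would be sent as the body of a `POST` to the `_bulk` endpoint of the new cluster; fetching the hits from the old node in batches is left out here.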

Re: Moving from fs gateway type to cluster using S3/Cloudfiles

Darron Froese
On Thu, Nov 24, 2011 at 7:02 AM, Shay Banon <[hidden email]> wrote:
> To be honest, I am lost, here are some points:
> 1. You say you use Rackspace, and it works, but the config points to AWS
> configuration?

Sorry for being confusing - we're using a Rackspace cloud VPS with the
S3 gateway.

> 2. There is no rackspace cloudfiles support to act as gateway, only s3.

No problem - I thought there was - was testing with S3 and will stick with that.

I must have gotten confused with the Gateway information here:

http://www.elasticsearch.org/blog/2010/05/11/here-comes-the-cloud.html

> 3. In any case for 2, you should start with local gateway, its perfectly
> fine to use on one node, to many. I don't understand why you used fs gateway
> in your one node scenario now.

I figured local should have been used - I didn't actually set that box
up but am stepping in and setting up the cluster now.

> 4. Changing gateway implementation requires reindexing.

No problem - we'll make that happen.

Thanks for the response - sorry for confusing it all.

It's an incredible tool - we're really happy to have found it and will
be using it for other projects going forward.

Re: Moving from fs gateway type to cluster using S3/Cloudfiles

kimchy
Administrator
No problem :), answers below

On Thu, Nov 24, 2011 at 8:13 PM, Darron Froese <[hidden email]> wrote:
> On Thu, Nov 24, 2011 at 7:02 AM, Shay Banon <[hidden email]> wrote:
> > To be honest, I am lost, here are some points:
> > 1. You say you use Rackspace, and it works, but the config points to AWS
> > configuration?
>
> Sorry for being confusing - we're using a Rackspace cloud VPS with the
> S3 gateway.

I see. That is probably a bad idea; the performance when talking to S3 from outside AWS is not great.

> > 2. There is no rackspace cloudfiles support to act as gateway, only s3.
>
> No problem - I thought there was - was testing with S3 and will stick with that.

I suggest you use the local gateway.

> I must have gotten confused with the Gateway information here:
>
> http://www.elasticsearch.org/blog/2010/05/11/here-comes-the-cloud.html

It's an "old" post, from before the local gateway was implemented... :)

> > 3. In any case for 2, you should start with local gateway, its perfectly
> > fine to use on one node, to many. I don't understand why you used fs gateway
> > in your one node scenario now.
>
> I figured local should have been used - I didn't actually set that box
> up but am stepping in and setting up the cluster now.
>
> > 4. Changing gateway implementation requires reindexing.
>
> No problem - we'll make that happen.

Just double-checking that you will use the local gateway.

> Thanks for the response - sorry for confusing it all.
>
> It's an incredible tool - we're really happy to have found it and will
> be using it for other projects going forward.


Re: Moving from fs gateway type to cluster using S3/Cloudfiles

Darron Froese
Hmm - I wanted to use S3 to have an offsite way to recover, in case
something happened to the cluster.

If we use the local gateway and back up the box every day, does that
backup hold enough data to recover the whole cluster?

Alternatively, we could just set up some AWS boxes and use the S3 gateway
- that should solve the performance issues.


Re: Moving from fs gateway type to cluster using S3/Cloudfiles

kimchy
Administrator
You can use the local gateway and back up the data locations.
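
A minimal sketch of what backing up a data location might look like, assuming the node's data lives in a single directory that can be read while the archive is written. The function name `backup_data_dir`, the archive naming scheme, and the paths in the usage note are illustrative only, and it is probably safest to flush before taking the copy.

```python
import os
import tarfile
import time


def backup_data_dir(data_dir, dest_dir, label=None):
    """Archive data_dir into a gzip-compressed tarball under dest_dir.

    Returns the path of the archive that was written.
    """
    if not os.path.isdir(data_dir):
        raise ValueError("not a directory: %s" % data_dir)
    os.makedirs(dest_dir, exist_ok=True)
    # Default label is a timestamp so daily backups do not overwrite each other.
    label = label or time.strftime("%Y%m%d-%H%M%S")
    archive_path = os.path.join(dest_dir, "es-data-%s.tar.gz" % label)
    with tarfile.open(archive_path, "w:gz") as archive:
        # Store paths relative to the data directory's own name.
        root_name = os.path.basename(os.path.normpath(data_dir))
        archive.add(data_dir, arcname=root_name)
    return archive_path
```

Something like `backup_data_dir("/var/lib/elasticsearch/data", "/mnt/backups")` could run from a daily cron job on each node, with the resulting archives copied offsite afterwards; both paths here are placeholders.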

On Thu, Nov 24, 2011 at 9:05 PM, Darron Froese <[hidden email]> wrote:
Hmm - I wanted to use S3 to have an offsite way to recover - in case
something happened with the cluster.

If we use the local gateway and backup the box every day, does that
have enough data to be able to recover data for the whole cluster?

Alternately, we could just setup some AWS boxes and use the S3 gateway
- that should solve the performance issues.

On Thu, Nov 24, 2011 at 11:44 AM, Shay Banon <[hidden email]> wrote:
> No problem :), answers below
>
> On Thu, Nov 24, 2011 at 8:13 PM, Darron Froese <[hidden email]> wrote:
>>
>> On Thu, Nov 24, 2011 at 7:02 AM, Shay Banon <[hidden email]> wrote:
>> > To be honest, I am lost, here are some points:
>> > 1. You say you use Rackspace, and it works, but the config points to AWS
>> > configuration?
>>
>> Sorry for being confusing - we're using a Rackspace cloud VPS with the
>> S3 gateway.
>
> I see, probably bad idea, the perf talking to S3 not within AWS is not
> amazing.
>
>>
>> > 2. There is no rackspace cloudfiles support to act as gateway, only s3.
>>
>> No problem - I thought there was - was testing with S3 and will stick with
>> that.
>
> I suggest you use local gateway.
>
>>
>> I must have gotten confused with the Gateway information here:
>>
>> http://www.elasticsearch.org/blog/2010/05/11/here-comes-the-cloud.html
>
> Its an "old" post, before local gateway was implemented... :)
>
>>
>> > 3. In any case for 2, you should start with local gateway, its perfectly
>> > fine to use on one node, to many. I don't understand why you used fs
>> > gateway
>> > in your one node scenario now.
>>
>> I figured local should have been used - I didn't actually set that box
>> up but am stepping in and setting up the cluster now.
>>
>> > 4. Changing gateway implementation requires reindexing.
>>
>> No problem - we'll make that happen.
>
> Just double checking that you will use local gateway.
>
>>
>> Thanks for the response - sorry for confusing it all.
>>
>> It's an incredible tool - we're really happy to have found it and will
>> be using it for other projects going forward.
>
>

Reply | Threaded
Open this post in threaded view
|

Re: Moving from fs gateway type to cluster using S3/Cloudfiles

Darron Froese
No problem - we'll do that then - thanks for your help.
