Primary balancing


Primary balancing

jjasinek
Shay,

I know that shards are usually balanced amongst the nodes in an
elasticsearch cluster, but is there a way to balance the primaries?
We have a three-node cluster with 3 shards and 1 replica.  We have
observed recently that all three of the primary shards are assigned to
one node.  Since indexing is routed to the primary,
this could mean that one machine is dedicated to indexing.  Is there a
way to balance the primaries across nodes, or is it best to increase
shards to 6?  I'm afraid that if I do that, there is still no guarantee
that each node will have close to two primaries on it.

Jason

Re: Primary balancing

kimchy
Administrator
Even if you have 3 primaries allocated on a single node, with replication the other nodes will also be busy indexing. In general, there isn't a lot of difference between a primary and a replica, so there isn't an effort to balance primaries (or force it). There are some cases where primaries might work a bit more, for example when doing searches / gets and explicitly asking for them to be executed on the primary shard, but that's not the common case.
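
To make the point about replicas concrete, here is a minimal Python sketch that counts indexing operations per node. It assumes that every indexed document is applied once on the primary and once on each replica copy, as the reply describes; the node names and the shard layout are hypothetical and not taken from Jason's cluster.

```python
from collections import Counter

# Hypothetical layout: (shard_id, role, node). All three primaries sit on node_a.
ALLOCATION = [
    (0, "primary", "node_a"), (0, "replica", "node_b"),
    (1, "primary", "node_a"), (1, "replica", "node_c"),
    (2, "primary", "node_a"), (2, "replica", "node_b"),
]

def indexing_ops_per_node(allocation, docs_per_shard):
    """Count index operations per node, assuming each document is
    indexed on the primary and again on every replica of its shard."""
    ops = Counter()
    for shard_id, _role, node in allocation:
        ops[node] += docs_per_shard[shard_id]
    return dict(ops)

ops = indexing_ops_per_node(ALLOCATION, {0: 1000, 1: 1000, 2: 1000})
print(ops)
```

In this toy layout the node holding the primaries does the most work because it holds the most shard copies, not because they are primaries; the replica nodes still index every document routed to their shards.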


EC2 performance

Gustavo Maia
In reply to this post by jjasinek

For better performance, is it better to use 4 small instances or 1 large instance in the Amazon cloud?
A small instance is 32-bit with one hard drive, and a large instance is 64-bit with two hard drives.

Re: EC2 performance

kimchy
Administrator
Large instances are preferable, but do you mean 1 large instance vs. 4 small instances?


Re: EC2 performance

Gustavo Maia
First, thank you for your attention.

On Amazon, 4 small instances cost the same as one large instance.
The small instance is 32-bit and has only one hard drive. The large is
64-bit and has two hard drives.

Today I have a 300GB index distributed across three machines,
each with six 15k rpm hard drives.
I am doing this study in order to migrate to Amazon, so I am unsure
whether 4 small or 1 large is better.

My question is whether to build a cluster of 40 large or 15 small
instances. I need searches to return in less than 200ms on
average.
Is it possible to do this using elasticsearch on Amazon?


Thanks for everything. Elasticsearch is a great project.




--
Gustavo Maia

Re: EC2 performance

Gustavo Maia
Correction:
My question is whether to build a cluster of 40 SMALL or 15 LARGE
instances. I need searches to return in less than 200ms on
average.
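
The price comparison can be worked through with a short Python sketch. It uses only the ratio given earlier in the thread (4 small instances cost the same as 1 large) and expresses cost in small-instance units; no actual Amazon prices are assumed.

```python
SMALL_PER_LARGE = 4  # from the thread: 4 small cost the same as 1 large

def cost_in_small_units(small=0, large=0):
    """Return cluster cost measured in small-instance equivalents."""
    return small + large * SMALL_PER_LARGE

small_cluster = cost_in_small_units(small=40)
large_cluster = cost_in_small_units(large=15)
ratio = large_cluster / small_cluster
print(small_cluster, large_cluster, ratio)
```

On that ratio, 15 large instances cost about 1.5 times as much as 40 small ones, so the two options are not price-equivalent.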





--
Gustavo Maia

Re: EC2 performance

Pavel Penchev
Hi,

We have similar requirements and decided to go for the large instances. The search times were OK on the small instances (90% below 200ms), but indexing suffered significantly (only 30% below 200ms; we have requirements for indexing as well). In comparison, the large instances handle both search and indexing with 95% below 200ms.

Bear in mind this is specific to the type of documents you have and the searches you perform. I'd suggest running a 24h test.
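
For anyone running such a test, a figure like "90% below 200ms" can be computed from recorded timings roughly as follows. This is a minimal Python sketch; the function name and the sample latencies are made up for illustration and are not measurements from this thread.

```python
def fraction_below(latencies_ms, threshold_ms=200):
    """Return the share of requests that completed under threshold_ms."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    fast = sum(1 for t in latencies_ms if t < threshold_ms)
    return fast / len(latencies_ms)

# Made-up sample timings in milliseconds, for illustration only.
samples = [35, 80, 120, 150, 180, 190, 199, 210, 450, 90]
share = fraction_below(samples)
print(f"{share:.0%} of requests below 200ms")
```

Tracking search and indexing timings separately matters here, since the two behaved very differently on the small instances.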

Regards,
Pavel




Re: EC2 performance

Gustavo Maia
Hi Pavel,
Thanks for everything.

How many large instances do you have?




--
Gustavo Maia

Re: EC2 performance

kimchy
Administrator
In general, I suggest using the xlarge instances in Amazon, simply because of the higher IO they provide and better performance consistency (at least based on what users have seen).


Re: EC2 performance

Gustavo Maia
Thank you.

I'm thinking of using 10 c1.xlarge instances. Each c1.xlarge
instance has 4 hard drives.

So I would use elasticsearch v0.18, which allows configuring
4 hard drives in the same elasticsearch process. Is
that best?

Would there be any performance problem if each shard is 15GB in size,
assuming each instance holds 4 shards, one per hard drive?
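
The numbers in this plan can be sanity-checked with a little arithmetic, sketched in Python below. The 300GB index size comes from earlier in the thread; treating the spare shard slots as room for replicas is an assumption, since the post does not say how many replicas are intended.

```python
import math

index_size_gb = 300      # from earlier in the thread
shard_size_gb = 15       # proposed shard size
instances = 10           # proposed c1.xlarge count
shards_per_instance = 4  # one shard per drive, as proposed

primaries = math.ceil(index_size_gb / shard_size_gb)
slots = instances * shards_per_instance
copies_per_shard = slots // primaries  # primary plus any replicas that fit
print(primaries, slots, copies_per_shard)
```

Under these figures the layout has room for 20 primaries plus one replica of each.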





--
Gustavo Maia

Re: EC2 performance

kimchy
Administrator
Heya,

   It's tricky to choose between c1.xlarge (more CPU) and m1.xlarge (more memory). I suggest going with the m1.xlarge, as more memory tends to outweigh faster CPU.

   Regarding the drives: the new option to specify multiple data locations does not depend on the number of shards. In other words, even a single shard allocated on a node will make use of all the data locations.
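
As a rough illustration of the multiple data locations option, the Python sketch below renders a `path.data` line for `elasticsearch.yml` that lists several mount points for a single node. The mount point names are hypothetical, and the comma-separated form is an assumption about the config syntax that should be checked against the documentation for your version.

```python
def render_path_data(mount_points):
    """Build a single path.data setting that lists every data location."""
    if not mount_points:
        raise ValueError("at least one data location is required")
    return "path.data: " + ",".join(mount_points)

# Hypothetical mount points, one per instance drive.
drives = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]
line = render_path_data(drives)
print(line)
```

One node process then owns all four drives, which is why the number of shards on the node does not need to match the number of drives.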

-shay.banon


Re: EC2 performance

Gustavo Maia

So would it be better to install 4 ES processes on the machine (m1.xlarge), each one pointing its data to a different hard drive? It should be better because searches would then run in parallel across 4 hard drives, with different processors.
I would set up each ES instance with 3GB of RAM.

If I use the m1.large machine, I would install only 2 ES processes, one per hard drive, with 3GB of RAM for each ES.

Is that right?



****** m1.xlarge Config

15 GB memory
8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
1,690 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.xlarge


****** m1.large Config:

7.5 GB memory
4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
850 GB instance storage
64-bit platform
I/O Performance: High
API name: m1.large





--
Gustavo Maia


Re: EC2 performance

kimchy
Administrator
You mean start 3 ES processes on the same machine? Why?


Re: EC2 performance

Gustavo Maia
Yes. Isn't that better?

In my experiments with Lucene, it is better to distribute the load between the drives. Using one ES per hard drive, I guarantee a better distribution between the drives, e.g. one 15GB shard per hard drive. During a search I would have better parallelism, since there is one hard drive and one processor for each specific search.

E.g. when a user runs a search, we get parallel processing across 4 hard drives and 4 processors, ensuring a faster response, since only one shard is set up per ES.



>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > Gustavo Maia
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Gustavo Maia
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Gustavo Maia
>> >
>> >
>>
>>
>>
>> --
>> Gustavo Maia
>>
>



--
Gustavo Maia

Re: Ec2 performance

kimchy
Administrator
In the master version, you can specify several data locations, so a single instance can use several drives. I thought you were referring to that in your previous mail.
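
Editor's note: a minimal sketch of what "several data locations" could look like in practice. It assumes the setting is named `path.data` and accepts a comma-separated list of directories, as later Elasticsearch documentation describes; the mount points below are hypothetical, and the exact syntax should be checked against the version you run.

```python
# Hypothetical mount points for the four instance-store drives of one machine.
DATA_DIRS = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]


def render_path_data(data_dirs):
    """Return one elasticsearch.yml line that lists several data locations.

    Assumption: the setting is `path.data` and takes a comma-separated list.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    return "path.data: " + ",".join(data_dirs)


if __name__ == "__main__":
    # Print the line so it can be pasted into the config of a single ES process.
    print(render_path_data(DATA_DIRS))
```

The point of the sketch is that one ES process is given all four drives, instead of starting one process per drive.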

On Tue, Oct 18, 2011 at 11:22 PM, Gustavo Maia <[hidden email]> wrote:
Yes. Isn't that better?

In my experience with Lucene, it is better to distribute the load across the drives. By running one ES process per hard drive, I guarantee a better distribution across the drives, e.g. one 15GB shard per HD. During a search I get better parallelism, since each search has its own HD and its own processor.

For example, when a user runs a search, 4 HDs and 4 processors work in parallel, which ensures a faster response, since each ES process was set up with only one shard.






Re: Ec2 performance

Gustavo Maia

Using the master version, if I configure 4 shards, is it guaranteed that there will be one shard on each HD?
For example, if I have four shards of 15GB each, is it guaranteed that each HD will hold one 15GB shard?


2011/10/18 Shay Banon <[hidden email]>
>
> In the master version, you can specify several data locations, so a single instance can use several drives. I thought you were referring to that in your previous mail.

--
Gustavo Maia

Re: Ec2 performance

kimchy
Administrator
No, as I explained before, using multiple drives does not mean that each shard will live on a single drive. It means that the files composing the Lucene index will be spread across the different drives, so a single shard can potentially span all of them.
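
Editor's note: a toy model of the idea above, to show why "4 shards" does not imply "one shard per drive". The placement rule used here (put each file on the directory holding the fewest bytes so far) and the file names are assumptions made for illustration only; this is not a description of how Elasticsearch actually chooses a location.

```python
from collections import defaultdict


def spread_files(files, data_dirs):
    """Assign each (name, size) file to the data dir holding the fewest bytes so far.

    Illustrative policy only; ties go to the directory listed first.
    """
    used = {d: 0 for d in data_dirs}
    placement = defaultdict(list)
    for name, size in files:
        target = min(data_dirs, key=lambda d: used[d])
        placement[target].append(name)
        used[target] += size
    return dict(placement)


if __name__ == "__main__":
    # Hypothetical files of ONE shard, with sizes in MB.
    shard_files = [("seg_a", 10), ("seg_b", 5), ("seg_c", 3), ("seg_d", 1)]
    for directory, names in spread_files(shard_files, ["/mnt/disk1", "/mnt/disk2"]).items():
        print(directory, names)
```

Even with a single shard, both directories end up holding part of it, which is the behaviour described in the reply.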

On Tue, Oct 18, 2011 at 11:38 PM, Gustavo Maia <[hidden email]> wrote:

Using the master version, if I configure 4 shards, is it guaranteed that there will be one shard on each HD?
For example, if I have four shards of 15GB each, is it guaranteed that each HD will hold one 15GB shard?


