Elasticsearch client only cluster


beharris
I'm using Elasticsearch for Logstash and I would like to create a client-only (30-node) cluster in which no shards are distributed and there is no replication.
I have tons of logs, so I have configured logging per rack, and I want to be able to connect to any ES instance to search all the cluster nodes.
This setup has worked, but when nodes get restarted it seems to affect the cluster health, which causes ":exception=>org.elasticsearch.action.UnavailableShardsException".

My current settings per ES instance are:
unicast discovery
shards=5
replication=0

Current health:
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 52,
  "number_of_data_nodes" : 26,
  "active_primary_shards" : 22,
  "active_shards" : 22,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3
}
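
For readers following along, here is a minimal Python sketch of how the health output above can be read. The helper name `summarize_health` is hypothetical and not part of any Elasticsearch client; it only does arithmetic on the posted JSON. With replicas=0 every shard is a primary, so each unassigned shard has no other copy, which is consistent with the red status and with the UnavailableShardsException seen after node restarts.

```python
import json

# Health output copied from the post above.
HEALTH = """
{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 52,
  "number_of_data_nodes" : 26,
  "active_primary_shards" : 22,
  "active_shards" : 22,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 3
}
"""


def summarize_health(raw):
    """Hypothetical helper: derive a few totals from a cluster health document."""
    health = json.loads(raw)
    # Assumption: relocating shards are already counted as active, so the
    # total is active + initializing + unassigned.
    total = (health["active_shards"]
             + health["initializing_shards"]
             + health["unassigned_shards"])
    return {
        "total_shards": total,
        "unassigned_shards": health["unassigned_shards"],
        # Nodes that joined the cluster but hold no data.
        "non_data_nodes": health["number_of_nodes"] - health["number_of_data_nodes"],
        "is_red": health["status"] == "red",
    }


print(summarize_health(HEALTH))
```

Half of the 52 nodes in this output hold no data, so the cluster already contains true client nodes alongside the 26 data nodes.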

If I lose a few nodes I don't want it to affect the rest from starting up or processing correctly.
Am I missing any settings for a client only cluster?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Re: Elasticsearch client only cluster

Clinton Gormley-2
On Sun, 2013-02-24 at 12:44 -0800, [hidden email] wrote:
> I'm using Elasticsearch for Logstash and I would like to create a
> client-only (30-node) cluster in which no shards are distributed and
> there is no replication.

You need to have shards to index data.  Shards are not just for
replication.  Without any shards, you can't index any data.

Do you mean that you don't want any replica shards? You can control the
number of primary shards that an index contains when you create the
index, and the number_of_replicas can be updated at any time.
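
As a rough illustration of the point above, the following Python sketch builds (but does not send) the two HTTP requests involved: one that fixes the number of primary shards at index creation, and one that changes `number_of_replicas` afterwards. The base URL, the index name `logs-test`, and both helper names are assumptions made for this example.

```python
import json
import urllib.request

# Assumption: a node reachable on the default HTTP port.
BASE_URL = "http://localhost:9200"


def _put(url, body):
    """Build a PUT request with a JSON body; nothing is sent here."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def create_index_request(index, primaries, replicas):
    # The primary shard count is chosen when the index is created.
    settings = {"number_of_shards": primaries, "number_of_replicas": replicas}
    return _put(f"{BASE_URL}/{index}", {"settings": settings})


def update_replicas_request(index, replicas):
    # The replica count can be changed later through the settings endpoint.
    return _put(f"{BASE_URL}/{index}/_settings",
                {"index": {"number_of_replicas": replicas}})


create = create_index_request("logs-test", 5, 0)
update = update_replicas_request("logs-test", 1)
print(create.get_method(), create.full_url, create.data.decode())
print(update.get_method(), update.full_url, update.data.decode())
```

Passing either object to `urllib.request.urlopen` would actually issue the request against a running cluster.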

clint



Re: Elasticsearch client only cluster

beharris
Sorry, I meant I would like to create a 30-node cluster, one node per rack, and ensure that each node's one shard stays within its rack.
Is that possible within Elasticsearch?

The cluster would only be used to search through one instance and look at all the data, without data being replicated or sharded between racks.



Re: Elasticsearch client only cluster

Clinton Gormley-2
On Mon, 2013-02-25 at 10:09 -0800, [hidden email] wrote:
> Sorry, I meant I would like to create a 30-node cluster, one node per
> rack, and ensure that each node's one shard stays within its rack.
> Is that possible within Elasticsearch?
>
> The cluster would only be used to search through one instance and look
> at all the data, without data being replicated or sharded between racks.

Sorry but I still have no idea what you're trying to achieve. One node
per rack? You have 30 racks? /me is lost

Perhaps some more detail...

clint



Re: Elasticsearch client only cluster

beharris
So I have 30 racks at a colo, with 1 ES instance per rack.
The ES instance in each rack is used to index the logs for that rack only.

Is there any way for each instance to join a 30-node cluster as "client only" and not replicate or shard data between them?

The purpose of the client-only cluster would be to let me search one instance and have ES query all members for data.





Re: Elasticsearch client only cluster

Clinton Gormley-2
Hiya

OK, the picture is slowly evolving :)

On Mon, 2013-02-25 at 16:23 -0800, [hidden email] wrote:
> So I have 30 racks at a colo, with 1 ES instance per rack.
> The ES instance in each rack is used to index the logs for that rack
> only.
>
> Is there any way for each instance to join a 30-node cluster as
> "client only" and not replicate or shard data between them?

A "client" in Elasticsearch terminology doesn't hold any data.  Hence
part of the confusion.  I think what you're asking is: Can I have an
index on a single node in the cluster?

The answer is yes: you can create 30 indices, and specify rack awareness
for each index, so that each index sits in a single rack.
>
> The purpose of the client-only cluster would be to let me search one
> instance and have ES query all members for data.

Yes, you can connect to any node in the cluster and query one or more
indices.  It will forward queries to all relevant nodes.
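
To make that concrete, here is a small Python sketch that builds a search URL spanning several indices through a single node. The host name `es-rack7`, the per-rack index names, and the helper `search_url` are made up for illustration; the comma-separated index list in the path is standard Elasticsearch URL syntax.

```python
from urllib.parse import urlencode


def search_url(host, indices, query):
    """Build a URI search against one node that covers several indices."""
    # Index names are joined with commas; the receiving node forwards the
    # query to whichever nodes hold the relevant shards.
    index_list = ",".join(indices)
    return f"http://{host}:9200/{index_list}/_search?" + urlencode({"q": query})


# Hypothetical host and index names.
print(search_url("es-rack7", ["rack1-2013.02.26", "rack2-2013.02.26"], "error"))
```

Any node in the cluster could be used as the host; the result set would be the same.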

Note: I don't recommend this setup.  Especially with 30 nodes, the
chances of one of them going down are pretty high.  Hardware fails.  With
your current setup (especially if you don't have any replicas), you run a
good chance of losing data.

Why not just use Elasticsearch as the distributed system that it is
intended to be?

clint



Re: Elasticsearch client only cluster

beharris
I'm using Elasticsearch for Logstash, and we generate tons of logs per rack.

I would like to keep the data local to the rack so the traffic doesn't need to cross switches. I don't have an immediate concern about failures or losing data, so I'm OK with losing a node; I can just reprocess the logs for now.

If I use the distributed system, can I force data to stay local to the rack?




Re: Elasticsearch client only cluster

Clinton Gormley-2

> If I use the distributed system, can I force data to stay local to the
> rack?

Yes. Look for rack awareness in the docs.


Re: Elasticsearch client only cluster

beharris
Great, I checked out the docs and came up with the following config to enable rack awareness for each instance. It would look something like this:

node:
        name: node1
        rack_id: rack1
cluster:
        name: elasticsearch
        routing:
                allocation:
                        awareness:
                                attributes: rack_id

Since Logstash creates an index per day (logstash-2013.02.26) per node, would I have to do anything special to make sure that an index and its shards are created for each node individually?

Currently only 5 shards are allocated in my shards=5, replicas=0 cluster, which receives logs for 1k+ servers under low load. The goal is for each of the 30 nodes to have an index started and receiving logs for its own rack, so the cluster can handle high load.

>es shards -v
index               shard pri/rep      state          docs    size       bytes node
logstash-2013.02.26     0 p       STARTED    28031927  22.9gb 24677604861 n7
logstash-2013.02.26     1 p       STARTED    26853297    22gb 23641399741 n18
logstash-2013.02.26     2 p       STARTED    28035826  22.9gb 24686606451 n21
logstash-2013.02.26     3 p       STARTED    28033599  22.9gb 24695469792 n24
logstash-2013.02.26     4 p       STARTED    28037600  22.9gb 24687686161 n5

Any recommendations on how to configure Elasticsearch rack awareness and routing to handle this?


Re: Elasticsearch client only cluster

Clinton Gormley-2
On Wed, 2013-02-27 at 00:16 -0800, B wrote:

> Great, I checked out the docs and came up with the following config to
> enable rack awareness for each instance. It would look something like
> this:
>
> node:
>         name: node1
>         rack_id: rack1
> cluster:
>         name: elasticsearch
>         routing:
>                 allocation:
>                         awareness:
>                                 attributes: rack_id
>
> Since Logstash creates an index per day (logstash-2013.02.26) per node,
> would I have to do anything special to make sure that an index and
> its shards are created for each node individually?

If you want 30 indices on 30 different nodes, then you need to create
each index with a different name, and set the allocation on each index
to tie it to a single node.

http://www.elasticsearch.org/guide/reference/index-modules/allocation.html

For instance, you can use index templates to say: if the index name
matches "node_1_*" then set index.routing.allocation.include.rack_id to
"node_1"

http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html
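
A sketch of what such a template might look like, generated in Python so it can be repeated per rack. The template name, the `rack_template` helper, and the `number_of_replicas: 0` setting are assumptions for this example (the last one taken from the setup described earlier in the thread). The `template` key is the index-pattern field of the template API as it existed around the time of this thread; later Elasticsearch versions use `index_patterns` instead.

```python
import json


def rack_template(rack_id):
    """Hypothetical helper: build a template that pins matching indices to one rack."""
    name = f"{rack_id}_template"
    body = {
        # Applies to any new index whose name starts with "<rack_id>_".
        "template": f"{rack_id}_*",
        "settings": {
            # Allocation filtering: only nodes whose rack_id attribute
            # matches may hold shards of these indices.
            "index.routing.allocation.include.rack_id": rack_id,
            # Assumption: no replicas, as in the setup described above.
            "number_of_replicas": 0,
        },
    }
    return name, body


name, body = rack_template("node_1")
print(f"PUT /_template/{name}")
print(json.dumps(body, indent=2))
```

For this to take effect, each node needs a matching `rack_id` attribute in its own config, and Logstash on each rack would have to write to an index name carrying that rack's prefix.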

clint


Re: Elasticsearch client only cluster

beharris
OK, just playing around with 28 nodes, I set shards=28 and replicas=0.

I now have 28 shards spread across all of my nodes, which is balanced.
This is definitely an alternate setup that will work, but I will still try to find out how to keep logs local to the rack using the templates you mentioned.

Perhaps I don't need to keep logs per rack, as this seems to balance out the storage pretty well.
What mechanism does Elasticsearch use to keep the data balanced across all nodes?

All 28 shards are reported, spread across the nodes:
logstash-2013.02.28 27 p STARTED 2972868 2.5gb 2753226040 node4
..
logstash-2013.02.28 0 p STARTED 2972863 2.5gb 2764377828 node19



Re: Elasticsearch client only cluster

beharris

Clinton, if I keep to the distributed model of a 28-node cluster with all nodes using the same index, would shards=14, repl=1 be a wise choice to keep all nodes doing something?

Right now I'm testing shards=28 and repl=0, which is working great, and the cluster is balanced in data and load, but I'm thinking long term about node failures and adding nodes in the future.
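
The arithmetic behind the two options can be sketched in a few lines of Python. The helper `shard_layout` is hypothetical and only counts shard copies; it says nothing about how Elasticsearch actually places them.

```python
def shard_layout(primaries, replicas, nodes):
    """Count shard copies for one index and average them over the nodes."""
    # Each primary has `replicas` additional copies.
    copies = primaries * (1 + replicas)
    return {
        "total_copies": copies,
        "copies_per_node": copies / nodes,
        # With at least one replica, every shard exists in more than one copy.
        "has_redundant_copy": replicas >= 1,
    }


print(shard_layout(14, 1, 28))  # shards=14, repl=1
print(shard_layout(28, 0, 28))  # shards=28, repl=0
```

Both settings give 28 shard copies, or one per node on average across 28 nodes, but only the first keeps a second copy of each shard if a node fails. As noted earlier in the thread, the primary count is fixed at index creation while the replica count can be changed later.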

On Feb 27, 2013 11:43 PM, "Brian Harris" <[hidden email]> wrote:
ok, just playing around with 28 nodes I set shards=28 and replicas=0.

I now have 28 shards all of my nodes now which is balanced.
This is definitely an alternate setup that will work but I will still to try to find out how to keep logs local to the rack using the templates you listed below.

Perhaps I don't need to keep logs per rack as this seems to balance out the storage pretty well.
What mechanism does elasticsearch use to keep the data balanced across all nodes?

28 shards all reported on all nodes.
logstash-<a href="tel:2013.02.28%2027" value="+12013022827" target="_blank">2013.02.28 27 p STARTED 2972868 2.5gb 2753226040 node4
..
logstash-2013.02.28 0 p STARTED 2972863 2.5gb <a href="tel:2764377828" value="+12764377828" target="_blank">2764377828 node19


On Wed, Feb 27, 2013 at 2:58 AM, Clinton Gormley <[hidden email]> wrote:
On Wed, 2013-02-27 at 00:16 -0800, B wrote:
> Great, I checked out the docs and can came up with following config to
> enable rack awareness for each instance that would look something like
> this.
>
> node:
>         name: node1
>         rack_id: rack1
> cluster:
>         name: elasticsearch
>         routing:
>                 allocation:
>                         awareness:
>                                 attributes: rack_id
>
> Since logstash creates an index per day(logstash-2013.02.26) per node,
> would I have to do anything special to make sure that an index and
> it's shards are created for each node individually?

If you want 30 indices on 30 different nodes, then you need to create
each index with a different name, and set the allocation on each index
to tie it to a single node.

http://www.elasticsearch.org/guide/reference/index-modules/allocation.html

For instance, you can use index templates to say: if the index name
matches "node_1_*" then set index.routing.allocation.include.rack_id to
"node_1"

http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html
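As a rough illustration of the template idea above (a sketch only, not tested against a live cluster): the body below assumes the legacy index template format of this era, where the index name pattern lives under a "template" key, and it reuses the "node_1" names from the example. The helper name rack_template is made up for illustration.

```python
import json


def rack_template(rack: str) -> dict:
    """Build a template body that ties indices named '<rack>_*' to one rack.

    Assumption: legacy template format with the pattern under "template".
    """
    return {
        # Index name pattern the template applies to, e.g. "node_1_*".
        "template": f"{rack}_*",
        "settings": {
            # Only allocate shards to nodes whose rack_id attribute matches.
            "index.routing.allocation.include.rack_id": rack,
        },
    }


body = rack_template("node_1")
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent to the template endpoint described in the linked docs. Newer Elasticsearch releases may name the pattern key differently, so check the docs for your version.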

clint

>
> Currently only 5 shards are allocated in my shard=5 rep=0 cluster,
> which receives logs for 1k+ servers under low load. The goal is for all
> 30 nodes to have an index started and receiving logs for their own rack,
> to handle high load.
>
> >es shards -v
> index               shard pri/rep      state          docs    size       bytes node
> logstash-2013.02.26     0 p       STARTED    28031927  22.9gb 24677604861 n7
> logstash-2013.02.26     1 p       STARTED    26853297    22gb 23641399741 n18
> logstash-2013.02.26     2 p       STARTED    28035826  22.9gb 24686606451 n21
> logstash-2013.02.26     3 p       STARTED    28033599  22.9gb 24695469792 n24
> logstash-2013.02.26     4 p       STARTED    28037600  22.9gb 24687686161 n5
>
> Any recommendations on how to configure elasticsearch rack awareness
> and routing to handle this?
>
> On Tuesday, February 26, 2013 10:40:36 AM UTC-8, Clinton Gormley wrote:
>
>         > If I use the distributed system, can I force data to stay local
>         > to the rack?
>
>         Yes. Look for rack awareness in the docs.
>
>         > On Tuesday, February 26, 2013 1:41:46 AM UTC-8, Clinton Gormley wrote:
>         >         Hiya
>         >
>         >         OK, the picture is slowly evolving :)
>         >
>         >         On Mon, 2013-02-25 at 16:23 -0800, [hidden email] wrote:
>         >         > So I have 30 racks at a colo and have 1 es instance per rack.
>         >         > The 1 es instance per rack is used to index all logs for
>         >         > that rack only.
>         >         >
>         >         > Is there any way for each instance to join a 30 node cluster
>         >         > as a (client only) and not replicate or shard data between them?
>         >
>         >         A "client" in Elasticsearch terminology doesn't hold any
>         >         data.  Hence part of the confusion.  I think what you're
>         >         asking is: Can I have an index on a single node in the cluster?
>         >
>         >         The answer is yes: you can create 30 indices, and specify
>         >         rack awareness for each index, so that each index sits in a
>         >         single rack.
>         >
>         >         > The purpose of the client only cluster would allow me to
>         >         > search one instance and have es query all members for data.
>         >
>         >         Yes, you can connect to any node in the cluster and query one
>         >         or more indices.  It will forward queries to all relevant nodes.
>         >
>         >         Note: I don't recommend this setup.  Especially with 30 nodes,
>         >         the chances of one of them going down are pretty high.
>         >         Hardware fails.  With your current setup (esp. if you don't
>         >         have any replicas) you run a good chance of losing data.
>         >
>         >         Why not just use Elasticsearch as the distributed system that
>         >         it is intended to be?
>         >
>         >         clint


--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/1ZUNmPHD8wY/unsubscribe?hl=en-US.
To unsubscribe from this group and all its topics, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.




Re: Elasticsearch client only cluster

Otis Gospodnetic
Hi,

0 replicas is risky.  1 node dies and you are missing ~1/28th of your data. So repl > 0 is definitely better.
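To make the arithmetic concrete, here is a small sketch. It assumes equally sized shards, one shard copy per node, and that a primary and its replica never sit on the same node; the function names are made up for illustration.

```python
from fractions import Fraction


def shard_copies(primaries: int, replicas: int) -> int:
    """Total shard copies in one index: each primary plus its replicas."""
    return primaries * (1 + replicas)


def lost_after_one_node_failure(primaries: int, replicas: int) -> Fraction:
    """Fraction of the index unavailable after losing one shard-holding node."""
    if replicas > 0:
        # Another copy of the affected shard survives on a different node.
        return Fraction(0)
    # No replicas: the shard on the dead node is simply gone.
    return Fraction(1, primaries)


print(shard_copies(28, 0), lost_after_one_node_failure(28, 0))  # 28 1/28
print(shard_copies(14, 1), lost_after_one_node_failure(14, 1))  # 28 0
```

Both layouts put one shard copy on each of the 28 nodes, but only shards=14, repl=1 survives a single node failure without missing data.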

Otis
--
ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html



On Wednesday, March 6, 2013 1:56:53 AM UTC-5, B wrote:

Clinton, if I keep to the distributed model of a 28-node cluster with all nodes using the same index, would shards=14, repl=1 be a wise choice to keep all nodes doing something?


Re: Elasticsearch client only cluster

beharris
Otis

Using 28 nodes with 1 daily index, what would be the best shard and replication scheme?




Re: Elasticsearch client only cluster

dadoonet
How many days of data do you want to keep in your cluster?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


Re: Elasticsearch client only cluster

beharris

Two weeks to start.
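
For a rough sense of scale, here is a sketch that assumes one daily index kept for 14 days, shard copies spread evenly over 28 nodes, and the two layouts discussed in this thread. The function name is made up for illustration.

```python
def shard_copies_per_node(days: int, primaries: int, replicas: int, nodes: int) -> float:
    """Average number of shard copies each node holds for the retained indices."""
    total_copies = days * primaries * (1 + replicas)
    return total_copies / nodes


# shards=14, repl=1 over two weeks of daily indices
print(shard_copies_per_node(14, 14, 1, 28))  # 14.0
# shards=28, repl=0 over the same period
print(shard_copies_per_node(14, 28, 0, 28))  # 14.0
```

Either way each node ends up holding about 14 shard copies after two weeks, so retention affects the per-node shard count the same way in both layouts.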

On Mar 6, 2013 11:34 PM, "David Pilato" <[hidden email]> wrote:
How many days you want to keep in your cluster?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

Le 7 mars 2013 à 07:58, Brian Harris <[hidden email]> a écrit :

Otis

using 28 nodes with1 daily index what would be the best shard and replication scheme?



On Wed, Mar 6, 2013 at 3:39 AM, Otis Gospodnetic <[hidden email]> wrote:
Hi,

0 replicas is risky.  1 node dies and you are missing ~1/28th of your data. So repl > 0 is definitely better.

Otis
--
ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html



On Wednesday, March 6, 2013 1:56:53 AM UTC-5, B wrote:

Clinton, if i keep to the distributed model of a 28 node cluster with all using the same index. Would shards=14 , repl=1 be a wise choice to keep all nodes doing something?

Right now testing shards=28 and repl=0 which is working great and the cluster is balanced with data and load but thinking long term of node failures and adding nodes in the future.

On Feb 27, 2013 11:43 PM, "Brian Harris" <[hidden email]> wrote:
ok, just playing around with 28 nodes I set shards=28 and replicas=0.

I now have 28 shards all of my nodes now which is balanced.
This is definitely an alternate setup that will work but I will still to try to find out how to keep logs local to the rack using the templates you listed below.

Perhaps I don't need to keep logs per rack as this seems to balance out the storage pretty well.
What mechanism does elasticsearch use to keep the data balanced across all nodes?

28 shards all reported on all nodes.
logstash-2013.02.28 27 p STARTED 2972868 2.5gb 2753226040 node4
..
logstash-2013.02.28 0 p STARTED 2972863 2.5gb 2764377828 node19


On Wed, Feb 27, 2013 at 2:58 AM, Clinton Gormley <[hidden email]> wrote:
On Wed, 2013-02-27 at 00:16 -0800, B wrote:
> Great, I checked out the docs and can came up with following config to
> enable rack awareness for each instance that would look something like
> this.
>
> node:
>         name: node1
>         rack_id: rack1
> cluster:
>         name: elasticsearch
>         routing:
>                 allocation:
>                         awareness:
>                                 attributes: rack_id
>
> Since logstash creates an index per day(logstash-2013.02.26) per node,
> would I have to do anything special to make sure that an index and
> its shards are created for each node individually?

If you want 30 indices on 30 different nodes, then you need to create
each index with a different name, and set the allocation on each index
to tie it to a single node.

http://www.elasticsearch.org/guide/reference/index-modules/allocation.html

For instance, you can use index templates to say: if the index name
matches "node_1_*" then set index.routing.allocation.include.rack_id to
"node_1"

http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html
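
As a rough illustration of the template idea above, the sketch below only builds the JSON body that one might PUT to /_template/<name>; it does not talk to a cluster. The index prefix "rack1_logstash-" and the rack value "rack1" are hypothetical names, and the top-level "template" key is an assumption about the template format of Elasticsearch releases from this period (newer releases use a different key for the pattern). Logstash would also have to be configured to write to index names carrying that prefix.

```python
import json


def rack_template(rack, index_prefix):
    """Build an index template body that pins matching indices to one rack.

    rack and index_prefix are hypothetical values chosen by the operator;
    rack must match the node.rack_id set on the nodes in that rack.
    """
    return {
        # Assumed legacy key: the index name pattern the template applies to.
        "template": f"{index_prefix}*",
        "settings": {
            # Only allocate shards of matching indices to nodes in this rack.
            "index.routing.allocation.include.rack_id": rack,
        },
    }


body = rack_template("rack1", "rack1_logstash-")
print(json.dumps(body, indent=2))
```

One template per rack would be needed, each with its own pattern and rack value.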

clint

>
> Currently only 5 shards are allocated in my shard=5 rep=0 cluster
> receiving logs for +1k servers under low load. The goal is to have all
> 30 nodes to have an index started and receiving logs for its rack to
> handle high load.
>
> >es shards -v
> index               shard pri/rep state       docs   size       bytes node
> logstash-2013.02.26     0 p       STARTED 28031927 22.9gb 24677604861 n7
> logstash-2013.02.26     1 p       STARTED 26853297   22gb 23641399741 n18
> logstash-2013.02.26     2 p       STARTED 28035826 22.9gb 24686606451 n21
> logstash-2013.02.26     3 p       STARTED 28033599 22.9gb 24695469792 n24
> logstash-2013.02.26     4 p       STARTED 28037600 22.9gb 24687686161 n5
>
> Any recommendations on how to configure elasticsearch rack awareness
> and routing to handle this.
>
> On Tuesday, February 26, 2013 10:40:36 AM UTC-8, Clinton Gormley wrote:
>
>         > If I use the distributed system, can I force data to stay local to the rack?
>
>         Yes. Look for rack awareness in the docs
>
>         > On Tuesday, February 26, 2013 1:41:46 AM UTC-8, Clinton Gormley wrote:
>         >         Hiya
>         >
>         >         OK, the picture is slowly evolving :)
>         >
>         >         On Mon, 2013-02-25 at 16:23 -0800, [hidden email] wrote:
>         >         > So I have 30 racks at a colo and have 1 es instance per rack.
>         >         > The 1 es instance per rack is used to index all logs for that rack only.
>         >         >
>         >         > Is there anyway for each instance to join a 30 node cluster as a
>         >         > (client only) and not replicate or shard data between them?
>         >
>         >         A "client" in Elasticsearch terminology doesn't hold any data.  Hence
>         >         part of the confusion.  I think what you're asking is: Can I have an
>         >         index on a single node in the cluster?
>         >
>         >         The answer is yes: you can create 30 indices, and specify rack awareness
>         >         for each index, so that each index sits in a single rack.
>         >
>         >         > The purpose of the client only cluster would allow me to search one
>         >         > instance and have es query all members for data.
>         >
>         >         Yes, you can connect to any node in the cluster and query one or more
>         >         indices.  It will forward queries to all relevant nodes.
>         >
>         >         Note: I don't recommend this setup.  Especially with 30 nodes, the
>         >         chances of one of them going down is pretty high.  Hardware fails.  With
>         >         your current setup (esp if you don't have any replicas) then you run a
>         >         good chance of losing data.
>         >
>         >         Why not just use Elasticsearch as the distributed system that it is
>         >         intended to be?
>         >
>         >         clint


--
You received this message because you are subscribed to a topic in the Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/elasticsearch/1ZUNmPHD8wY/unsubscribe?hl=en-US.
To unsubscribe from this group and all its topics, send an email to elasticsearc...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.




Re: Elasticsearch client only cluster

dadoonet
2 weeks = 14 days.
Let's say you define one shard per day.
With a replica.

So you will have to store 28 shards (1 primary + 1 replica for each day).
You have 28 nodes? Cool. It will fit perfectly.

I think I would start with these numbers and see where it goes.
Main question is: can a single shard handle your daily logs?

If not, you will have to increase the number of shards per index and accept more than one shard (or replica) on a single box.
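
The sizing above can be sketched as a small calculation. This is only a back-of-the-envelope helper with a hypothetical function name, and it assumes shard copies are spread evenly over the nodes, which the cluster does not strictly guarantee.

```python
import math


def shard_plan(days, primaries_per_day, replicas, nodes):
    """Count shard copies for daily indices kept for a number of days.

    Assumes an even spread of copies over the nodes (a simplification).
    """
    copies_per_day = primaries_per_day * (1 + replicas)
    total = days * copies_per_day
    return {
        "total_shard_copies": total,
        "max_copies_per_node": math.ceil(total / nodes),
    }


# One shard per day with one replica, kept for 14 days, on 28 nodes.
print(shard_plan(days=14, primaries_per_day=1, replicas=1, nodes=28))

# If one shard cannot absorb a day of logs: 14 primaries per day instead.
print(shard_plan(days=14, primaries_per_day=14, replicas=1, nodes=28))
```

The first call gives 28 shard copies, one per node; the second gives 392 copies, or 14 per node.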

-- 
David Pilato | Technical Advocate | Elasticsearch.com



On March 7, 2013, at 14:31, Brian Harris <[hidden email]> wrote:

Two weeks to start.

On Mar 6, 2013 11:34 PM, "David Pilato" <[hidden email]> wrote:
How many days do you want to keep in your cluster?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On March 7, 2013, at 07:58, Brian Harris <[hidden email]> wrote:

Otis,

Using 28 nodes with 1 daily index, what would be the best shard and replication scheme?




Re: Elasticsearch client only cluster

beharris

Hmm, yes, I have 28 nodes in different racks.
A single shard can't handle the load.

On Mar 7, 2013 5:48 AM, "David Pilato" <[hidden email]> wrote:
2 weeks = 14 days.
Let's say you define one shard per day.
With a replica.

So you will have to store 28 shards (1 primary + 1 replica for each day).
You have 28 nodes? Cool. It will fit perfectly.

I think I would start with these numbers and see where it goes.
Main question is: can a single shard handle your daily logs?

If not, you will have to increase the number of shards per index and accept more than one shard (or replica) on a single box.
