|
I'm using Elasticsearch for Logstash and would like to create a client-only (30-node) cluster in which no shards are distributed and nothing is replicated.
I have tons of logs, so I have configured logging per rack, and I want to be able to connect to any ES instance to search all the cluster nodes. This setup has worked, but when nodes get restarted it seems to affect the cluster health, which causes ":exception=>org.elasticsearch.action.UnavailableShardsException".

My current settings per ES instance:

  unicast discovery
  shards=5
  replication=0

Current health:

  {
    "cluster_name" : "elasticsearch",
    "status" : "red",
    "timed_out" : false,
    "number_of_nodes" : 52,
    "number_of_data_nodes" : 26,
    "active_primary_shards" : 22,
    "active_shards" : 22,
    "relocating_shards" : 0,
    "initializing_shards" : 0,
    "unassigned_shards" : 3
  }

If I lose a few nodes, I don't want that to stop the rest from starting up or processing correctly. Am I missing any settings for a client-only cluster?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/groups/opt_out.
|
On Sun, 2013-02-24 at 12:44 -0800, [hidden email] wrote:
> I'm using elasticsearch for logstash and I would like to create a
> client only (30 node)cluster with 0 shards are distributed and no
> replication.

You need to have shards to index data. Shards are not just for replication. Without any shards, you can't index any data.

Do you mean that you don't want any replica shards?

You can control the number of primary shards that an index contains when you create the index, and the number_of_replicas can be updated at any time.

clint
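To make the distinction between primary shards and replicas concrete, here is a minimal sketch of the two request bodies involved. It only builds the JSON; the helper names are hypothetical, and the assumption is that the first body is sent when creating an index (PUT /<index>) and the second to that index's _settings endpoint later on.

```python
import json


def create_index_body(primary_shards: int, replicas: int) -> dict:
    """Settings body sent when an index is created.

    number_of_shards is fixed at creation time; number_of_replicas is not.
    """
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas,
            }
        }
    }


def update_replicas_body(replicas: int) -> dict:
    """Body for a later settings update that changes only the replica count."""
    return {"index": {"number_of_replicas": replicas}}


# The poster's current setup: 5 primary shards, no replicas.
print(json.dumps(create_index_body(5, 0), indent=2))
# Adding one replica per primary afterwards, without recreating the index.
print(json.dumps(update_replicas_body(1), indent=2))
```

The point of keeping the two bodies separate is that the primary shard count has to be decided up front, while the replica count can be raised once the cluster is stable.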
|
Sorry, I meant I would like to create a 30-node cluster, one node per rack, and ensure that the 1 shard stays within the rack.
Is that possible within Elasticsearch? The cluster would only be used to search one instance and look at all the data, without having data replicated or sharded between racks.
|
On Mon, 2013-02-25 at 10:09 -0800, [hidden email] wrote:
> Sorry I meant I would like to create a 30 node cluster one per rack
> and ensure that the 1 shard stays within the rack.
> Is that possible within elasticsearch?
>
> The cluster would be only used o search one instance and look at all
> data without having data get replicated or sharded between racks.

Sorry but I still have no idea what you're trying to achieve. One node per rack? You have 30 racks?

/me is lost

Perhaps some more detail...

clint
|
So I have 30 racks at a colo and have 1 ES instance per rack.
The 1 ES instance per rack is used to index all logs for that rack only. Is there any way for each instance to join a 30-node cluster as a (client-only) member and not replicate or shard data between them? The purpose of the client-only cluster would be to let me search one instance and have ES query all members for data.
|
Hiya
OK, the picture is slowly evolving :)

On Mon, 2013-02-25 at 16:23 -0800, [hidden email] wrote:
> So I have 30 racks at a colo and have 1 es instance per rack.
> The 1 es instance per rack is used to index all logs for that rack
> only.
>
> Is there anyway for each instance to join a 30 node cluster as a
> (client only) and not replicate or shard data between them?

A "client" in Elasticsearch terminology doesn't hold any data. Hence part of the confusion. I think what you're asking is: Can I have an index on a single node in the cluster?

The answer is yes: you can create 30 indices, and specify rack awareness for each index, so that each index sits in a single rack.

> The purpose of the client only cluster would allow me to search one
> instance and have es query all members for data.

Yes, you can connect to any node in the cluster and query one or more indices. It will forward queries to all relevant nodes.

Note: I don't recommend this setup. Especially with 30 nodes, the chances of one of them going down are pretty high. Hardware fails. With your current setup (especially if you don't have any replicas) you run a good chance of losing data.

Why not just use Elasticsearch as the distributed system that it is intended to be?

clint
|
I'm using Elasticsearch for Logstash and we generate tons of logs per rack.
I would like to keep the data local to the rack so the traffic doesn't need to cross switches. I don't have an immediate concern with failure or losing any data, so I'm OK with losing a node and can just reprocess the logs for now. If I use the distributed system, can I force data to stay local to the rack?
|
> If I use the distributed system, can I force data to stay local to the
> rack?

Yes. Look for rack awareness in the docs.
|
Great, I checked out the docs and came up with the following config to enable rack awareness for each instance. It would look something like this:

  node:
    name: node1
    rack_id: rack1
  cluster:
    name: elasticsearch
    routing:
      allocation:
        awareness:
          attributes: rack_id

Since Logstash creates an index per day (logstash-2013.02.26) per node, would I have to do anything special to make sure that an index and its shards are created for each node individually?

Currently only 5 shards are allocated in my shard=5 rep=0 cluster, which receives logs for 1k+ servers under low load. The goal is for all 30 nodes to have an index started and receiving logs for their own rack, in order to handle high load.

  >es shards -v
  index               shard pri/rep state   docs     size   bytes       node
  logstash-2013.02.26 0     p       STARTED 28031927 22.9gb 24677604861 n7
  logstash-2013.02.26 1     p       STARTED 26853297 22gb   23641399741 n18
  logstash-2013.02.26 2     p       STARTED 28035826 22.9gb 24686606451 n21
  logstash-2013.02.26 3     p       STARTED 28033599 22.9gb 24695469792 n24
  logstash-2013.02.26 4     p       STARTED 28037600 22.9gb 24687686161 n5

Any recommendations on how to configure Elasticsearch rack awareness and routing to handle this?
|
On Wed, 2013-02-27 at 00:16 -0800, B wrote:
> Great, I checked out the docs and can came up with following config to
> enable rack awareness for each instance that would look something like
> this.
>
> node:
>   name: node1
>   rack_id: rack1
> cluster:
>   name: elasticsearch
>   routing:
>     allocation:
>       awareness:
>         attributes: rack_id
>
> Since logstash creates an index per day(logstash-2013.02.26) per node,
> would I have to do anything special to make sure that an index and
> it's shards are created for each node individually?

If you want 30 indices on 30 different nodes, then you need to create each index with a different name, and set the allocation on each index to tie it to a single node.

http://www.elasticsearch.org/guide/reference/index-modules/allocation.html

For instance, you can use index templates to say: if the index name matches "node_1_*" then set index.routing.allocation.include.rack_id to "node_1"

http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html

clint
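As a rough illustration of the template idea above, the sketch below builds one template body per rack. Several things here are assumptions rather than anything stated in the thread: the helper name, the "logstash-rack1-*" index naming scheme, and the use of the "template" field for the name pattern (the form used by index templates in Elasticsearch releases of that era, sent to /_template/<name>). The rack_id value has to match whatever node.rack_id is set to on the node that should hold the index.

```python
import json


def rack_template_body(rack_id: str, index_prefix: str) -> dict:
    """Index template that pins every matching index to nodes with this rack_id.

    index_prefix is a hypothetical naming scheme, e.g. "logstash-rack1-".
    """
    return {
        # Name pattern: any index whose name starts with the prefix.
        "template": index_prefix + "*",
        "settings": {
            # Only allocate shards of matching indices on nodes whose
            # node.rack_id attribute equals rack_id.
            "index.routing.allocation.include.rack_id": rack_id,
        },
    }


# One template per rack; two racks shown to keep the output short.
for rack in ("rack1", "rack2"):
    body = rack_template_body(rack, f"logstash-{rack}-")
    print(json.dumps(body, indent=2))
```

For this to have any effect, each rack's Logstash instance would also need to write to an index name that carries its own rack prefix, so that the daily index matches exactly one template.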
|
OK, just playing around with 28 nodes, I set shards=28 and replicas=0.
I now have 28 shards spread across all of my nodes, which is balanced. This is definitely an alternative setup that will work, but I will still try to find out how to keep logs local to the rack using the templates you mentioned. Perhaps I don't need to keep logs per rack, as this seems to balance out the storage pretty well. What mechanism does Elasticsearch use to keep the data balanced across all nodes?

28 shards, all reported on all nodes:

  logstash-2013.02.28 27 p STARTED 2972868 2.5gb 2753226040 node4
  ..
  logstash-2013.02.28 0  p STARTED 2972863 2.5gb 2764377828 node19
|
Clinton, if I keep to the distributed model of a 28-node cluster with all nodes using the same index, would shards=14, repl=1 be a wise choice to keep all nodes doing something? Right now I'm testing shards=28 and repl=0, which is working great and the cluster is balanced in data and load, but I'm thinking long term about node failures and adding nodes in the future.
|
Hi,

0 replicas is risky. 1 node dies and you are missing ~1/28th of your data. So repl > 0 is definitely better.

Otis
--
ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html
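The ~1/28th figure follows directly from the layout discussed earlier in the thread. A small sketch of the arithmetic, assuming the data is spread evenly with one primary shard per node and no replicas (the function name is made up for illustration):

```python
from fractions import Fraction


def share_lost_without_replicas(nodes: int, failed: int = 1) -> Fraction:
    """Share of the data that becomes unavailable when `failed` nodes die.

    Assumes an even spread of primary shards across nodes and zero replicas,
    so every shard on a dead node has no surviving copy.
    """
    if not 0 <= failed <= nodes:
        raise ValueError("failed must be between 0 and nodes")
    return Fraction(failed, nodes)


share = share_lost_without_replicas(28)
print(f"{share} of the data, about {float(share):.1%}")
# prints: 1/28 of the data, about 3.6%
```

With at least one replica, a single node failure does not take data offline, because a replica is never allocated on the same node as its primary.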
|
Otis,
using 28 nodes with 1 daily index, what would be the best shard and replication scheme?
|
How many days do you want to keep in your cluster?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
|
Two weeks to start.
|
2 weeks = 14 days.
Let's say you define one shard per day, with a replica. You will then have to store 28 shards (1 primary + 1 replica for each day). You have 28 nodes? Cool. It will fit perfectly.

I think I would start with these numbers and see where it goes. The main question is: can a single shard handle your daily logs? If not, you will have to adjust the number of shards per index, and have more than one shard (or replica) on a single box.
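The sizing above is simple enough to check in a few lines. This is only a sketch of the arithmetic (function names are hypothetical); it says nothing about whether a single shard can actually absorb a day's worth of logs, which is the open question raised above.

```python
def total_shard_copies(days: int, primaries_per_day: int, replicas: int) -> int:
    """Shard copies the cluster must hold for a given retention window.

    Each daily index has primaries_per_day primaries, and every primary
    has `replicas` additional copies.
    """
    return days * primaries_per_day * (1 + replicas)


def copies_per_node(total_copies: int, nodes: int) -> float:
    """Average number of shard copies per node if spread evenly."""
    return total_copies / nodes


# One primary per day, one replica, 14 days retained, 28 nodes.
total = total_shard_copies(days=14, primaries_per_day=1, replicas=1)
print(total, copies_per_node(total, nodes=28))  # 28 copies, 1.0 per node

# If one shard per day is too small, e.g. 14 primaries per day with 1 replica:
total = total_shard_copies(days=14, primaries_per_day=14, replicas=1)
print(total, copies_per_node(total, nodes=28))  # 392 copies, 14.0 per node
```

The second case shows the trade-off: more primaries per daily index spread each day's indexing load over more nodes, at the cost of every node holding many more shard copies over the retention window.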
|
Hmm, yes, I have 28 nodes in different racks.
