using RAID 0 vs multiple data paths after commit #10461

Scott-3
Hello all,
I currently have 2 servers, each with 128 GB RAM, 2 HDDs in RAID 1, and 10 SSDs. I was instructed to build an ELK stack on these servers, using the SSDs as separate disks and pointing ES at each one through the path.data setting. The reasoning behind this was to avoid the speed hit from the RAID controller and to gain flexibility. My plan is for one server to run a single ES instance with 31 GB RAM, leaving the rest of the memory for Kibana, Logstash, and Redis, and for the second server to run 2 ES instances with 31 GB each.

When I researched RAID versus multiple data paths, everybody seemed to recommend not using RAID (or using RAID 0), but I fail to see how multiple data paths give flexibility. Up until today, I was under the impression that ES stripes the data across the paths anyway, so it would behave like RAID 0 (except with a lot more work on the fstab side of things). If one disk went down, you'd lose the whole node, since ES didn't care where it placed the data.

Now, however, I have read commit #10461 on GitHub, and it seems to indicate that the code was changed to use a single path for each shard? If that's the case, and I have 3 shards + 1 replica each (because we have 3 nodes), how does this utilize all 10 SSDs? If this is NOT the case, the only real data redundancy and resiliency is still via the replicas, correct? So it doesn't really matter whether I use RAID 0 or multiple data paths?

Can anybody shed some light on this issue? I appreciate any and all help!
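To make the disk-utilization question concrete, here is a rough arithmetic sketch in Python. It assumes the reading of commit #10461 given in the question is right (each shard copy lives entirely on one data path) and that shard copies are balanced evenly across nodes. The function name and the 15-index figure are hypothetical, chosen only for illustration; none of this is Elasticsearch's actual allocation code.

```python
import math


def paths_in_use_per_node(primaries: int, replicas_per_primary: int,
                          nodes: int, paths_per_node: int,
                          indices: int = 1) -> int:
    """Upper bound on the data paths one node can fill with shard data.

    Assumes each shard copy sits on exactly one path and that copies are
    spread evenly over the nodes (both are simplifying assumptions).
    """
    copies_per_index = primaries * (1 + replicas_per_primary)
    copies_per_node = math.ceil(copies_per_index * indices / nodes)
    # A node cannot use more paths than it has, nor more than it has shards.
    return min(copies_per_node, paths_per_node)


# One index with 3 primaries + 1 replica each, on 3 nodes with 10 SSDs each.
print(paths_in_use_per_node(3, 1, nodes=3, paths_per_node=10))

# Hypothetical: 15 such indices (e.g. daily Logstash indices) on the cluster.
print(paths_in_use_per_node(3, 1, nodes=3, paths_per_node=10, indices=15))
```

Under these assumptions a single index keeps only 2 of a node's 10 SSDs busy, while a larger number of indices can spread over all of them.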

--
Please update your bookmarks! We have moved to https://discuss.elastic.co/
---
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/94434290-c458-45d5-b46e-c09445b5b3dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Re: using RAID 0 vs multiple data paths after commit #10461

Nikolas Everett
I'm RAID 0 all the way. The striping is much more complete than ES's path.data, and operations is more used to the tooling around it. Software RAID in Linux is fine for this. We only put two disks in RAID 0, though, because we don't like the increased chance of failure, so 10 in RAID 0 is a bit much. Ten disks isn't where our sweet spot is for ES, so I haven't thought about it.
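The "increased failure chance" can be put into numbers with a small Python sketch. A RAID 0 array is lost as soon as any one member disk fails, so with n disks that fail independently with probability p over some period, the array fails with probability 1 - (1 - p)^n. The 2% per-disk figure below is an assumed value for illustration only, not a measured failure rate, and independence is itself a simplification.

```python
def raid0_failure_probability(disk_failure_prob: float, disks: int) -> float:
    """Probability that a RAID 0 array loses data in a given period.

    The array survives only if every member disk survives, so the failure
    probability is 1 - (1 - p) ** n. Assumes independent disk failures.
    """
    if not 0.0 <= disk_failure_prob <= 1.0:
        raise ValueError("disk_failure_prob must be between 0 and 1")
    if disks < 1:
        raise ValueError("disks must be at least 1")
    return 1.0 - (1.0 - disk_failure_prob) ** disks


if __name__ == "__main__":
    ASSUMED_DISK_FAILURE_PROB = 0.02  # hypothetical per-disk value
    for n in (1, 2, 10):
        p = raid0_failure_probability(ASSUMED_DISK_FAILURE_PROB, n)
        print(f"{n:2d} disks in RAID 0: {p:.1%} chance of losing the array")
```

With replicas on other nodes, losing an array means re-replicating that node's shards rather than losing data outright, but the wider the stripe, the more often that recovery has to happen.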

On Mon, May 18, 2015 at 11:20 AM, Scott <[hidden email]> wrote:
[original message quoted in full, identical to the first post in this thread]
