tiering storage / Curator

tiering storage / Curator

Patrick Proniewski
Hello,

Curator makes it possible to migrate an index to another storage programmatically, which is very nice for keeping old indices on cheap storage. But if I understand correctly, a single ES cluster cannot handle two different kinds of storage. Hence, having small but fast storage for recent files and cheap but slow storage for old files requires building two clusters.
Am I right?

thanks,
Patrick

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/487D9125-FC9B-43F9-B714-9C4EA2556A47%40patpro.net.
For more options, visit https://groups.google.com/d/optout.

Re: tiering storage / Curator

Mark Walkom
Nope, you can use allocation awareness to have indexes on different machines - http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html
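A minimal sketch of how the allocation settings on that page could be used for tiering, assuming each node declares a custom attribute in its elasticsearch.yml (for example `node.tag: fast` or `node.tag: slow`; the attribute name `tag` and its values are only illustrative). The helper just builds the JSON body that would go with a `PUT /<index>/_settings` request; it does not talk to a cluster.

```python
import json


def allocation_settings(tier, attribute="tag"):
    """Build an index settings body that pins shards to nodes of one tier.

    `attribute` and `tier` are assumptions: they must match a custom
    attribute set on the nodes (e.g. `node.tag: slow` in elasticsearch.yml).
    """
    key = "index.routing.allocation.require.%s" % attribute
    return {key: tier}


if __name__ == "__main__":
    # Body for: PUT /<index-name>/_settings
    print(json.dumps(allocation_settings("slow")))
```

Applying the `slow` body to an old index should cause its shards to be relocated onto the nodes tagged `slow`, while new indices can be created with the `fast` value.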

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


Re: tiering storage / Curator

Patrick Proniewski
Ok, so if I understand correctly I can have a single cluster with:

- machine A: fast storage (recent data)
- machine B & C: slow storage (old data)

In that case, I cannot have a homogeneous cluster with both fast and slow storage on each node and I'm losing the benefit of having multiple machines when I index new data and when I search recent data. Is that correct?

Regards,
Patrick

Re: tiering storage / Curator

Mark Walkom
You cannot have multiple data.paths on a single node/instance. You could try running multiple instances of ES on a single physical machine, each pointing to one of your tiered pools.
But you aren't losing the benefit of multiple nodes, just the optimal use of your storage on those physical nodes.
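A rough sketch of the multiple-instances idea, assuming two ES instances per physical machine, each with its own data path and a custom `node.tag` attribute. The node names, the `tag` attribute and the `/ssd` and `/hdd` paths are all hypothetical. The helper only renders the lines that would go into each instance's elasticsearch.yml.

```python
def instance_config(name, tier, data_path):
    """Return elasticsearch.yml lines for one instance on a shared host.

    All three values are placeholders chosen for illustration.
    """
    return "\n".join([
        "node.name: %s" % name,
        "node.tag: %s" % tier,
        "path.data: %s" % data_path,
    ])


if __name__ == "__main__":
    # One instance per storage pool on the same physical machine.
    print(instance_config("host1-fast", "fast", "/ssd/es-data"))
    print()
    print(instance_config("host1-slow", "slow", "/hdd/es-data"))
```

With a layout like this, every physical machine takes part in both tiers, so indexing and searching recent data is still spread over all machines.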

You could look at something like L2ARC or similar though.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


Re: tiering storage / Curator

Patrick Proniewski
It seems I can have multiple path.data on a single node, but it does not allow for storage tiering:

# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favouring locations with most free
# space on creation. For example:
#
# path.data: /path/to/data1,/path/to/data2

ES seems to be quite close to being able to provide storage tiering... Maybe in 1.4? ;)


Re: tiering storage / Curator

Mark Walkom
There you go, I didn't know it did that!

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


Re: tiering storage / Curator

Otis Gospodnetic
In reply to this post by Patrick Proniewski
Hi,

On Tuesday, July 15, 2014 1:20:39 AM UTC-4, Patrick Proniewski wrote:
> Hello,
>
> Curator makes it possible to migrate an index to another storage programmatically, and that's very nice to keep old indices on cheap storage. But if I understand correctly, a unique ES cluster cannot handle two different storages. Hence, having small but fast storage for recent files and cheap but slow storage for old files requires building two clusters.
> Am I right?

Not necessarily. We used the tiered storage approach in Logsene, for example, but we explicitly move older indexes from the more expensive nodes that deal with fresh data to cheaper nodes that host all data. It's automated, but it's not done 100% within ES. It is done with a single ES cluster, though.
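To illustrate the kind of automation described here (this is not the Logsene implementation, just a sketch under assumptions): suppose indices are daily and named `logstash-YYYY.MM.DD`, and nodes carry a custom `tag` attribute with the values `fast` and `slow`. An external job could then pick the indices older than some cutoff and prepare a settings update for each one. The function names and the 7-day cutoff below are made up for the example, and nothing is sent to a cluster.

```python
from datetime import date, datetime


def indices_to_demote(index_names, today, max_age_days, prefix="logstash-"):
    """Return the daily indices older than max_age_days, oldest first."""
    old = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index, leave it alone
        if (today - day).days > max_age_days:
            old.append((day, name))
    return [name for _, name in sorted(old)]


def demotion_requests(index_names, today, max_age_days):
    """Pair each old index with a settings body that requires slow nodes."""
    body = {"index.routing.allocation.require.tag": "slow"}
    return [(name, dict(body))
            for name in indices_to_demote(index_names, today, max_age_days)]


if __name__ == "__main__":
    names = ["logstash-2014.07.01", "logstash-2014.07.14", "kibana-int"]
    for name, body in demotion_requests(names, date(2014, 7, 15), 7):
        # Each pair corresponds to: PUT /<name>/_settings with <body>
        print(name, body)
```

Run from cron once a day, a job like this keeps only the most recent week of indices on the fast nodes, all within one cluster.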

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
 
