Any experience with ES and Data Compressing Filesystems?

12 messages

Any experience with ES and Data Compressing Filesystems?

horst knete
Hey Guys,

To save a lot of disk space, we are going to use a compressing file system, which gives us transparent compression for the ES indices. (ES indices seem to compress very well; we got up to a 65% compression rate in some tests.)

Currently the indices reside on an ext4 Linux filesystem, which unfortunately does not support transparent compression.

Does anyone have experience with compressing file systems like BTRFS or ZFS/OpenZFS, and can you tell us whether this led to big performance losses?

Thanks for responding

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6c6c806a-f638-4139-a080-3da7670f0eca%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Any experience with ES and Data Compressing Filesystems?

Mark Walkom
There are a few previous threads on this topic in the archives, though unfortunately I don't recall seeing any performance metrics in them.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


On 16 July 2014 20:56, horst knete <[hidden email]> wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

joergprante@gmail.com
In reply to this post by horst knete
You will not gain much, because ES already compresses data on disk with LZF, and ZFS uses LZ4, whose compression output is quite similar. The file system statistics will show you the compression ratio, and it will not be a good value. So instead of having ZFS try to compress where not much can be gained, you should switch it off.
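
The general point, that a second compression pass over already-compressed data gains little, can be sketched as follows. This is only an illustration under loud assumptions: `zlib` stands in for LZF/LZ4 (neither ships with Python's standard library), and the synthetic log-like lines from the hypothetical `make_log_lines` helper stand in for real index data.

```python
import random
import zlib


def make_log_lines(n: int, seed: int = 0) -> bytes:
    """Build n synthetic, log-like lines (a hypothetical stand-in for index data)."""
    rng = random.Random(seed)
    lines = []
    for i in range(n):
        ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
        status = rng.choice([200, 200, 200, 304, 404, 500])
        size = rng.randint(100, 50_000)
        lines.append(f'{ip} - - "GET /index/{i % 97} HTTP/1.1" {status} {size}\n')
    return "".join(lines).encode("ascii")


def ratio(original: bytes, compressed: bytes) -> float:
    """Compression ratio: input size divided by output size."""
    return len(original) / len(compressed)


raw = make_log_lines(20_000)
once = zlib.compress(raw)    # stands in for the application-level pass
twice = zlib.compress(once)  # stands in for the filesystem-level pass

first_pass = ratio(raw, once)
second_pass = ratio(once, twice)
print(f"first pass:  {first_pass:.2f}x")
print(f"second pass: {second_pass:.2f}x")
```

On data like this the first pass should shrink the input noticeably, while the second pass should stay close to 1x. Real index files mix compressed and uncompressed structures, so actual results can differ.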

Jörg


On Wed, Jul 16, 2014 at 12:56 PM, horst knete <[hidden email]> wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

joergprante@gmail.com
Oops, not true: Elasticsearch uses Lucene codec compression, and this is also LZ4 (LZF is kept only for backwards compatibility).

Here are some numbers:


Jörg


On Wed, Jul 16, 2014 at 2:28 PM, [hidden email] <[hidden email]> wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

Otis Gospodnetic
In reply to this post by horst knete
Hi Horst,

I wouldn't bother with this for the reasons Joerg mentioned, but should you try it anyway, I'd love to hear your findings/observations.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

horst knete
Hey guys,

We have mounted a btrfs file system with the compression method "zlib" for testing purposes on our Elasticsearch server and copied one of the indices onto the btrfs volume. Unfortunately it had no effect; the index still has a size of 50 GB :/

I will try other compression methods as well and report back here.
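
When checking a result like this, it can help to compare the logical size of the copied files with how much the volume's used space actually grew, because some tools report the logical (uncompressed) size of files even on a compressing filesystem. A minimal sketch of that comparison, assuming the copy is the only write activity on the volume in the meantime; the helper names and any paths passed in are hypothetical:

```python
import os
import shutil


def apparent_size(path: str) -> int:
    """Sum of the logical file sizes under path, in bytes."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def used_bytes(mount_point: str) -> int:
    """Bytes currently in use on the filesystem that contains mount_point."""
    return shutil.disk_usage(mount_point).used


def effective_ratio(logical_bytes: int, used_before: int, used_after: int) -> float:
    """Logical bytes copied divided by the growth in used space on the volume."""
    grown = used_after - used_before
    if grown <= 0:
        raise ValueError("volume usage did not grow; nothing to compare")
    return logical_bytes / grown
```

The intended use is to call `used_bytes` on the mount point before the copy, copy the index, flush pending writes, call `used_bytes` again, and pass both values together with `apparent_size` of the copied directory to `effective_ratio`. A value near 1.0 would suggest the filesystem saved little space.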

On Saturday, 19 July 2014 07:21:20 UTC+2, Otis Gospodnetic wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

Patrick Proniewski
Hi,

gzip/zlib compression is very bad for performance, so it may be worthwhile for closed indices, but I would not recommend it for live data.
Also, you should know that:

LZ4 compression is already enabled in the indices,
ES/Lucene/Java usually read and write 4k blocks,

-> hence, compression is applied to 4k blocks. If your filesystem uses 4k blocks and you add FS compression, you will probably see a very small gain, if any. I've tried it on ZFS:

Filesystem             Size    Used   Avail Capacity  Mounted on
zdata/ES-lz4           1.1T    1.9G    1.1T     0%    /zdata/ES-lz4
zdata/ES               1.1T    1.9G    1.1T     0%    /zdata/ES

If you are using a larger block size, like 128k, a compressed filesystem does show some benefit:

Filesystem             Size    Used   Avail Capacity  Mounted on
zdata/ES-lz4           1.1T    1.1G    1.1T     0%    /zdata/ES-lz4 -> compressratio  1.73x
zdata/ES-gzip          1.1T    901M    1.1T     0%    /zdata/ES-gzip -> compressratio  2.27x
zdata/ES               1.1T    1.9G    1.1T     0%    /zdata/ES
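
One general reason a larger block size helps is that a compressor working on small, independent blocks sees less redundancy per block. The sketch below illustrates only that effect, under stated assumptions: `zlib` is used for every block, the input is synthetic log-like text from the hypothetical `sample_data` helper, and nothing here models how ZFS allocates records on disk.

```python
import random
import zlib


def sample_data(n_lines: int, seed: int = 1) -> bytes:
    """Synthetic, log-like text (a hypothetical stand-in for real index files)."""
    rng = random.Random(seed)
    hosts = ["web01", "web02", "db01", "fw01"]
    lines = []
    for i in range(n_lines):
        host = rng.choice(hosts)
        latency = rng.randint(1, 2000)
        user = rng.randint(1000, 9999)
        lines.append(f"2014-07-21T08:48:12 {host} request id={i} user={user} took={latency}ms\n")
    return "".join(lines).encode("ascii")


def chunked_ratio(data: bytes, block_size: int) -> float:
    """Compress each block independently; return total input / total output."""
    compressed = 0
    for start in range(0, len(data), block_size):
        compressed += len(zlib.compress(data[start:start + block_size]))
    return len(data) / compressed


data = sample_data(20_000)
ratio_4k = chunked_ratio(data, 4 * 1024)
ratio_128k = chunked_ratio(data, 128 * 1024)
print(f"4k blocks:   {ratio_4k:.2f}x")
print(f"128k blocks: {ratio_128k:.2f}x")
```

On this kind of input the 128k blocks should compress better than the 4k blocks. The exact figures depend on the data and the compressor, so they are not comparable to the `compressratio` values in the table above.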

But a file system block larger than 4k is very suboptimal for I/O (ES reads or writes one 4k block -> your FS must read or write a 128k block).
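
The cost described here can be put into numbers with a small worst-case calculation. It assumes that every application request is aligned to a filesystem block and forces whole blocks to be read from disk, with no caching in between; the function name is hypothetical.

```python
def read_amplification(fs_block_bytes: int, request_bytes: int) -> float:
    """Bytes the filesystem must read per byte the application asked for,
    assuming an aligned request that always pulls in whole filesystem blocks."""
    if fs_block_bytes <= 0 or request_bytes <= 0:
        raise ValueError("sizes must be positive")
    blocks_touched = -(-request_bytes // fs_block_bytes)  # ceiling division
    return blocks_touched * fs_block_bytes / request_bytes


KIB = 1024
for fs_block in (4 * KIB, 32 * KIB, 128 * KIB):
    factor = read_amplification(fs_block, 4 * KIB)
    print(f"{fs_block // KIB:>3}k filesystem block, 4k request: {factor:.0f}x")
```

For a 4k request this gives 1x with 4k blocks, 8x with 32k blocks and 32x with 128k blocks. Caching and sequential access patterns can make the real overhead much smaller than this worst case.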

On 21 Jul 2014, at 07:58, horst knete <[hidden email]> wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

horst knete
Hi again,

a quick report regarding compression:

We are now using a 3 TB btrfs volume with a 32k block size, which reduced the amount of data from 3.2 TB to 1.1 TB without any significant performance losses (we are using a machine with 8 CPUs and 20 GB of memory, with an iSCSI link to the volume).

So for us, I can only recommend using a btrfs volume for long-term storage.
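
For reference, the reported sizes translate into a compression ratio and a space saving as follows. This is plain arithmetic on the two figures from this post; the function names are hypothetical.

```python
def compression_ratio(before: float, after: float) -> float:
    """Original size divided by compressed size (same unit for both)."""
    return before / after


def space_saved(before: float, after: float) -> float:
    """Fraction of the original size that no longer needs to be stored."""
    return 1.0 - after / before


before_tb, after_tb = 3.2, 1.1  # sizes reported in the post, in TB
print(f"ratio: {compression_ratio(before_tb, after_tb):.2f}x")
print(f"saved: {space_saved(before_tb, after_tb):.1%}")
```

That works out to a ratio of about 2.9x, or roughly two thirds of the space saved, which is in line with the 65% seen in the tests mentioned in the first post.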

On Monday, 21 July 2014 08:48:12 UTC+2, Patrick Proniewski wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

Mark Walkom
What sort of data are you indexing? When you said performance impact was minimal, how minimal and at what points are you seeing it?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


On 4 August 2014 16:43, horst knete <[hidden email]> wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

horst knete
We are indexing all sorts of events (Windows, Linux, Apache, Netflow and so on), and we measure the impact by the speed of the Kibana GUI, i.e. how long it takes to load 7 or 14 days of data. That's what is important to my colleagues.


On Monday, 4 August 2014 10:52:25 UTC+2, Mark Walkom wrote:
> [...]

Re: Any experience with ES and Data Compressing Filesystems?

joergprante@gmail.com
Are you aware that the kind of search performance you mean
depends on the RAM and virtual memory organization of the cluster, not on
storage, so "without any significant performance losses" is to be expected?

Jörg

Am 04.08.14 12:41, schrieb horst knete:

> We are indexing all sort of events (Windows, Linux, Apache, Netflow
> and so on...) and impact is defined in speed of the Kibana GUI / how
> long it takes to load 7 or 14 days of data. Thats what is important
> for my colleagues.
>
>
> Am Montag, 4. August 2014 10:52:25 UTC+2 schrieb Mark Walkom:
>
>     What sort of data are you indexing? When you said performance
>     impact was minimal, how minimal and at what points are you seeing it?
>
>     Regards,
>     Mark Walkom
>
>     Infrastructure Engineer
>     Campaign Monitor
>     email: [hidden email]
>     web: www.campaignmonitor.com <http://www.campaignmonitor.com>
>
>
>     On 4 August 2014 16:43, horst knete <[hidden email]> wrote:
>
>         Hi again,
>
>         a quick report regarding compression:
>
>         we are using a 3-TB btrfs-volume with 32k block size now which
>         reduced the amount of data from 3,2 TB to 1,1TB without any
>         segnificant performance losses ( we are using a 8 CPU, 20 GB
>         Memory machine with an iSCSI.Link to the volume ).
>
>         So for us i can only suggest using the btrfs-volume for long
>         term storage.
>
>         Am Montag, 21. Juli 2014 08:48:12 UTC+2 schrieb Patrick
>         Proniewski:
>
>             Hi,
>
>             gzip/zlib compression is very bad for performance, so it
>             can be interesting for closed indices, but for live data I
>             would not recommend it.
>             Also, you must know that:
>
>             Compression using lz4 is already enabled into indices,
>             ES/Lucene/Java usually read&write 4k blocks,
>
>             -> hence, compression is achieved on 4k blocks. If your
>             filesystem uses 4k blocks and you add FS compression, you
>             will probably have a very small gain, if any. I've tried
>             on ZFS:
>
>             Filesystem             Size    Used   Avail Capacity
>              Mounted on
>             zdata/ES-lz4           1.1T    1.9G    1.1T 0%  
>              /zdata/ES-lz4
>             zdata/ES               1.1T    1.9G    1.1T 0%    /zdata/ES
>
>             If you are using a larger block size, like 128k, a
>             compressed filesystem does show some benefit:
>
>             Filesystem      Size    Used   Avail  Capacity  Mounted on
>             zdata/ES-lz4    1.1T    1.1G    1.1T     0%     /zdata/ES-lz4    -> compressratio  1.73x
>             zdata/ES-gzip   1.1T    901M    1.1T     0%     /zdata/ES-gzip   -> compressratio  2.27x
>             zdata/ES        1.1T    1.9G    1.1T     0%     /zdata/ES
>
>             But a file system block larger than 4k is very suboptimal
>             for I/O (ES reads or writes one 4k block -> your FS must
>             read or write a 128k block).
>
>             On 21 Jul 2014, at 07:58, horst knete <[hidden email]> wrote:
>
>             > Hey guys,
>             >
>             > we have mounted a btrfs file system with the compression
>             > method "zlib" for testing purposes on our Elasticsearch
>             > server and copied one of the indices onto the btrfs volume.
>             > Unfortunately it had no effect: the index still has a size
>             > of 50 GB :/
>             >
>             > I will try other compression methods next and report back
>             > here.
>             >
>             > On Saturday, 19 July 2014 07:21:20 UTC+2, Otis Gospodnetic wrote:
>             >>
>             >> Hi Horst,
>             >>
>             >> I wouldn't bother with this for the reasons Joerg
>             >> mentioned, but should you try it anyway, I'd love to hear
>             >> your findings/observations.
>             >>
>             >> Otis
>             >> --
>             >> Performance Monitoring * Log Analytics * Search Analytics
>             >> Solr & Elasticsearch Support * http://sematext.com/
>             >>
>             >>
>             >>
>             >> On Wednesday, July 16, 2014 6:56:36 AM UTC-4, horst knete wrote:
>             >>>
>             >>> Hey Guys,
>             >>>
>             >>> to save a lot of hard disk space, we are going to use a
>             >>> compressing file system, which gives us transparent
>             >>> compression for the ES indices. (It seems that ES indices
>             >>> compress very well; we got up to a 65% compression rate
>             >>> in some tests.)
>             >>>
>             >>> Currently the indices reside on an ext4 Linux filesystem,
>             >>> which unfortunately lacks transparent compression.
>             >>>
>             >>> Does anyone have experience with compressing file systems
>             >>> like BTRFS or ZFS/OpenZFS and can tell us whether this
>             >>> led to big performance losses?
>             >>>
>             >>> Thanks for responding
>


Re: Any experience with ES and Data Compressing Filesystems?

horst knete
Hi,

I wasn't really aware that this would mainly lead to higher CPU and RAM usage, but yes, the CPU load has indeed increased by about 20-30% compared to not compressing the storage. RAM usage did not increase much.

IMHO a somewhat higher CPU load is definitely worth it if you can save about 60% of your hard disk space; every financial manager would agree.
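
For anyone who wants to redo the arithmetic with their own numbers, here is a minimal Python sketch. It only uses the sizes reported in this thread (3.2 TB -> 1.1 TB on btrfs, 1.9G -> 1.1G on ZFS with lz4). The function names are made up for illustration, and the read amplification figure is a simplified worst case that assumes every 4k request forces a read of one whole filesystem block, as described earlier in the thread; real filesystems and caches may behave better.

```python
def compression_ratio(uncompressed: float, compressed: float) -> float:
    """Return uncompressed / compressed size (both given in the same unit)."""
    if uncompressed <= 0 or compressed <= 0:
        raise ValueError("sizes must be positive")
    return uncompressed / compressed


def space_saved(uncompressed: float, compressed: float) -> float:
    """Return the fraction of disk space saved, between 0 and 1."""
    return 1.0 - 1.0 / compression_ratio(uncompressed, compressed)


def read_amplification(fs_block_kib: float, request_kib: float = 4) -> float:
    """Simplified worst case: one request reads one whole filesystem block."""
    if fs_block_kib <= 0 or request_kib <= 0:
        raise ValueError("sizes must be positive")
    return max(1.0, fs_block_kib / request_kib)


if __name__ == "__main__":
    # btrfs figures from this thread: 3.2 TB before, 1.1 TB after
    print(f"btrfs ratio: {compression_ratio(3.2, 1.1):.2f}x")
    print(f"btrfs saved: {space_saved(3.2, 1.1):.1%}")
    # ZFS lz4 figures from this thread: 1.9G before, 1.1G after
    print(f"zfs lz4 ratio: {compression_ratio(1.9, 1.1):.2f}x")
    # 4k requests against 32k (btrfs here) and 128k (ZFS test) blocks
    print(f"32k blocks:  {read_amplification(32):.0f}x")
    print(f"128k blocks: {read_amplification(128):.0f}x")
```

The ratio computed from the rounded df output will not always match what the filesystem itself reports as compressratio, because df rounds the used space.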

On Monday, 4 August 2014 14:06:47 UTC+2, Jörg Prante wrote:
You are aware that the kind of search performance you mean
depends on the RAM and virtual memory organization of the cluster, not on
storage, so "without any significant performance losses" could be expected?

Jörg

On 04.08.14 12:41, horst knete wrote:

> We are indexing all sorts of events (Windows, Linux, Apache, Netflow
> and so on...) and impact is defined as the speed of the Kibana GUI / how
> long it takes to load 7 or 14 days of data. That's what is important
> for my colleagues.
>
>
> On Monday, 4 August 2014 10:52:25 UTC+2, Mark Walkom wrote:
>
>     What sort of data are you indexing? When you said performance
>     impact was minimal, how minimal and at what points are you seeing it?