Some interesting storage numbers for people interested.

Some interesting storage numbers for people interested.

John Smith
I don't run a blog but I thought I would share some results with the community.

Using Elasticsearch 1.4.3

I wanted to test the various ways we could save storage on our ES index, and here are some numbers.

I created 6 indexes, each with a different combination of mapping settings.
Each index contains 4 types.
I inserted 100,000 documents per type, for a total of 400,000 per index.
The average document size is 300-400 bytes.
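
For anyone wanting to reproduce a comparison like this, the six index variants could be expressed as index-creation bodies along the following lines. This is only a sketch: the type name `event`, the two fields, and the variant names are hypothetical (the post does not list its actual types or fields). The layout assumes the ES 1.x mapping format, where `_source` and `_all` are toggled per type with `enabled` and `store` is set per field.

```python
import json


def build_mapping(source_enabled=True, all_enabled=True, store_fields=False):
    """Build an index-creation body for one hypothetical type named 'event'."""
    # Hypothetical fields; the original post does not say which fields it used.
    properties = {
        "title": {"type": "string"},
        "created": {"type": "date"},
    }
    if store_fields:
        # Store every field individually, as in the "store: true (all fields)" runs.
        for field in properties.values():
            field["store"] = True
    return {
        "mappings": {
            "event": {
                "_source": {"enabled": source_enabled},
                "_all": {"enabled": all_enabled},
                "properties": properties,
            }
        }
    }


# One body per index in the comparison (the names are made up for this sketch).
variants = {
    "source_on": build_mapping(),
    "source_on_all_off": build_mapping(all_enabled=False),
    "source_off": build_mapping(source_enabled=False),
    "source_off_all_off": build_mapping(source_enabled=False, all_enabled=False),
    "source_off_stored": build_mapping(source_enabled=False, store_fields=True),
    "source_off_stored_all_off": build_mapping(
        source_enabled=False, all_enabled=False, store_fields=True
    ),
}

print(json.dumps(variants["source_off_all_off"], indent=2))
```

Each body would be sent when creating the corresponding index, before any documents are inserted.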

The values are the total primary storage used by each index under the given mapping settings.

_source: true = 45MB 
_source: true, _all: false = 34MB
_source: false = 30MB
_source: false, _all: false = 18MB
_source: false, store: true (all fields) = 39.5MB
_source: false, store: true (all fields), _all: false = 28.5MB

As you can see, the default settings (_source and _all both enabled) take the most space, while disabling both _source and _all saves the most.
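
To put the numbers in relative terms, the short sketch below computes how much smaller each index is than the default one. The sizes are copied from the list above; the helper name `saving_percent` is just for illustration.

```python
# Index sizes in MB, as reported in the post.
sizes_mb = {
    "_source: true": 45.0,
    "_source: true, _all: false": 34.0,
    "_source: false": 30.0,
    "_source: false, _all: false": 18.0,
    "_source: false, store: true (all fields)": 39.5,
    "_source: false, store: true (all fields), _all: false": 28.5,
}

baseline_mb = sizes_mb["_source: true"]  # default settings


def saving_percent(size_mb, baseline=baseline_mb):
    """Percentage by which an index is smaller than the baseline index."""
    return (baseline - size_mb) / baseline * 100.0


savings = {name: round(saving_percent(size), 1) for name, size in sizes_mb.items()}

for name, pct in savings.items():
    print(f"{name}: {sizes_mb[name]} MB, {pct}% smaller than the default")
```

On these figures, turning off _all alone saves about 24%, turning off _source alone about 33%, and turning off both 60%. Disabling _all removes 11-12 MB in each of the three pairs.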



--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/423ea99b-b9f2-4551-bb0c-d0167ed52150%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Some interesting storage numbers for people interested.

Jack Park-2
What is lost (the tradeoff) when _source is disabled?
What is lost when _all is disabled?

This is interesting!

Thanks
Jack


On Mon, Feb 23, 2015 at 12:10 PM, John Smith <[hidden email]> wrote:
[...]

Re: Some interesting storage numbers for people interested.

Mark Walkom-2
Thanks John, this is a really interesting test.


If you have no _source you cannot reindex or view the actual raw content that was sent to ES, only the analysed portions you keep.
No _all means you have to know the exact field you want to search on or else you may get no results, as ES will search _all by default (think of it as a shortcut search field).
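
The _all tradeoff shows up in the search request bodies themselves. The sketch below assumes ES 1.x defaults, where a `query_string` query that names no field falls back to `_all`. The field name `title` and the two helper functions are hypothetical.

```python
import json


def search_any_field(text):
    """Query body that names no field, so it relies on the default field (_all)."""
    return {"query": {"query_string": {"query": text}}}


def search_one_field(field, text):
    """Query body that names the field explicitly, so it does not need _all."""
    return {"query": {"match": {field: text}}}


shortcut = search_any_field("storage")
explicit = search_one_field("title", "storage")

print(json.dumps(shortcut))
print(json.dumps(explicit))
```

With _all disabled, a request shaped like the first one has nothing to search against, so it may return no hits, while the second keeps working as long as the caller knows which field to ask for.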


As an aside, we are working on adding a new compression algorithm for ES which will also improve storage capacity.

On 24 February 2015 at 07:27, Jack Park <[hidden email]> wrote:
[...]

Re: Some interesting storage numbers for people interested.

Jack Park-2
Thank you very much, Mark.

On Mon, Feb 23, 2015 at 12:54 PM, Mark Walkom <[hidden email]> wrote:
[...]

Re: Some interesting storage numbers for people interested.

John Smith
Yeah, sorry, I should have mentioned the tradeoffs.

I think there are some interesting use cases here. For instance, if you are building a pure analytics dashboard where it's 100% aggregations, then you can save a lot of space with _source: false, _all: false.

In my case I'm opting for _source: true, _all: false, since I need to be able to re-index documents but don't care about searching across all fields. My users are required to pick the field they want to search from a drop-down, so it's worth it for the roughly 25% saving.
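
For the aggregation-only dashboard case, a request could look roughly like the sketch below. It asks for buckets only and no hits, so nothing needs to be read back from _source. The field name `status` and the helper are hypothetical, and the body assumes the `aggs` / `terms` syntax available in ES 1.x.

```python
import json


def terms_aggregation(field, bucket_count=10):
    """Search body that returns only aggregation buckets, never documents."""
    return {
        "size": 0,  # no hits requested, only the aggregation result
        "aggs": {
            "by_" + field: {
                "terms": {"field": field, "size": bucket_count}
            }
        },
    }


body = terms_aggregation("status", bucket_count=5)
print(json.dumps(body, indent=2))
```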

On Monday, 23 February 2015 17:04:35 UTC-5, Jack Park wrote:
[...]

Re: Some interesting storage numbers for people interested.

John Smith
Mark, when you say you cannot re-index the document, do you mean re-index within the cluster? If we resubmit the document using the index API, it will get re-indexed and updated to version 2, right?

So Elasticsearch will mark the old document as deleted in its segment and eventually merge in the "updated" data?

On Monday, 23 February 2015 17:19:49 UTC-5, John Smith wrote:
[...]

Re: Some interesting storage numbers for people interested.

Mark Walkom-2
Yes, sorry. You can definitely reindex from an external source :)
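
The behaviour John asks about can be pictured with a toy model. This is not Elasticsearch code, and the `ToyIndex` class and its methods are invented for illustration. It assumes the usual Lucene-style behaviour: segments are append-only, re-submitting a document under the same id marks the older copy as deleted and bumps the version, and a later merge drops the deleted copies.

```python
class ToyIndex:
    """Simplified, hypothetical model of versioned re-indexing; not real ES internals."""

    def __init__(self):
        self.segments = []   # each segment is a list of entry dicts
        self.versions = {}   # document id -> current version number

    def index(self, doc_id, body):
        """Index or re-index a document; return its new version."""
        # Older live copies are only flagged as deleted, not removed.
        for segment in self.segments:
            for entry in segment:
                if entry["id"] == doc_id and not entry["deleted"]:
                    entry["deleted"] = True
        version = self.versions.get(doc_id, 0) + 1
        self.versions[doc_id] = version
        # The new copy goes into a fresh segment.
        self.segments.append(
            [{"id": doc_id, "body": body, "version": version, "deleted": False}]
        )
        return version

    def merge(self):
        """Rewrite all segments into one, dropping entries flagged as deleted."""
        live = [e for seg in self.segments for e in seg if not e["deleted"]]
        self.segments = [live] if live else []

    def entry_count(self):
        """Number of stored entries, including ones flagged as deleted."""
        return sum(len(seg) for seg in self.segments)


toy = ToyIndex()
first_version = toy.index("doc-1", {"title": "original"})
second_version = toy.index("doc-1", {"title": "resubmitted"})
count_before_merge = toy.entry_count()
toy.merge()
count_after_merge = toy.entry_count()

print(first_version, second_version, count_before_merge, count_after_merge)
```

In this model the resubmitted document gets version 2, and the old copy keeps taking space until the merge runs.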

On 24 February 2015 at 09:33, John Smith <[hidden email]> wrote:
[...]