Integer size vs Long size

Integer size vs Long size

ryanogle
So I have a test index where I define field1 as type: long and field2 as type: integer.  I index 2 documents, each with an array of 1000 numbers that are 9 digits in length.  In the first document, the array is put in field1.  In the second document, the array is put in field2.

I then go to _stats and look at the document sizes.

I would have expected the document with type: integer to be half the size of the one with type: long, since integer is a 32-bit type and long is a 64-bit type.  But both documents are almost exactly the same size.  And from my math (each document is around 9K, and 1000 numbers * 8 bytes for 64 bits = 8K), it seems that they are all being indexed as 64-bit longs.

I double-checked the _mapping, and the fields are definitely set as long and integer respectively.  Any idea why the document indexed with field type: integer isn't much smaller than the one with type: long?

Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Re: Integer size vs Long size

joergprante@gmail.com
Have you disabled _source and _all for your test?

Note that Lucene is an inverted index; it does not behave like a bag of
documents of primitive data types. Although there are field types like
LongField, IntField, DoubleField, and FloatField for numerics, they do not
determine the overall size of the index files. To simplify, imagine a
list of pointers pointing to longs, and a list of pointers pointing to
ints. The posting list elements use the same amount of memory, no matter
what kind of fields you have in a document.
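
To illustrate why _source matters for this test: the stored source is the JSON text that was submitted, so a 9-digit number costs 9 bytes of text whatever type the mapping declares. The following is a rough sketch of that arithmetic only, not a measurement of an actual index; the field names come from the original post, while the random values and the compact JSON separators are assumptions made for the illustration.

```python
import json
import random

# Assumption: 1000 arbitrary 9-digit numbers, as described in the original post.
random.seed(0)
values = [random.randint(100_000_000, 999_999_999) for _ in range(1000)]

# The same array submitted under the long-mapped and the integer-mapped field.
doc_long = json.dumps({"field1": values}, separators=(",", ":"))
doc_int = json.dumps({"field2": values}, separators=(",", ":"))

# Size of the JSON text in bytes; the mapping type never enters into it.
size_long = len(doc_long.encode("utf-8"))
size_int = len(doc_int.encode("utf-8"))

print("field1 (long) source bytes:   ", size_long)
print("field2 (integer) source bytes:", size_int)
```

Both sizes come out identical, at roughly 10 KB of digits and commas, which is in the same range as the document size reported in the question.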

Jörg


Re: Integer size vs Long size

simonw-2
In reply to this post by ryanogle
Lucene doesn't know about types under the hood. We index numeric types as prefix-coded tries to make range queries efficient. The number of bytes a long / int value takes in the index depends on the precision_step that is used, but that is the data in the term index. If you are curious about the stored document size: we only know about String / UTF-8 bytes, so we don't store values as efficiently as a database would in a dedicated column. I'm afraid you can't use the index size to check that the right type is applied!
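
As a rough illustration of the precision_step point: in a prefix-coded trie, each numeric value is indexed as several terms, one per precision level, so the term count per value is about ceil(bits / precision_step). The sketch below is a simplification and not Lucene's actual encoding (which also makes the bits sortable and prefixes each term with its shift). The helper names and the step values 4 and 8 are hypothetical, chosen only for the example, and the value is assumed to be non-negative.

```python
import math


def terms_per_value(bits, precision_step):
    """Approximate number of trie terms indexed for one numeric value."""
    return math.ceil(bits / precision_step)


def prefix_terms(value, bits, precision_step):
    """Simplified terms: (shift, value with its lowest `shift` bits dropped)."""
    return [(shift, value >> shift) for shift in range(0, bits, precision_step)]


value = 123456789  # a 9-digit number, as in the original test

for name, bits in (("integer", 32), ("long", 64)):
    for step in (4, 8):
        count = terms_per_value(bits, step)
        print(f"{name:8s} precision_step={step}: {count} terms per value")

# The coarser levels of a small value collapse to 0 for both types.
print(prefix_terms(value, 32, 8))
```

With the same step, a long produces twice as many terms as an integer in this model, but the extra high-order terms are shared by every value that fits in 32 bits, so each distinct term appears in the term dictionary only once. This is one reason the index size does not simply double.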

simon

On Thursday, February 14, 2013 2:25:21 AM UTC+1, ryano wrote:
[original post quoted in full above]
