Querying lots of small objects

Alexis Okuwa
Hello,

I have a use case with lots of small objects, each about 500 bytes. Many of them are written, and they will need to be purged. I was wondering whether I could set up a single index and use one type per day, but still run queries against the index itself, e.g. http://localhost:9200/bigIndex/_search. For searches where I know which types I want, will I be able to use http://localhost:9200/bigIndex/type1,type2,type3/_search? I also wanted to know whether routing would be useful here.

Lastly, I will be deleting these types on a daily basis, and each type will hold over a million records. Will the space get reused, and what are the drawbacks of this approach?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Re: Querying lots of small objects

Alexander Reelsen-2
Hey,

If you use different types but put the data in the same index, purging data will result in I/O-intensive merge operations in order to keep the index small.
If you use separate indices instead, you can simply drop a whole index after a week or two, which is a really cheap operation. Have a look at index aliases as well.
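
To make the index-per-day idea concrete, here is a minimal sketch of the bookkeeping involved. It assumes a hypothetical naming scheme of `<prefix>-YYYY.MM.DD` and a hypothetical `pageloads` prefix; neither comes from this thread. It only builds the DELETE requests and does not send them.

```python
from datetime import date, timedelta


def daily_index_name(prefix: str, day: date) -> str:
    """Return the name of the index holding one day of data."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indices(prefix: str, existing: list[str], today: date, keep_days: int) -> list[str]:
    """Return the existing daily indices that fall outside the retention window."""
    # Indices to keep: today plus the keep_days - 1 days before it.
    keep = {daily_index_name(prefix, today - timedelta(days=n)) for n in range(keep_days)}
    return sorted(n for n in existing if n.startswith(prefix + "-") and n not in keep)


def delete_requests(base_url: str, indices: list[str]) -> list[tuple[str, str]]:
    """Build (method, url) pairs; dropping an index is one DELETE on its name."""
    return [("DELETE", f"{base_url}/{name}") for name in indices]


# Eight daily indices exist, and seven days of data should be retained.
existing = [daily_index_name("pageloads", date(2013, 6, d)) for d in range(1, 9)]
old = expired_indices("pageloads", existing, today=date(2013, 6, 8), keep_days=7)
requests_to_send = delete_requests("http://localhost:9200", old)
print(requests_to_send)
```

Because each day lives in its own index, purging is a matter of deleting whole indices rather than deleting millions of documents and waiting for merges to reclaim the space.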

Hope this helps; otherwise you might need to tell us a bit more about your use case.


--Alex



Re: Querying lots of small objects

Alexis Okuwa
My use case consists of a few parts of a time series application I am building, which has to handle a large volume of incoming writes. For example, one document is created for each process running on a machine; there is a component similar to Google Analytics that creates a document for each browser page load; and users can also enable custom logging, which creates documents on the services. I was going to create one index for each of these use cases and then a type per day, to keep index management easier. Data would be kept in its raw form for 30 to 90 days, which would mean a few hundred indexes if I created one index per day for each use case.


--
Enjoy,
Alexis Okuwa
WojonsTech
424.835.1223

Re: Querying lots of small objects

kimchy
Administrator
Types within an index are not designed to separate time-based data. In that case, you should have a single type and use a timestamp field to do range queries. Index per time range is an amazingly powerful design for time-based data, as it doesn't suffer the cost of deletes...
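
As a rough illustration of combining both suggestions, the sketch below builds a search request that targets only the daily indices covering a date range and filters on a timestamp with a range query. The field name `timestamp`, the `events` prefix, and the `<prefix>-YYYY.MM.DD` naming scheme are assumptions made for the example. The request is built but not sent.

```python
import json
from datetime import date, timedelta


def indices_for_range(prefix: str, start: date, end: date) -> list[str]:
    """List the daily index names covering start..end inclusive."""
    days = (end - start).days
    return [f"{prefix}-{start + timedelta(days=n):%Y.%m.%d}" for n in range(days + 1)]


def range_search(base_url: str, prefix: str, start: date, end: date) -> tuple[str, str]:
    """Return the search URL and JSON body for a time-bounded search."""
    # Several indices can be searched at once by joining their names with commas.
    names = ",".join(indices_for_range(prefix, start, end))
    url = f"{base_url}/{names}/_search"
    # Half-open interval: from midnight on start up to, but excluding,
    # midnight on the day after end.
    upper = end + timedelta(days=1)
    body = {
        "query": {
            "range": {
                "timestamp": {  # hypothetical field name
                    "gte": f"{start.isoformat()}T00:00:00",
                    "lt": f"{upper.isoformat()}T00:00:00",
                }
            }
        }
    }
    return url, json.dumps(body)


url, body = range_search("http://localhost:9200", "events", date(2013, 6, 5), date(2013, 6, 7))
print(url)
print(body)
```

Restricting the URL to the relevant indices means older indices are never touched by the query, and the range query narrows the results further within the indices that are searched.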


Re: Querying lots of small objects

Alexis Okuwa
Is there any way to make sure indexes with a similar name always have the same mapping?


Re: Querying lots of small objects

Alexander Reelsen-2


On Sat, Jun 8, 2013 at 9:56 AM, Alexis Okuwa <[hidden email]> wrote:
Is there any way to make sure indexes with a similar name always have the same mapping?
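
One mechanism that fits this question is an index template, which applies settings and mappings to every newly created index whose name matches a pattern. The body of this reply did not survive in the archive, so the sketch below is an illustration and not necessarily what was suggested. It assumes the older template format (a `template` key holding a wildcard pattern, sent with PUT to `/_template/<name>`); newer Elasticsearch versions use `index_patterns` and no mapping types, so check the exact keys against your version. The template name, index pattern, type name, and field are hypothetical.

```python
import json


def legacy_index_template(pattern: str, type_name: str, properties: dict) -> dict:
    """Build a template body applied to new indices whose name matches pattern."""
    return {
        "template": pattern,  # wildcard pattern, e.g. "pageloads-*"
        "mappings": {type_name: {"properties": properties}},
    }


def put_template_request(base_url: str, name: str, body: dict) -> tuple[str, str, str]:
    """Return (method, url, json_body) for registering the template."""
    return "PUT", f"{base_url}/_template/{name}", json.dumps(body)


template = legacy_index_template(
    "pageloads-*",                     # hypothetical daily index pattern
    "event",                           # hypothetical type name
    {"timestamp": {"type": "date"}},   # hypothetical field mapping
)
method, url, payload = put_template_request("http://localhost:9200", "pageloads", template)
print(method, url)
print(payload)
```

Once a template like this is registered, each new daily index that matches the pattern picks up the same mapping when it is created, so the mapping does not have to be sent with every index creation.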

