Parent and children have to be in the same index?


Robin Boutros
I read this somewhere:

By limiting itself to parent/child type relationships elasticsearch makes life easier for itself: a child is always indexed in the same shard as its parent, so has_child doesn’t have to do awkward cross shard operations.

I just want to be 100% sure that it prevents me from doing what I want:

- I have Items that are indexed in separate indexes (one per item type).
- I have Players (right now stored in MySQL, but I would move them to ES).

Items belong to players, and I need to return items based on a property of the players, which is a use case for the parent/child relationship.
Am I right to say there is NO WAY to achieve this if items and players are not stored in the same index?

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Re: Parent and children have to be in the same index?

q42jaap

You could do the "join" yourself, i.e. query the parent ids and use the list to filter/query the children.

The problem is that you'll need *all* the parent ids, which means running a scan over the parents.
You then send a large list of ids to every shard, which could be slow.
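
The manual join described above could be sketched roughly as follows. This is only an illustration: the index names (`players`, `items_*`), the `player_id` field on items, and the `search(index, body)` callable are all assumptions, and the sketch assumes `search` already returns every matching player (in practice that is where the scan comes in).

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical field name: each item is assumed to store its owner's id here.
PLAYER_ID_FIELD = "player_id"


def items_for_players_query(player_ids: Iterable[str]) -> Dict:
    """Step 2 body: keep only items whose owner id is in the collected list."""
    return {"query": {"terms": {PLAYER_ID_FIELD: sorted(set(player_ids))}}}


def manual_join(
    search: Callable[[str, Dict], List[Dict]],
    player_filter: Dict,
) -> List[Dict]:
    """Two round trips: fetch matching player ids, then filter items by them.

    `search(index, body)` is supplied by the caller and is assumed to return
    the complete list of hits as dicts that each carry an "_id" key.
    """
    players = search("players", {"query": player_filter})
    ids = [hit["_id"] for hit in players]
    if not ids:
        return []  # no matching parents, so no children can match
    return search("items_*", items_for_players_query(ids))
```

The id list in the second request grows with the number of matching players, which is exactly the cost mentioned above.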

If you use the has_parent query, on the other hand, you avoid all of the above, but it requires parents and children to be in the same index, which avoids internal round trips in Elasticsearch. Each shard can do its part of the work, and the results are merged as usual.
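
For comparison, a has_parent request body might look something like the sketch below. The `player` type name and the `country` field are made up for illustration, and the helper only builds the body; it assumes items were indexed as children of players in the same index.

```python
from typing import Dict


def items_by_player_property(parent_type: str, player_query: Dict) -> Dict:
    """Build a search body that returns children whose parent matches player_query."""
    return {
        "query": {
            "has_parent": {
                "parent_type": parent_type,  # type name of the parent documents
                "query": player_query,       # condition evaluated on the parents
            }
        }
    }


# Hypothetical usage: items owned by players whose (assumed) country field is "nl".
body = items_by_player_property("player", {"term": {"country": "nl"}})
```

No id list is shipped around here: each shard evaluates the parent condition against the parents it already holds.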

Jaap

Re: Parent and children have to be in the same index?

q42jaap
In reply to this post by Robin Boutros

If you only need to display items for one player at a time, you don't need a "join" at all.

I forgot to add that.
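
As a rough sketch of the single-player case: if every item stores its owner's id (the `player_id` field below is an assumption, not something from the original setup), a plain term query on the items is enough, optionally combined with another condition on the items themselves.

```python
from typing import Dict, Optional


def items_of_player(player_id: str, item_query: Optional[Dict] = None) -> Dict:
    """Build a search body for the items of one player, with no join involved."""
    owner_clause = {"term": {"player_id": player_id}}  # hypothetical field name
    if item_query is None:
        return {"query": owner_clause}
    # Both conditions must hold: the item belongs to the player and matches item_query.
    return {"query": {"bool": {"must": [owner_clause, item_query]}}}
```

This works across any number of item indexes, because the player's properties never need to be consulted.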

Jaap

Re: Parent and children have to be in the same index?

Robin Boutros
In reply to this post by q42jaap
It would be a popular query, so there's no way I could fetch an array of, say, 10k ids and then use it to filter items.

Does the requirement that everything be in one index come from the distributed nature of Elasticsearch? I wish you could force it to work across several indexes; after all, children and parents are completely separate entities.

Now I'm torn between reindexing my 25 indexes into one, which would probably hurt performance since I VERY often have to search in a small subset of them, and just not implementing this feature...

Thanks for your answer :)

Re: Parent and children have to be in the same index?

q42jaap

It's because of the distributed nature of Elasticsearch.

The reason is to avoid difficult round trips between shards. If you really need this, you could write a plugin that does the joining for you...
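
To illustrate why the same-shard guarantee falls out of routing, here is a toy model. It is a simplification, not Elasticsearch's actual code: CRC32 stands in for the real hash function, and the function names are invented. The point is only that a child is routed by its parent's id, so both land on the same shard.

```python
import zlib
from typing import Optional


def shard_for(routing: str, num_shards: int) -> int:
    """Pick a shard from a routing value (CRC32 is a stand-in hash, an assumption)."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def shard_for_doc(doc_id: str, num_shards: int, parent_id: Optional[str] = None) -> int:
    """Parents are routed by their own id; children are routed by their parent's id."""
    routing = parent_id if parent_id is not None else doc_id
    return shard_for(routing, num_shards)


# A player and one of its items end up on the same shard, whatever the item's id is.
player_shard = shard_for_doc("player-42", num_shards=5)
item_shard = shard_for_doc("item-1", num_shards=5, parent_id="player-42")
```

Since the shard count and routing are per index, this co-location cannot be arranged for documents living in different indexes.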

Re: Parent and children have to be in the same index?

Robin Boutros
OK, thanks. I guess I should think of an index as a complete database rather than as a table. That might help me get over the fact that I have to put everything into one index ^^.
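
For what it's worth, the "one index as a database" layout could look roughly like this. It is a sketch under assumptions: it targets the Elasticsearch versions of this thread's era, where one index can hold several types and a child type declares `_parent` in its mapping (later releases replaced this with a join field), and the type names `player`, `weapon`, and `armor` are made up.

```python
from typing import Dict, List


def single_index_mappings(parent_type: str, child_types: List[str]) -> Dict:
    """Build an index-creation body with one parent type and several child types."""
    mappings: Dict[str, Dict] = {parent_type: {}}  # the parent needs no special setting
    for child in child_types:
        # Each child type points at its parent type, so children are routed with it.
        mappings[child] = {"_parent": {"type": parent_type}}
    return {"mappings": mappings}


# Hypothetical usage: players as parents, two item types as children.
body = single_index_mappings("player", ["weapon", "armor"])
```

Searches can still be limited to a subset of the item types within that one index, so the "small subset" queries should remain possible.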
