Nesting more than one level of child parent

Nesting more than one level of child parent

eranid
I need to index 3 levels (or more) of child-parent.
For example, the levels might be an author, a book, and characters from that book.

However, when indexing more than two levels, has_child and has_parent queries and filters return incomplete results.
With 5 shards, I get about one fifth of the expected results when running a has_parent query on the lowest level (characters) or a has_child query on the second level (books).

My guess is that a book is routed to a shard by its parent id, so it resides with its parent (author), but a character is routed by the hash of the book id, which does not necessarily match the shard the book was actually indexed on.

This means that the characters of books by the same author do not necessarily reside on the same shard (which rather cripples the whole child-parent advantage).
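
The guess above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `zlib.crc32` is a stand-in for Elasticsearch's own hash function, and the ids, the helper names `shard_for` and `place`, and the rule "explicit routing, else parent id, else document id" are illustrative, based on the behaviour described in this thread.

```python
import zlib

NUM_SHARDS = 5  # matches the 5 shards mentioned in the post


def shard_for(routing_value, num_shards=NUM_SHARDS):
    # Stand-in hash: Elasticsearch uses its own hash function, crc32 is illustrative only.
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def place(doc_id, parent=None, routing=None):
    # Assumed precedence: explicit routing, else the parent id, else the document's own id.
    return shard_for(routing or parent or doc_id)


author_id = "author-1"  # hypothetical ids
author_shard = place(author_id)
books = [f"book-{i}" for i in range(20)]

# Books use the author id as parent, so they land on the author's shard.
book_shards = [place(book, parent=author_id) for book in books]

# One character per book, indexed with only the book id as parent.
parent_only = [place(f"char-{i}", parent=book) for i, book in enumerate(books)]

# The same characters, indexed with the author id as explicit routing.
with_routing = [
    place(f"char-{i}", parent=book, routing=author_id)
    for i, book in enumerate(books)
]

print("books on author's shard:", book_shards.count(author_shard), "of", len(books))
print("characters (parent only):", parent_only.count(author_shard), "of", len(books))
print("characters (with routing):", with_routing.count(author_shard), "of", len(books))
```

With only `parent` set, a character's shard depends on the book id rather than on where the book actually lives, which is consistent with seeing only a fraction of the expected hits.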

Am I doing something wrong? How can I resolve this? I really need complex queries such as "which authors wrote books with female characters".
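
For reference, a query like that could be expressed as two nested has_child clauses. The sketch below only builds the request body. The type names `book` and `character` and the `gender` field are assumptions for illustration, not taken from the gist, and `authors_with_character` is a hypothetical helper.

```python
import json


def authors_with_character(field, value):
    # Authors that have a book child that in turn has a matching character child.
    return {
        "query": {
            "has_child": {
                "type": "book",  # assumed type name
                "query": {
                    "has_child": {
                        "type": "character",  # assumed type name
                        "query": {"term": {field: value}},
                    }
                },
            }
        }
    }


body = authors_with_character("gender", "female")  # hypothetical field and value
print(json.dumps(body, indent=2))
```

The body would be sent to the search endpoint restricted to the top-level (author) type. It can only return complete results if all three levels of a family share a shard.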

I made a gist showing the problem:
https://gist.github.com/eranid/5299628
Re: Nesting more than one level of child parent

Martijn v Groningen
If you're indexing multi-level parent/child documents, you need to use the `routing` query string option in addition to the `parent` query string option. The `routing` value should always be the id of the first hierarchy level (the author in your case). This way all books by the same author, and the characters of those books, always reside on the same shard.
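
As a rough sketch of what that looks like on the indexing side, the snippet below builds the index URLs with both query string options. The host, the index name `library`, the type names, the ids, and the helper `index_url` are all assumptions for illustration. The `/index/type/id` URL layout follows the Elasticsearch versions current at the time of this thread.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:9200"  # assumed local node
INDEX = "library"               # hypothetical index name


def index_url(doc_type, doc_id, parent=None, routing=None):
    # Build the URL for an index request, adding parent/routing only when given.
    params = {}
    if parent is not None:
        params["parent"] = parent
    if routing is not None:
        params["routing"] = routing
    url = f"{BASE}/{INDEX}/{doc_type}/{quote(str(doc_id), safe='')}"
    return f"{url}?{urlencode(params)}" if params else url


# Level 1: the author is routed by its own id.
author_url = index_url("author", "a1")
# Level 2: the book's parent is the author, and routing is also the author id.
book_url = index_url("book", "b1", parent="a1", routing="a1")
# Level 3: the character's parent is the book, but routing stays the author id.
character_url = index_url("character", "c1", parent="b1", routing="a1")

for url in (author_url, book_url, character_url):
    print(url)
```

The point is the third URL: `parent` names the direct parent (the book), while `routing` keeps pointing at the top of the hierarchy (the author).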


--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Nesting-more-than-one-level-of-child-parent-tp4032822.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.





--
Kind regards,

Martijn van Groningen

Re: Nesting more than one level of child parent

eranid
Great, thanks a lot.



Re: Nesting more than one level of child parent

q42jaap

In the case of books and characters, I would suggest nesting the characters in the book documents, but I'm guessing your real application isn't about books ;-)
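
A minimal sketch of that alternative, assuming "nesting" here means Elasticsearch's `nested` object type. The field names `characters` and `gender` and the type name `book` are hypothetical, and the mapping layout with the type name as a key follows the versions current at the time of this thread.

```python
import json

# Mapping: each book carries its characters as nested objects,
# so only two parent/child levels (author -> book) remain.
book_mapping = {
    "book": {
        "_parent": {"type": "author"},
        "properties": {
            "characters": {"type": "nested"},  # hypothetical field name
        },
    }
}

# Query: authors with a book whose nested characters match a term.
query = {
    "query": {
        "has_child": {
            "type": "book",
            "query": {
                "nested": {
                    "path": "characters",
                    "query": {"term": {"characters.gender": "female"}},
                }
            },
        }
    }
}

print(json.dumps(book_mapping, indent=2))
print(json.dumps(query, indent=2))
```

The trade-off is that nested characters live inside the book document, so changing one character means reindexing the whole book.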

@Martijn, maybe you could add the routing fix to the docs where the `parent` parameter is explained?

Re: Nesting more than one level of child parent

Martijn v Groningen
@Jaap Makes sense, I'll update the docs.

