Filtered query vs using filter outside ?

bcoder
What is the difference between these two:

Case 1:
{
    "query": {
        "filtered": {
            "filter": {
                "prefix": {
                    "name": "blah"
                }
            },
            "query": {
                "term": {
                    "dept": "engineering"
                }
            }
        }
    }
}

AND,
Case 2:
{
    "query": {
        "term": {
            "dept": "engineering"
        }
    },
    "filter": {
        "prefix": {
            "name": "blah"
        }
    }
}

i.e., is there a difference between putting the filter and query together
inside a "filtered" query and putting them separately as in Case 2
(i.e., a "query" and a "filter")? Are there any performance gains in
either of the two?

Note: In my case I also want to get facets on a particular field, and
use exactly the same filters (as my query) for the facets as well. Does that
make either of the two cases more desirable? Currently I am doing
something like this:

{
    "query": {
        "term": {
            "dept": "engineering"
        }
    },
    "filter": {
        "prefix": {
            "name": "blah"
        }
    },
    "facets": {
        "type_facet": {
            "terms": {
                "field": "type",
                "size": 100
            },
            "facet_filter": {
                // Basically I repeat the exact same filter (at top level) here as well.
                "prefix": {
                    "name": "blah"
                }
            }
        }
    }
}

Is this the optimal way to do it? Is there a better way than
repeating the filter in N+1 places in the query for N facets, i.e., once as the
top-level filter and once in the facet_filter of each facet? Will using
"filtered" help me somehow with performance?

Thanks in advance!

Re: Filtered query vs using filter outside ?

kimchy
Administrator
It's explained here: http://www.elasticsearch.org/guide/reference/api/search/filter.html. The idea of the search filter is to filter documents out of the results (the hits) without affecting the facets. Facets run on the "query" element, further filtered with an optional facet_filter.

In your case, if you place the filter in a filtered query within the "query" element, "everything" will be filtered by it, facets and hits.
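
As a rough sketch, the request from the question could be rewritten along these lines. It reuses the field names and values from the original post and assumes every facet should see the same prefix filter as the hits; it has not been run against a real index.

```json
{
    "query": {
        "filtered": {
            "query": {
                "term": {
                    "dept": "engineering"
                }
            },
            "filter": {
                "prefix": {
                    "name": "blah"
                }
            }
        }
    },
    "facets": {
        "type_facet": {
            "terms": {
                "field": "type",
                "size": 100
            }
        }
    }
}
```

With this form the prefix filter appears once instead of N+1 times. A facet_filter would only be needed on a facet that should be narrowed further than the hits.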

On Thu, May 3, 2012 at 9:33 PM, bcoder <[hidden email]> wrote:
> What is the difference between these two: [...]


Re: Filtered query vs using filter outside ?

bcoder
Thanks a lot! :)

On May 4, 6:37 am, Shay Banon <[hidden email]> wrote:

> It's explained here: http://www.elasticsearch.org/guide/reference/api/search/filter.html. [...]