retaining case in a faceted search

csh-2
Is there a way to do faceted searches using the Search API and
maintain case? For example...

curl -X POST "http://localhost:9200/automobiles/automobile/_search?pretty=true&q=make:B*" \
  -d '{"size" : "0", "facets" : {"make" : {"terms" : {"field" : "make"}}}}'

...returns...

{
  ...
  "facets" : {
    "make" : {
      ...
      "terms" : [ {
        "term" : "bmw",
        "count" : 1654
      }, {
        "term" : "buick",
        "count" : 362
      }, {
      ...
      } ]
    }
  }

...but I want to retain the case ("BMW", "Buick").

Thanks in advance, Chuck

Re: retaining case in a faceted search

Ivan Brusic
Hi Chuck,

When faceting on strings, the field should either be not analyzed
(preferred) or tokenized with a KeywordTokenizer. What is happening in
your case is that the default analyzer is lowercasing the terms at
index time.

--
Ivan
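
To illustrate the point about the default analyzer, here is a minimal
sketch in plain Python. It is not Elasticsearch or Lucene code: the
functions standard_like, keyword_like and terms_facet are hypothetical
stand-ins that only approximate how an analyzed field and a not-analyzed
field would feed a terms facet, and the sample values are made up.

```python
import re
from collections import Counter

def standard_like(text):
    """Rough stand-in for a default analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def keyword_like(text):
    """Stand-in for not_analyzed / keyword: the whole value is one term."""
    return [text]

def terms_facet(values, analyze):
    """Count, per term, how many documents contain that term."""
    counts = Counter()
    for value in values:
        # A term is counted once per document, however often it occurs.
        counts.update(set(analyze(value)))
    return dict(counts)

makes = ["BMW", "BMW", "Buick", "Aston Martin"]  # hypothetical sample documents
analyzed = terms_facet(makes, standard_like)
unanalyzed = terms_facet(makes, keyword_like)
print(analyzed)
print(unanalyzed)
```

In this approximation the analyzed variant loses case and also splits
"Aston Martin" into two separate terms, while the keyword variant keeps
each value as a single, unchanged term.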

On Mon, Feb 13, 2012 at 9:14 AM, csh <[hidden email]> wrote:

> Is there a way to do faceted searches using the Search API AND
> maintain case.  For example...
> [...]

Re: retaining case in a faceted search

csh-2
Thanks for the quick response, Ivan!  Will look into how to do this
(don't tell me :-)), as I am an ES newbie...

On Feb 13, 9:33 am, Ivan Brusic <[hidden email]> wrote:

> Hi Chuck,
>
> When faceting on strings, they should either be not analyzed
> (preferred) or tokenized with a KeywordTokenizer. What is happening in
> your case is the terms are being indexed as lowercase by the default
> analyzer.

Re: retaining case in a faceted search

kimchy
Administrator
In case you have not found out how yet: you need to explicitly define the mapping for that field and set index to not_analyzed. The simplest way is to pass the mapping to the create index API when you create the index.
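
As a rough sketch of what such a request body could look like, the
Python below builds a create-index body that maps one string field as
not_analyzed and prints an equivalent curl command. The index name
"cars", type "car" and field "make" are taken from elsewhere in this
thread; the helper name not_analyzed_mapping is hypothetical, and the
"string" / "not_analyzed" syntax is assumed to match the Elasticsearch
versions in use at the time of this thread.

```python
import json

def not_analyzed_mapping(doc_type, field):
    """Build a create-index body that maps `field` as a not_analyzed string."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # index=not_analyzed: the value is indexed as one exact term
                    field: {"type": "string", "index": "not_analyzed"}
                }
            }
        }
    }

body = not_analyzed_mapping("car", "make")

# Print the equivalent curl call; nothing is sent to a server here.
print("curl -XPUT 'http://localhost:9200/cars' -d '%s'" % json.dumps(body))
```

Only the listed field is affected, so other text fields keep whatever
analysis they would otherwise get.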

On Monday, February 13, 2012 at 9:02 PM, csh wrote:

> Thanks for the quick response, Ivan! Will look into how to do this
> (don't tell me :-)), as I am an ES newbie...


Re: retaining case in a faceted search

csh-2
In reply to this post by Ivan Brusic
I'm not quite getting the results I expect:  I think I'm indexing the
way you suggested...

curl -XPUT "localhost:9200/cars?pretty=true" \
  -d '{"index" : {"analysis" : {"analyzer" : {"default" : {"type" : "keyword"}}}}}'

...because after populating ES, the following query returns the field
values fully retained:

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:*" \
  -d '{"size" : "0", "facets" : {"make" : {"terms" : {"field" : "make"}}}}'

I can even do a query now in which I ask for all "makes" that end in
"n"...

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:*n" \
  -d '{"size" : "0", "facets" : {"make" : {"terms" : {"field" : "make"}}}}'

...and I get the right result:

{
  ...
  "facets" : {
    "make" : {
      "_type" : "terms",
      "missing" : 0,
      "total" : 3,
      "other" : 0,
      "terms" : [ {
        "term" : "Aston Martin",
        "count" : 2
      }, {
        "term" : "Nissan",
        "count" : 1
      } ]
    }
  }
}

However, if I ask for all "makes" that start with "a" or
"A" (q=make:A* or q=make:a*), I get no results (there should be
several--at least one as shown in the above example):

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true&q=make:a*" \
  -d '{"size" : "0", "facets" : {"make" : {"terms" : {"field" : "make"}}}}'


Is that a bug, or is there something I'm missing?

thanks in advance, Chuck


Re: retaining case in a faceted search

csh-2
Got it!  Need to put the wildcard directive in explicitly:

curl -X POST "http://localhost:9200/cars/car/_search?pretty=true" \
  -d '{"size" : "0", "query" : {"wildcard" : {"make" : "A*"}}, "facets" : {"make" : {"terms" : {"field" : "make"}}}}'

And, as expected, the wildcard is case-sensitive...

thanks, Chuck
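
For reference, a small Python sketch of the explicit wildcard request
and of case-sensitive matching against unanalyzed terms. The helper
names and the list of indexed terms are hypothetical, and fnmatchcase
is only a stand-in for the wildcard query (it also gives "[" a special
meaning, which the wildcard query does not). One possible explanation
for the earlier q=make:A* failure, which this thread does not confirm,
is that the query-string parser lowercased the wildcard term before
matching; that is an assumption, though it would be consistent with
"*n" matching and "A*" not matching.

```python
import json
from fnmatch import fnmatchcase

def wildcard_facet_request(field, pattern):
    """Build a search body: explicit wildcard query plus a terms facet."""
    return {
        "size": 0,
        "query": {"wildcard": {field: pattern}},
        "facets": {field: {"terms": {"field": field}}},
    }

# Hypothetical terms as they would be indexed with the keyword analyzer.
indexed_terms = ["Aston Martin", "Nissan", "BMW"]

def matching(pattern):
    """Case-sensitive wildcard match against the unanalyzed terms."""
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]

request = wildcard_facet_request("make", "A*")
print(json.dumps(request))
print(matching("A*"))  # upper-case pattern
print(matching("a*"))  # lower-case pattern
print(matching("*n"))  # suffix pattern
```

With unanalyzed terms, the pattern has to match the stored case exactly,
so "A*" and "a*" select different sets of terms.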

On Feb 14, 8:58 am, csh <[hidden email]> wrote:

> I'm not quite getting the results I expect:  I think I'm indexing the
> way you suggested...

Re: retaining case in a faceted search

kimchy
Administrator
Note that what you are doing is indexing all text fields with the keyword analyzer, and I am not sure that is what you really want. Only use it on the fields you want to facet on, possibly with a multi field mapping.
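
A minimal sketch of the multi field idea, again in Python that only
builds the JSON bodies. It assumes the multi_field mapping type of the
Elasticsearch versions in use at the time of this thread; the sub-field
name "untouched" and the helper names are illustrative choices, not
something specified in this thread.

```python
import json

def multi_field_mapping(doc_type, field, raw_name="untouched"):
    """Map `field` twice: analyzed for search, not_analyzed for faceting."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {
                        "type": "multi_field",
                        "fields": {
                            # Default sub-field: analyzed, for full-text search.
                            field: {"type": "string", "index": "analyzed"},
                            # Extra sub-field: exact value, for facets.
                            raw_name: {"type": "string", "index": "not_analyzed"},
                        },
                    }
                }
            }
        }
    }

def facet_request(field, raw_name="untouched"):
    """Terms facet that reads the not_analyzed sub-field."""
    return {
        "size": 0,
        "facets": {field: {"terms": {"field": field + "." + raw_name}}},
    }

mapping = multi_field_mapping("car", "make")
facets = facet_request("make")
print(json.dumps(mapping))
print(json.dumps(facets))
```

Searches against "make" would still use the analyzed terms, while the
facet on "make.untouched" would return the original values.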

On Tuesday, February 14, 2012 at 10:34 PM, csh wrote:

> Got it! Need to put the wildcard directive in explicitly:
> [...]