Synonym Filter

Synonym Filter

paul
Hi,

My synonym file contains the following entry:

MIT,Massachusetts Institute of Technology

My settings are as follows:

   "settings":{
      "analysis":{
         "analyzer":{
            "synonym":{
               "tokenizer":"my_pipe_analyzer",
               "filter":[
                  "lowercase",
                  "syns_filter"
               ]
            },
            "my_pipe_analyzer":{
               "tokenizer":"my_pipe_analyzer"
            },
            "autocomplete_search":{
               "type":"custom",
               "tokenizer":"my_pipe_analyzer",
               "filter":[
                  "lowercase",
                  "syns_filter",
                  "stop"
               ]
            }
         },
         "tokenizer":{
            "my_pipe_analyzer":{
               "type":"pattern",
               "pattern":"\\|"
            }
         },
         "filter":{
            "syns_filter":{
               "synonyms_path":"synonyms/synonym_collegename.txt",
               "type":"synonym",
               "ignore_case":true
            }
         }
      }
   }
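
For reference, the intent of the "my_pipe_analyzer" pattern tokenizer above can be sketched in plain Python. This is only an illustration of splitting on the `\|` pattern, not Elasticsearch code; the function name `pipe_tokenize` and the sample input are hypothetical.

```python
import re

def pipe_tokenize(text):
    """Split text on the literal pipe character, dropping empty tokens.

    Mirrors the intent of a pattern tokenizer configured with the
    pattern "\\|": spaces inside a token are left untouched.
    """
    return [token for token in re.split(r"\|", text) if token]

# Hypothetical input: two college names separated by a pipe.
tokens = pipe_tokenize("MIT|Massachusetts Institute of Technology")
print(tokens)
```

With this kind of tokenizer the input text itself is never split on spaces, so any splitting on spaces seen later has to come from another stage of the analysis chain.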

I created a pipe-separated tokenizer so that the synonyms would not be split on spaces, but they are still split on spaces when I check with the Analyze API. Below is the output from the Analyze API:

{
   "tokens":[
      {
         "token":"mit",
         "start_offset":0,
         "end_offset":3,
         "type":"SYNONYM",
         "position":1
      },
      {
         "token":"massachusetts",
         "start_offset":0,
         "end_offset":3,
         "type":"SYNONYM",
         "position":1
      },
      {
         "token":"institute",
         "start_offset":0,
         "end_offset":3,
         "type":"SYNONYM",
         "position":2
      },
      {
         "token":"technology",
         "start_offset":0,
         "end_offset":3,
         "type":"SYNONYM",
         "position":4
      }
   ]
}

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7516d1a7-72d0-4b3f-b426-deb80b8d6450%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Re: Synonym Filter

sina.tamanna
Hey,

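The synonym filter uses its own tokenizer to parse the entries in the synonyms file, and it is not the one defined for the analyzer that uses the filter. By default it is the whitespace tokenizer, which is why your multi-word synonym is split on spaces. You need to set the tokenizer inside the synonym filter definition: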

"filter":{
            "syns_filter":{
               "synonyms_path":"synonyms/synonym_collegename.txt",
               "type":"synonym",
               "tokenizer":"keyword",
               "ignore_case":true
            }
         }
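
To see why the filter's tokenizer matters, here is a rough plain-Python simulation of how one synonym rule might be broken up under a whitespace tokenizer versus a keyword tokenizer. It is a sketch of the idea only, not how Elasticsearch is implemented; the helper names (`parse_synonym_rule`, `whitespace_tokenize`, `keyword_tokenize`) are made up for illustration, and lowercasing stands in for "ignore_case".

```python
def whitespace_tokenize(text):
    """Split on runs of whitespace, like a whitespace tokenizer would."""
    return text.split()

def keyword_tokenize(text):
    """Emit the whole text as a single token, like a keyword tokenizer."""
    return [text]

def parse_synonym_rule(rule, tokenize):
    """Split a comma-separated synonym rule into entries, then tokenize each.

    Lowercasing here is an assumption standing in for "ignore_case": true.
    """
    entries = [entry.strip().lower() for entry in rule.split(",")]
    return [tokenize(entry) for entry in entries]

rule = "MIT,Massachusetts Institute of Technology"

# Whitespace tokenization breaks the multi-word synonym into separate tokens.
print(parse_synonym_rule(rule, whitespace_tokenize))

# Keyword tokenization keeps each synonym entry as one token.
print(parse_synonym_rule(rule, keyword_tokenize))
```

In the first case the long form becomes four separate tokens, which matches the kind of split seen in the Analyze API output; in the second case it stays a single token.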
