Question about multi_field and edge ngram


quain
I'm having some trouble with multi_field, perhaps some of you guys
could shed some light on what I'm doing wrong.  After I inserted some
documents, I get 0 hits for any searches of the field name.untouched
or name.ngram.  I am successful with the main field just searching on
name.  For example, the following returns nothing even though i know
the term is there exactly:

curl -XGET 'http://127.0.0.1:9200/test/_search?pretty=1'  -d '
{
   "query" : {
      "text" : {
         "name.ngram" : {
            "query" : "xyz"
         }
      }
   }
}
'

Here's my mapping:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'  -d '
{
   "mappings" : {
      "member" : {
         "properties" : {
            "name" : {
               "type": "multi_field",
               "fields" : {
                  "untouched": {
                     "type": "string",
                     "index": "not_analyzed"
                  },
                  "name": {
                     "type": "string",
                     "analyzer": "standard_name"
                  },
                  "ngram" : {
                     "search_analyzer": "standard_name",
                     "index_analyzer": "partial_name",
                     "type": "string"
                  }
               }
            }
         }
      }
   },
   "settings" : {
      "analysis" : {
         "filter" : {
            "name_ngrams" : {
               "side" : "front",
               "max_gram" : 10,
               "min_gram" : 1,
               "type" : "edgeNGram"
            }
         },
         "analyzer" : {
            "standard_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            },
            "partial_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_ngrams"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            }
         }
      }
   }
}
'

Can anyone give me some help to identify what might be wrong?
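
For anyone following along, here is a rough Python sketch of the tokens the partial_name analyzer above should write into name.ngram at index time. It is only an approximation: the regex split stands in for the standard tokenizer, the accent stripping stands in for asciifolding, the "standard" token filter is skipped, and the helper names (edge_ngrams, fold_ascii, partial_name_tokens) are made up for illustration.

```python
import re
import unicodedata


def edge_ngrams(token, min_gram=1, max_gram=10):
    """Return the front-anchored n-grams of one token (side = front)."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def fold_ascii(text):
    """Rough stand-in for the asciifolding filter: strip combining accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def partial_name_tokens(text):
    """Approximate partial_name: fold, lowercase, split into words, n-gram."""
    # Crude substitute for the standard tokenizer: runs of word characters.
    words = re.findall(r"\w+", fold_ascii(text).lower())
    tokens = []
    for word in words:
        tokens.extend(edge_ngrams(word))
    return tokens


print(partial_name_tokens("The Office"))
# ['t', 'th', 'the', 'o', 'of', 'off', 'offi', 'offic', 'office']
```

Because the search side uses standard_name (no n-grams), a query such as "th" or "off" is looked up as one whole term and should match one of these prefixes, provided the document really was indexed through this mapping.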

Re: Question about multi_field and edge ngram

kimchy
Administrator
Can you gist a sample that sets up the index and the mappings, indexes sample doc(s), and then shows the search request that does not work for you? It would help a lot (see http://www.elasticsearch.org/help).


Re: Question about multi_field and edge ngram

quain
https://gist.github.com/1979125

Here is the gist.  Thanks very much for any help that can be
provided.  I'm mostly interested in if I'm setting up the mapping
incorrectly, or using the multi_field or ngram types incorrectly.


Re: Question about multi_field and edge ngram

Garrick Evans
In reply to this post by quain
What if you use the short form of the text query?

{"query":{"text":{"name.ngram":"xyz"}}}



Re: Question about multi_field and edge ngram

quain
Thanks, but I got the same result: still 0 hits.  I didn't get a
syntax error the first time either, so I'm not sure the query syntax
is the problem.

I thought I had followed the documentation on this pretty closely, so
I'm not sure why it doesn't work.


Re: Question about multi_field and edge ngram

Garrick Evans
You are indexing your documents into a type named "test", but you created the mapping for the type "member".  Try:

curl -XPOST 'http://localhost:9200/test/member?pretty=true' -d '{ "name" : "The Office" }'
curl -XPOST 'http://localhost:9200/test/member?pretty=true' -d '{ "name" : "The Office (UK)" }'

You should then be able to query like:

curl -XGET 'http://localhost:9200/test/_search?pretty=1' -d '{"query":{"text":{"name.ngram":"th"}}}'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.25,
    "hits" : [ {
      "_index" : "test",
      "_type" : "member",
      "_id" : "MLleoK0pR8eBGhY9pkm0ig",
      "_score" : 0.25, "_source" : { "name" : "The Office (UK)" }
    }, {
      "_index" : "test",
      "_type" : "member",
      "_id" : "nzY1KZ6mRuqrzxosJBi9IQ",
      "_score" : 0.095891505, "_source" : { "name" : "The Office" }
    } ]
  }
}
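
To spell out why searching on name worked while name.ngram and name.untouched returned nothing, here is a toy Python model of per-type mappings. It is not how Elasticsearch is implemented; the dictionary and the indexed_fields helper are hypothetical, and it assumes that a type without an explicit mapping gets a plain, dynamically mapped string field.

```python
# Toy model: each type in the index carries its own mapping.
# Only "member" was given the multi_field mapping for "name".
explicit_mappings = {
    "member": {"name": ["name", "name.untouched", "name.ngram"]},
}


def indexed_fields(doc_type, field):
    """Return the index fields a value of `field` ends up in for `doc_type`."""
    type_mapping = explicit_mappings.get(doc_type, {})
    # Assumption: an unmapped field is indexed once, under its own name,
    # with none of the multi_field sub-fields.
    return type_mapping.get(field, [field])


print(indexed_fields("test", "name"))    # ['name']
print(indexed_fields("member", "name"))  # ['name', 'name.untouched', 'name.ngram']
```

In this model a document indexed under the "test" type never produces a name.ngram field, so a query against it has nothing to match.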



Re: Question about multi_field and edge ngram

quain
Thanks a lot for pointing this out.  This was the problem.

I was wondering, is it possible to apply this mapping to all types
within an index?  For example, if I wanted to set the mapping for the
"name" field for all types of the index "test", can I do that?


Re: Question about multi_field and edge ngram

Matt Weber
I think you would use the default mapping:
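
A sketch of what that might look like, assuming the special _default_ mapping name and reusing the multi_field definition from the original post. The Python below only builds and prints the request body; the variable names are illustrative, and the "settings" block with the two analyzers would still need to be sent alongside it.

```python
import json

# Same multi_field definition as in the original "member" mapping.
name_field = {
    "type": "multi_field",
    "fields": {
        "name": {"type": "string", "analyzer": "standard_name"},
        "untouched": {"type": "string", "index": "not_analyzed"},
        "ngram": {
            "type": "string",
            "index_analyzer": "partial_name",
            "search_analyzer": "standard_name",
        },
    },
}

# Assumption: "_default_" replaces "member" as the mapping name, so the
# definition is not tied to a single type. The analysis settings from the
# original post are omitted here and must be added under "settings".
index_body = {
    "mappings": {
        "_default_": {"properties": {"name": name_field}},
    },
}

print(json.dumps(index_body, indent=3))
```

The printed JSON could be used as the body of the index-creation request (curl -XPUT 'http://127.0.0.1:9200/test/') in place of the "member"-only mapping. As far as I understand, _default_ only affects types created after it is in place.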



-- 
Matt Weber
Sent with Sparrow

On Monday, March 5, 2012 at 3:35 PM, quain wrote:

Thanks a lot for pointing this out. This was the problem.

I was wondering, is it possible to apply this mapping to all types
within an index? For example, if I wanted to set the mapping for the
"name" field for all types of the index "test", can I do that?

On Mar 5, 12:33 pm, Garrick Evans <buk...@gmail.com> wrote:
You are posting to a "test" type but you created the mapping for "member".
 Try:



"The Office" }'
"The Office (UK)" }'

You should then be able to query like:









curl -XGEThttp://localhost:9200/test/_search?pretty=1-d
'{"query":{"text":{"name.ngram":"th"}}}'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.25,
    "hits" : [ {
      "_index" : "test",
      "_type" : "member",
      "_id" : "MLleoK0pR8eBGhY9pkm0ig",
      "_score" : 0.25, "_source" : { "name" : "The Office (UK)" }
    }, {
      "_index" : "test",
      "_type" : "member",
      "_id" : "nzY1KZ6mRuqrzxosJBi9IQ",
      "_score" : 0.095891505, "_source" : { "name" : "The Office" }
    } ]
  }
On Monday, March 5, 2012 11:10:51 AM UTC-8, quain wrote:

Thanks, but I got the same result, still 0 hits.  I didn't get a
syntax error the first time either, so not sure this is the problem.

I thought I pretty much followed the documentation on this, not sure
why it doesn't work.

On Mar 5, 10:35 am, Garrick Evans <buk...@gmail.com> wrote:
What if you

"query":{"text":{"name.ngram":"xyz"}}}

?

On Sunday, March 4, 2012 11:07:04 PM UTC-8, quain wrote:

I'm having some trouble with multi_field, perhaps some of you guys
could shed some light on what I'm doing wrong.  After I inserted some
documents, I get 0 hits for any searches of the field name.untouched
or name.ngram.  I am successful with the main field just searching on
name.  For example, the following returns nothing even though i know
the term is there exactly:

{
   "query" : {
      "text" : {
         "name.ngram" : {
            "query" : "xyz"
         }
      }
   }
}
'

Here's my mapping:

{
   "mappings" : {
      "member" : {
         "properties" : {
            "name" : {
               "type": "multi_field",
               "fields" : {
                  "untouched": {
                     "type": "string",
                     "index": "not_analyzed"
                  },
                  "name": {
                     "type": "string",
                     "analyzer": "standard_name"
                  },
                  "ngram" : {
                     "search_analyzer": "standard_name",
                     "index_analyzer": "partial_name",
                     "type": "string"
                  }
               }
            }
         }
      }
   },
   "settings" : {
      "analysis" : {
         "filter" : {
            "name_ngrams" : {
               "side" : "front",
               "max_gram" : 10,
               "min_gram" : 1,
               "type" : "edgeNGram"
            }
         },
         "analyzer" : {
            "standard_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            },
            "partial_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_ngrams"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            }
         }
      }
   }
}
'

Can anyone give me some help to identify what might be wrong?


Re: Question about multi_field and edge ngram

kimchy
Administrator
Another option is to use dynamic templates: http://www.elasticsearch.org/guide/reference/mapping/root-object-type.html, and set it on a mapping called _default_ (which will apply to all types in the index).
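
A minimal sketch of what such a _default_ mapping might look like, written as a Python script that only builds and prints the request body. The template name "name_multi_field" and the helper function are hypothetical, and the sub-fields are copied from the "member" mapping quoted in this thread; treat the exact dynamic template layout as an assumption to check against the linked documentation.

```python
import json

# Sub-fields copied from the "member" mapping posted earlier in the thread.
NAME_MULTI_FIELD = {
    "type": "multi_field",
    "fields": {
        "untouched": {"type": "string", "index": "not_analyzed"},
        "name": {"type": "string", "analyzer": "standard_name"},
        "ngram": {
            "type": "string",
            "index_analyzer": "partial_name",
            "search_analyzer": "standard_name",
        },
    },
}


def default_mapping_body(field="name"):
    """Build a "mappings" section that applies the multi_field to every type.

    The template key (field + "_multi_field") is a made-up label.
    """
    template = {"match": field, "mapping": NAME_MULTI_FIELD}
    return {
        "mappings": {
            "_default_": {
                "dynamic_templates": [{field + "_multi_field": template}]
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON that would be sent when creating the index.
    print(json.dumps(default_mapping_body(), indent=2))
```

The "settings" / "analysis" section from the original post would still have to be sent along with this body, since the template refers to the standard_name and partial_name analyzers.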

On Tuesday, March 6, 2012 at 2:28 AM, Matt Weber wrote:

I think you would use the default mapping:




On Monday, March 5, 2012 at 3:35 PM, quain wrote:

Thanks a lot for pointing this out. This was the problem.

I was wondering, is it possible to apply this mapping to all types
within an index? For example, if I wanted to set the mapping for the
"name" field for all types of the index "test", can I do that?

On Mar 5, 12:33 pm, Garrick Evans <buk...@gmail.com> wrote:
You are posting to a "test" type but you created the mapping for "member".
 Try:



"The Office" }'
"The Office (UK)" }'

You should then be able to query like:

curl -XGET 'http://localhost:9200/test/_search?pretty=1' -d '{"query":{"text":{"name.ngram":"th"}}}'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.25,
    "hits" : [ {
      "_index" : "test",
      "_type" : "member",
      "_id" : "MLleoK0pR8eBGhY9pkm0ig",
      "_score" : 0.25, "_source" : { "name" : "The Office (UK)" }
    }, {
      "_index" : "test",
      "_type" : "member",
      "_id" : "nzY1KZ6mRuqrzxosJBi9IQ",
      "_score" : 0.095891505, "_source" : { "name" : "The Office" }
    } ]
  }
}
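
To see why "th" finds both documents once they are indexed under the "member" type, here is a rough Python simulation of the two analyzers from the mapping. It is only a sketch: the regex tokenizer is a stand-in for the standard tokenizer, asciifolding and scoring are left out, and the helper names are made up for illustration.

```python
import re


def standard_name(text):
    """Rough stand-in for the standard_name analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def edge_ngrams(token, min_gram=1, max_gram=10):
    """Front edge n-grams of a token, like the name_ngrams filter (side=front)."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def partial_name(text):
    """Rough stand-in for the partial_name analyzer used at index time."""
    grams = []
    for token in standard_name(text):
        grams.extend(edge_ngrams(token))
    return grams


def matches(query, document):
    """True if any search-time term equals an indexed edge n-gram."""
    indexed = set(partial_name(document))
    return any(term in indexed for term in standard_name(query))


if __name__ == "__main__":
    for name in ("The Office", "The Office (UK)"):
        print(name, "->", partial_name(name), "| matches 'th':", matches("th", name))
```

The query text goes through standard_name only, so "th" stays a single term, while the indexed side already contains "t", "th" and "the" for each title.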
On Monday, March 5, 2012 11:10:51 AM UTC-8, quain wrote:

Thanks, but I got the same result, still 0 hits.  I didn't get a
syntax error the first time either, so not sure this is the problem.

I thought I pretty much followed the documentation on this, not sure
why it doesn't work.

On Mar 5, 10:35 am, Garrick Evans <buk...@gmail.com> wrote:
What if you try

{"query":{"text":{"name.ngram":"xyz"}}}

?

On Sunday, March 4, 2012 11:07:04 PM UTC-8, quain wrote:

I'm having some trouble with multi_field, perhaps some of you guys
could shed some light on what I'm doing wrong.  After I inserted some
documents, I get 0 hits for any searches of the field name.untouched
or name.ngram.  I am successful with the main field just searching on
name.  For example, the following returns nothing even though i know
the term is there exactly:

curl -XGET 'http://127.0.0.1:9200/test/_search?pretty=1'  -d '
{
   "query" : {
      "text" : {
         "name.ngram" : {
            "query" : "xyz"
         }
      }
   }
}
'

Here's my mapping:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'  -d '
{
   "mappings" : {
      "member" : {
         "properties" : {
            "name" : {
               "type": "multi_field",
               "fields" : {
                  "untouched": {
                     "type": "string",
                     "index": "not_analyzed"
                  },
                  "name": {
                     "type": "string",
                     "analyzer": "standard_name"
                  },
                  "ngram" : {
                     "search_analyzer": "standard_name",
                     "index_analyzer": "partial_name",
                     "type": "string"
                  }
               }
            }
         }
      }
   },
   "settings" : {
      "analysis" : {
         "filter" : {
            "name_ngrams" : {
               "side" : "front",
               "max_gram" : 10,
               "min_gram" : 1,
               "type" : "edgeNGram"
            }
         },
         "analyzer" : {
            "standard_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            },
            "partial_name" : {
               "filter" : [
                  "standard",
                  "lowercase",
                  "asciifolding",
                  "name_ngrams"
               ],
               "type" : "custom",
               "tokenizer" : "standard"
            }
         }
      }
   }
}
'

Can anyone give me some help to identify what might be wrong?



Re: Question about multi_field and edge ngram

quain
Bringing back this old thread, because I have a follow-up question.  Do edge-ngram-filtered fields support prefix or phrase-prefix queries?  My goal is to prefer results that have the same word order as the query, i.e. if I search "Jack John", prefer "Jack Johnson" over "John Jackson".  Can this be done via a query type or parameter, or with custom scoring?


Re: Question about multi_field and edge ngram

kimchy
Administrator
You can execute a phrase query and boost it.
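
One way this could look, as a sketch only: a bool query whose "must" clause matches on name.ngram and whose "should" clause adds a boosted phrase match on the plain name field. The script below just builds the request body in Python. It assumes the text_phrase_prefix query from the same "text" family used in this thread (later Elasticsearch releases renamed that family to "match"), and it uses the phrase-prefix variant rather than a plain phrase query because "John" is only a prefix of "Johnson". The function name and the boost of 5.0 are arbitrary choices for illustration.

```python
import json


def ordered_name_query(user_input, phrase_boost=5.0):
    """Build a search body that prefers hits whose word order follows the query.

    must:   partial match on the edge n-gram sub-field (order does not matter).
    should: phrase-prefix match on the plain "name" field, boosted so that
            documents with the words in the same order score higher.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"text": {"name.ngram": {"query": user_input}}}
                ],
                "should": [
                    {
                        "text_phrase_prefix": {
                            "name": {"query": user_input, "boost": phrase_boost}
                        }
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be passed to _search with -d.
    print(json.dumps(ordered_name_query("Jack John"), indent=2))
```

With "Jack John" as input, both "Jack Johnson" and "John Jackson" should satisfy the must clause, while only "Jack Johnson" should also satisfy the boosted should clause.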

On Thu, Apr 12, 2012 at 4:22 AM, quain <[hidden email]> wrote:
Bringing back this old thread, because I have a follow-up question.  Do edge-ngram-filtered fields support prefix or phrase-prefix queries?  My goal is to prefer results that have the same word order as the query, i.e. if I search "Jack John", prefer "Jack Johnson" over "John Jackson".  Can this be done via a query type or parameter, or with custom scoring?

