Not able to see whether data stemmed or not!

Panzer
I set up an index using these settings:

{
    "settings": {
        "index": {
            "type": "default"
        },
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "analysis": {
            "filter": {
                "stopper": {
                    "type": "stop",
                    "stopwords": "_english_"
                },
                "stemmer_light": {
                    "type": "stemmer",
                    "name": "light_english"
                },
                "stemmer_possessive": {
                    "type": "stemmer",
                    "name": "possessive_english"
                }
            },
            "analyzer": {
                "new_analyzer": {
                    "type": "custom",
                    "tokenizer": "whitespace",
                    "filter": ["stopper",
                               "lowercase",
                               "whitespace",
                               "stemmer_light",
                               "stemmer_possessive"]
                }
            }
        }
    }
}
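
One thing that may be worth checking in these settings: the analyzer's filter list includes "whitespace", which is named as the tokenizer but is not defined under "analysis.filter". The sketch below is only an illustration in Python, not Elasticsearch code. ASSUMED_BUILTIN_FILTERS is a small hand-picked subset of built-in token filter names (an assumption, not an exhaustive list), and unknown_filters is a hypothetical helper that flags names it cannot resolve.

```python
# Hypothetical check, not part of Elasticsearch: find filter names that an
# analyzer references but that are neither defined under analysis.filter nor
# present in an assumed (incomplete) set of built-in token filter names.
ASSUMED_BUILTIN_FILTERS = {"lowercase", "stop", "stemmer", "asciifolding"}

# The analysis section quoted in this thread.
analysis = {
    "filter": {
        "stopper": {"type": "stop", "stopwords": "_english_"},
        "stemmer_light": {"type": "stemmer", "name": "light_english"},
        "stemmer_possessive": {"type": "stemmer", "name": "possessive_english"},
    },
    "analyzer": {
        "new_analyzer": {
            "type": "custom",
            "tokenizer": "whitespace",
            "filter": ["stopper", "lowercase", "whitespace",
                       "stemmer_light", "stemmer_possessive"],
        }
    },
}


def unknown_filters(analysis, analyzer_name):
    """Return referenced filter names that this check cannot resolve."""
    defined = set(analysis.get("filter", {}))
    referenced = analysis["analyzer"][analyzer_name].get("filter", [])
    return [name for name in referenced
            if name not in defined and name not in ASSUMED_BUILTIN_FILTERS]


print(unknown_filters(analysis, "new_analyzer"))
```

If a name is flagged, it would be worth confirming that the index was really created with these settings, for example by reading them back with GET /index1/_settings.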

After I am done with indexing I run these:
GET /index1/document/1101/_source

GET /index1/_mtermvectors/
{
   "docs": [
      {
         "_type": "news",
         "_id": "1101",
         "fields": ["text"],
         "term_statistics": true
      }
   ]
}


Neither of these shows stemmed words, and both still contain the stop words. What am I doing wrong?
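
Two things may contribute to this. First, _source returns the document as it was submitted, so it is not expected to show analyzed tokens at all. Second, "stopper" runs before "lowercase" in the filter list, and the _english_ stop list is lowercase, so capitalized forms such as "The" can pass through a stop filter that does not ignore case. The following is a toy simulation in Python, not Elasticsearch code; STOPWORDS is a tiny assumed sample and run_chain is a hypothetical helper.

```python
# Toy model of a token filter chain; it only illustrates why filter order matters.
STOPWORDS = {"the", "is", "a", "of"}  # assumed sample, not the full _english_ list


def stop(tokens):
    # Case-sensitive removal, mirroring a stop filter that does not ignore case.
    return [t for t in tokens if t not in STOPWORDS]


def lowercase(tokens):
    return [t.lower() for t in tokens]


def run_chain(text, filters):
    tokens = text.split()  # whitespace tokenization
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


text = "The index is A test"
# Stop filter first: capitalized stop words survive and are lowercased afterwards.
print(run_chain(text, [stop, lowercase]))
# Lowercase first: every stop word in the sample is removed.
print(run_chain(text, [lowercase, stop]))
```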

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f5d6a6df-ca79-427a-9b1a-74fca04e8b40%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Not able to see whether data stemmed or not!

dadoonet
I can't see any mapping set here. Could you GIST a full example?

Maybe you just forgot to use the analyzer you defined?

David
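
One way to check whether a field really uses a given analyzer is the _analyze API. The sketch below only builds the request URL in Python and does not send anything. It assumes the query-string form of _analyze from the 1.x era (a "field" or "analyzer" parameter, with the text sent as the request body) and a node at localhost:9200; analyze_url is a hypothetical helper. The index, field, and analyzer names come from this thread.

```python
from urllib.parse import urlencode


def analyze_url(index, *, field=None, analyzer=None, host="http://localhost:9200"):
    """Build an _analyze URL for either a mapped field or a named analyzer."""
    params = {}
    if field is not None:
        params["field"] = field        # use whatever analyzer the mapping assigns
    if analyzer is not None:
        params["analyzer"] = analyzer  # use a named analyzer directly
    return f"{host}/{index}/_analyze?{urlencode(params)}"


# The text to analyze would go in the request body, e.g. "The foxes' dens".
print(analyze_url("index1", field="text"))
print(analyze_url("index1", analyzer="new_analyzer"))
```

If the two requests return different tokens for the same text, the field is probably not mapped to the custom analyzer.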

Re: Not able to see whether data stemmed or not!

Panzer
"document": {
    "properties": {
        "text": {
            "type": "string",
            "store": true,
            "index": "analyzed",
            "term_vector": "with_positions_offsets_payloads",
            "filter": "stopper",
            "analyzer": "new_analyzer"
        },
        "doc_length": {
            "type": "long",
            "store": true,
            "index": "not_analyzed"
        }
    }
}

This was the mapping I was using.
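
Comparing this mapping with the earlier requests, the type names may not line up: the mapping and the _source request use "document", while the _mtermvectors request asks for "_type": "news". Below is a small Python sketch of that comparison, using only values quoted in this thread; mismatched_types is a hypothetical helper, not an Elasticsearch API.

```python
# Types that have the custom-analyzed "text" field, per the mapping above.
mapping_types = {"document"}

# The _mtermvectors request body quoted earlier in the thread.
mtermvectors_body = {
    "docs": [
        {
            "_type": "news",
            "_id": "1101",
            "fields": ["text"],
            "term_statistics": True,
        }
    ]
}


def mismatched_types(body, known_types):
    """Return (id, type) pairs whose type is not among the mapped types."""
    return [(doc.get("_id"), doc.get("_type"))
            for doc in body["docs"]
            if doc.get("_type") not in known_types]


print(mismatched_types(mtermvectors_body, mapping_types))
```

A non-empty result suggests the term vectors are being requested for a type that does not carry the "new_analyzer" mapping.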

