|
hey all

when I create an index, I register an analyzer named 'csv' to use with a 'tags' field, below.

    settingsBuilder.put( "index.analysis.analyzer.csv.type", "pattern" );
    settingsBuilder.put( "index.analysis.analyzer.csv.pattern", "," );

thus, stuffing "a,b,c" into a 'tags' field and making a facet query returns "a","b","c", which is exactly what I want.

Except, if the values are "a-b,a-b,a-c", the values are tokenized against both "," and "-", so a facet query returns "a", "b", "c", not "a-b", etc.

But not always! If I run a test to stuff a single document and then run a facet query, sometimes the "-" isn't tokenized on, and sometimes it is. I would say 30% of the time the "-" gets parsed out.

I've tried the following as well, and get the same random results

    settingsBuilder.put( "index.analysis.analyzer.csv.type", "custom" );
    settingsBuilder.put( "index.analysis.analyzer.csv.tokenizer", "csvPattern" );
    settingsBuilder.put( "index.analysis.analyzer.csv.filter", "lowercase" );
    settingsBuilder.put( "index.analysis.tokenizer.csvPattern.type", "pattern" );
    settingsBuilder.put( "index.analysis.tokenizer.csvPattern.pattern", "," );

FWIW, my mapping of 'tags' to 'csv' does work, just not _consistently_ across invocations of the test.

I'm using a dynamic template, defined here

    {
      template_tags: {
        mapping: {
          store: yes
          analyzer: csv
          type: string
        }
        match: tags
      }
    }

thoughts?

--
Chris K Wensel
[hidden email]
http://concurrentinc.com

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. For more options, visit https://groups.google.com/groups/opt_out.
|
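To make the expected and the unwanted tokenization concrete, here is a minimal sketch in plain Java. It does not use Elasticsearch at all: it only applies `java.util.regex` splitting to the sample value, once with the configured "," pattern and once with a non-word-character pattern (`\W+`), which is an assumption standing in for whatever analyzer is applied when the 'csv' one is not. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class CsvTokenSketch {
    // Split a field value on the given regex, the way a pattern-based
    // tokenizer would treat the regex as a separator.
    static List<String> splitOnPattern(String value, String regex) {
        return Arrays.asList(Pattern.compile(regex).split(value));
    }

    public static void main(String[] args) {
        String tags = "a-b,a-b,a-c";
        // Separator "," only: the hyphenated values stay intact.
        System.out.println(splitOnPattern(tags, ","));     // [a-b, a-b, a-c]
        // Separator "any run of non-word characters": "-" is split on too.
        System.out.println(splitOnPattern(tags, "\\W+"));  // [a, b, a, b, a, c]
    }
}
```

If the facet terms look like the second line, the field was probably not analyzed with the "," pattern for that document.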
|
Hi Chris,
Not sure why this happens. Maybe your mapping isn't applied on all indices? What you're doing should work, the field value for the field tags should be tokenised by `,`. I created the following gist:
Can you try if your issue still occurs if you perform the indexing / searching the same way I do in this gist?
(I used ES version 0.20.4) Martijn
-- On 10 February 2013 04:10, Chris K Wensel <[hidden email]> wrote:
hey all

Kind regards,
Martijn van Groningen
|
we only have one index. three document types, two of which are reciprocating parents, the third is just a child (not nested of course), though all three have nested documents. the tags are not in those nested elements.

turns out this has been a persistent problem for the last 9-12 months of ES releases; we just stopped using a "-" in our tests to stop the random failures. I think I just need to go deep and see what's happening internally.

I suspect your gist will work without issues with such a simple document, though the problem may show up if I wrap line 22 in a for loop, since the issue is that the analyzer is randomly not applied properly on a PUT. that is, most puts are parsed properly, and a smaller percentage are not.

ckw
|
|
ok, this does repro the problem

note, i had it fail if using doc id 1, and also using a $RANDOM doc id (the last failure below is this)

Sometimes it works

{"took":44,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"tags","_type":"tag","_id":"1","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }}]},"facets":{"tags":{"_type":"terms","missing":0,"total":2,"other":0,"terms":[{"term":"a-c","count":1},{"term":"a-b","count":1}]}}}

query:

{"took":40,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"tags","_type":"tag","_id":"1","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }}]},"facets":{"tags":{"_type":"terms","missing":0,"total":2,"other":0,"terms":[{"term":"a-c","count":1},{"term":"a-b","count":1}]}}}

Less frequently it does not

{"took":42,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"tags","_type":"tag","_id":"1","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }}]},"facets":{"tags":{"_type":"terms","missing":0,"total":2,"other":0,"terms":[{"term":"c","count":1},{"term":"b","count":1}]}}}

{"took":57,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":96,"max_score":1.0,"hits":[{"_index":"tags","_type":"tag","_id":"19100","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"26775","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"6971","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"2070","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"17185","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"26016","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"1971","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"2657","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"19504","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }},{"_index":"tags","_type":"tag","_id":"19179","_score":1.0, "_source" : { "tags" : "a-b,a-b,a-c" }}]},"facets":{"tags":{"_type":"terms","missing":0,"total":192,"other":0,"terms":[{"term":"c","count":96},{"term":"b","count":96}]}}}

On Feb 15, 2013, at 10:56 AM, Chris K Wensel <[hidden email]> wrote:
|
|
Hey Chris,
this gist doesn't reproduce for me, neither on the current master nor on 0.20.4. What I think can happen here is that the template is not applied when the tags index is created; this would explain what you see. Apparently the "wrong" analyzer is consistently applied to all the documents. Can you try to get this to fail again and, if it fails, pull the mapping from the ES instance you run this against?

    curl -XGET 'http://localhost:9200/tag/_mapping'

I'd be interested to know whether the index gets created without the template being applied to it. Maybe there is a race in the template creation code. I'll try to come up with a test case for this and stress it a little next week.

simon
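Once the mapping has been pulled, the thing to look for is whether the tags field actually carries the 'csv' analyzer. A rough sketch of that check in Java follows. It assumes the mapping response has already been read into a string, it uses a crude regex rather than a JSON parser, and the two sample mapping bodies are invented for illustration (the real response shape may differ between ES versions). The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MappingCheckSketch {
    // Matches "analyzer" : "<name>" anywhere in the mapping text.
    private static final Pattern ANALYZER =
            Pattern.compile("\"analyzer\"\\s*:\\s*\"([^\"]+)\"");

    // True if any field in the mapping text declares the given analyzer.
    static boolean usesAnalyzer(String mappingJson, String analyzerName) {
        Matcher m = ANALYZER.matcher(mappingJson);
        while (m.find()) {
            if (m.group(1).equals(analyzerName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Invented examples: one where the dynamic template was applied,
        // one where the field fell back to a plain string mapping.
        String applied = "{\"tag\":{\"properties\":{\"tags\":"
                + "{\"type\":\"string\",\"store\":\"yes\",\"analyzer\":\"csv\"}}}}";
        String notApplied = "{\"tag\":{\"properties\":{\"tags\":{\"type\":\"string\"}}}}";
        System.out.println(usesAnalyzer(applied, "csv"));
        System.out.println(usesAnalyzer(notApplied, "csv"));
    }
}
```

Running the same check right after every failing test would show whether the failures line up with a mapping that lacks the analyzer.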
-- On Friday, February 15, 2013 8:14:36 PM UTC+1, Chris K Wensel wrote:
|
|
from the data that produces the last case, it is indeed missing

    curl -XGET 'http://localhost:9200/tag/_mapping'
    {"error":"IndexMissingException[[tag] missing]","status":404}

that said, per my original email, the mapping is not missing when I see the test failures; I've double-checked that it exists. further, not all documents (tags) are mis-parsed.

i'll try and dig deeper into the es code at some point.

ckw

On Feb 16, 2013, at 7:00 AM, simonw <[hidden email]> wrote:
Hey Chris,
|
|
hmm, I think my link was broken. it should be 'tags' not 'tag' given the gist, right?
simon
-- On Sunday, February 17, 2013 2:23:49 AM UTC+1, Chris K Wensel wrote:
|
|
oops, crap, and I wiped the data for that.

i added set -e to make sure the server was fully up, and i'm not reproducing the problem now via the bash script.

i'll see if I can get a replay of the calls during our tests and try to reproduce independently of the test harness.

ckw

On Feb 17, 2013, at 10:35 AM, simonw <[hidden email]> wrote:
hmm, I think my link was broken. it should be 'tags' not 'tag' given the gist, right?
|
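If the suspicion is that requests reached the node before it was ready, one way to rule that out in a test harness is to poll a readiness probe before indexing anything. The Java sketch below is only an illustration of that idea: the probe is supplied by the caller (here a stand-in that succeeds on its third call; a real one might issue an HTTP request to the node and report whether it answered), and the class and method names are hypothetical.

```java
import java.util.function.BooleanSupplier;

final class WaitForServerSketch {
    // Calls the probe up to maxAttempts times, sleeping between attempts.
    // Returns true as soon as the probe succeeds, false if it never does.
    static boolean waitUntil(BooleanSupplier probe, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in probe: reports "ready" on the third call.
        int[] calls = {0};
        boolean up = waitUntil(() -> ++calls[0] >= 3, 5, 10);
        System.out.println(up + " after " + calls[0] + " probes");
    }
}
```

Only when `waitUntil` returns true would the harness go on to create the index and put documents.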
