|
|
Hi,
I have the following data in ES:
{
  "title": "title1 of project",
  "organization": "XYZ company"
},
{
  "title": "title2 of project",
  "organization": "XYZ company"
},
{
  "title": "title3 of project",
  "organization": "ABC company"
}
I need the count of organizations as follows:
"ABC company": 1
"XYZ company": 2
I tried using facets, but facets give the count of individual words:
curl -X POST http://localhost:9200/testcompany/activity/_search?pretty=true -d '
{ "query" : {"match_all": {} },
"facets" : {"organization" : {"terms" : {"field": "organization"}}}}'
gives
"facets" : {
  "organization" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 6,
    "other" : 0,
    "terms" : [ {
      "term" : "company",
      "count" : 3
    }, {
      "term" : "xyz",
      "count" : 2
    }, {
      "term" : "abc",
      "count" : 1
    } ]
  }
I don't know whether there is an option that makes the terms facet count whole phrases rather than individual words. I searched around but couldn't find anything.
Thanks
Anjesh.
|
|
It basically depends on the mapping. You have used the default standard
analyzer; instead of that you need to use the keyword analyzer.
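To make the difference concrete, here is a small plain-Python simulation. It does not call ElasticSearch; `standard_like`, `keyword_like` and `terms_facet` are hypothetical helpers that only approximate what the two analyzers and the terms facet do, using the three documents from the original post.

```python
import re
from collections import Counter

docs = [
    {"title": "title1 of project", "organization": "XYZ company"},
    {"title": "title2 of project", "organization": "XYZ company"},
    {"title": "title3 of project", "organization": "ABC company"},
]

def standard_like(text):
    # Rough stand-in for the standard analyzer: split into words, lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]

def keyword_like(text):
    # Stand-in for the keyword analyzer: the whole value is one token.
    return [text]

def terms_facet(documents, field, analyze):
    # For each token, count the documents whose field produced it.
    counts = Counter()
    for doc in documents:
        counts.update(set(analyze(doc[field])))
    return dict(counts)

word_counts = terms_facet(docs, "organization", standard_like)
phrase_counts = terms_facet(docs, "organization", keyword_like)
print(word_counts)
print(phrase_counts)
```

With the word-splitting analyzer the counts are per word (company 3, xyz 2, abc 1), as in the facet output above; with the single-token analyzer they are per organization name.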
Regards
Jagdeep
On May 2, 6:40 pm, anjesh < [hidden email]> wrote:
> Hi,
>
> I have a following data in ES.
> {
> "title": "title1 of project",
> "organization": "XYZ company"},
>
> {
> "title": "title2 of project",
> "organization": "XYZ company"},
>
> {
> "title": "title3 of project",
> "organization": "ABC company"
>
> },
>
> I need the count of organizations as follows:
> "ABC company":1
> "XYZ company": 2
>
> I tried using facets but facets give the count of words
>
> curl -X POST http://localhost:9200/testcompany/activity/_search?pretty=true -d '
> { "query" : {"match_all": {} },
> "facets" : {"organization" : {"terms" : {"field": "organization"}}}}'
>
> gives
>
> "facets" : {
> "organization" : {
> "_type" : "terms",
> "missing" : 0,
> "total" : 6,
> "other" : 0,
> "terms" : [ {
> "term" : "company",
> "count" : 3
> }, {
> "term" : "xyz",
> "count" : 2
> }, {
> "term" : "abc",
> "count" : 1
> } ]
> }
>
> I have no idea if there are any options which checks for the whole phrase
> than words in the facet terms.
> Tried searching here and there but couldn't find anything.
>
> Thanks
> Anjesh.
|
|
Hi Jagdeep,
I am facing the same problem. Could you give me a mapping example for this? It would be very helpful to me.
thanx
Sumit Gupta
|
|
Hi,
I think that setting 'index' to 'not_analyzed' for the 'organization' field should work the way you want.
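As a sketch of what such a mapping body could look like, the snippet below builds it as a Python dict and prints the JSON. The type name "activity" is taken from the original post's URL and is an assumption here, as is the string field type of ElasticSearch versions of this period. The mapping most likely has to be in place before the documents are indexed, since values that are already indexed are not re-analyzed.

```python
import json

# Assumed names: type "activity" (from the original post), field "organization".
mapping = {
    "activity": {
        "properties": {
            "organization": {
                "type": "string",
                # Index the value as a single, unmodified term.
                "index": "not_analyzed",
            }
        }
    }
}

# JSON body that would be sent to the put mapping API.
body = json.dumps(mapping, indent=2)
print(body)
```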
Best regards.
|
|
Hi Marcin,
"not_analyzed" is not working for getting the phrase count. If you have any idea how to search for the phrase using a facet query, please help me.
Thanx
Sumit Gupta
|
|
First register custom analyzers, using your own configuration in a format like the following, with the index creation API:
{
  "index": {
    "number_of_shards": 5,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "standard1": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "standard", "lowercase" ]
        },
        "keyword1": {
          "type": "custom",
          "tokenizer": "keyword"
        },
        "keyword2": {
          "type": "pattern",
          "pattern": ","
        }
      }
    }
  }
}
Next, map these analyzers to the individual fields of the data you are going to post to your index. Use something like the following with the update mapping API:
{
  "mediasource": {
    "properties": {
      "mediaSourceTypeId": { "index": "analyzed", "type": "integer" },
      "isuName": { "analyzer": "keyword1", "type": "string" },
      "newsCategories": {
        "properties": {
          "category": { "analyzer": "keyword1", "type": "string" },
          "category_words": { "analyzer": "keyword1", "type": "string" },
          "score": { "index": "analyzed", "type": "double" }
        }
      }
    }
  }
}
For your data example, you have to analyze the "organization" field with the keyword analyzer, just as I did for the "isuName" field in my example.
On Thursday, May 3, 2012 3:45:58 PM UTC+5:30, Sumit Gupta wrote:
> Hi Marcin
> for getting the phrase count "not_analyzed" is not working..so if u hv any
> idea for searching the phrase using facet query. please help me..
> Thanx
> Sumit Gupta
--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/need-count-of-terms-using-facets-taking-space-into-account-tp3956699p3958739.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.
|
|
Hi sujoysett,
Thanks for your quick response.
After applying the mapping you described, a search for a phrase gives no hits, and a search for a single term behaves as shown below. I index this document:
curl -XPUT 'http://localhost:9200/my_twitter1/my_tweet/1' -d '{
"user" : "hi hello how",
"post_date" : "2011-09-20T16:20:00",
"message" : "abc xyz def abc xyz def"
}'
Then I run this facet query:
curl -X POST 'localhost:9200/my_twitter1/my_tweet/_search?pretty=true' -d '{
"query": {
"term": {
"message": "abc"
}
},
"facets": {
"message": {
"terms": {
"field": "message"
}
}
}
}'
I get counts for the individual words, like "abc":1, "xyz":1, "def":1, and when I search for "abc xyz" there are no hits.
So please help me: how can I search for "abc xyz" and also get the count for "abc xyz" using a facet query?
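The behaviour described here can be reproduced with a plain-Python simulation (no ElasticSearch involved). It rests on one fact, that a term query is not analyzed and only matches a term exactly as it was indexed; the helper names are hypothetical stand-ins for the analyzers.

```python
import re

message = "abc xyz def abc xyz def"

def standard_like(text):
    # Rough stand-in for the standard analyzer: split into words, lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]

def keyword_like(text):
    # Stand-in for the keyword analyzer: the whole value is one token.
    return [text]

def term_query_hits(indexed_terms, term):
    # A term query is not analyzed: it matches only an identical indexed term.
    return term in indexed_terms

analyzed_index = set(standard_like(message))  # three separate words
keyword_index = set(keyword_like(message))    # one term: the whole message

hits = {
    "analyzed, 'abc'": term_query_hits(analyzed_index, "abc"),
    "analyzed, 'abc xyz'": term_query_hits(analyzed_index, "abc xyz"),
    "keyword, 'abc xyz'": term_query_hits(keyword_index, "abc xyz"),
    "keyword, whole message": term_query_hits(keyword_index, message),
}
print(hits)
```

In this sketch "abc xyz" matches under neither analyzer: the word-splitting one never indexes a two-word term, and the single-token one only indexes the complete message.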
thanx
Sumit Gupta
|
|
Sumit, change the analyzer of this field to keyword, as explained by Sujoy. By
default it is using the standard analyzer.
"message": {
"analyzer": "keyword",
"type": "string"
},
Regards
Jagdeep
On May 3, 6:12 pm, Sumit Guptaa < [hidden email]> wrote:
> Hi sujoysett
> thanks for ur quick response.
>
> after giving the mapping that u define when we search for phrase there is
> no hit and searching for term the result like this...
>
> curl -XPUT 'http://localhost:9200/my_twitter1/my_tweet/1' -d '{
> "user" : "hi hello how",
> "post_date" : "2011-09-20T16:20:00",
> "message" : "abc xyz def abc xyz def"
>
> }'
>
> and when we want to apply this facet query like
>
> curl -X POST 'localhost:9200/my_twitter1/my_tweet/_search?pretty=true' -d '{
> "query": {
> "term": {
> "message": "abc"
> }
> },
> "facets": {
> "message": {
> "terms": {
> "field": "message"
> }
> }
> }
>
> }'
>
> and i m getting the result for all the count of abc like
> "abc":1,"xyz":1,"def":1 and when we search for "abc xyz" ther is no
> hit..
>
> so please help me how i can search for "abc xyz" and also find the count
> "abc xyz" using facet query..
>
> thanx
> Sumit Gupta
|
|
Hi Jagdeep,
Could you please give me an example of searching for "abc xyz" with a facet query that also gives the count for "abc xyz"?
I am unable to search for "abc xyz" using a facet query.
Thanx,
Sumit Gupta
|
|
Sumit,
I do not think you will be able to achieve what you want without
implementing a custom tokenizer. Analyzed fields are tokenized on
whitespace, and keyword analyzers take the whole value as one term
without stemming or splitting. You need a tokenizer that breaks a string
into the different combinations of adjacent words. Something like this
tokenizer must already exist, but I do not think it is part of the
default Lucene/ElasticSearch packages.
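A rough sketch of what such a tokenizer would emit, in plain Python. Lucene calls runs of adjacent words "shingles" and has a ShingleFilter that produces them; whether and how that is exposed in a given ElasticSearch version is something to verify. The functions below are hypothetical and only illustrate the idea of counting word pairs next to single words.

```python
from collections import Counter

def shingles(text, min_size=1, max_size=2):
    # Emit every run of min_size..max_size adjacent words, lowercased.
    words = text.lower().split()
    tokens = []
    for size in range(min_size, max_size + 1):
        for start in range(len(words) - size + 1):
            tokens.append(" ".join(words[start:start + size]))
    return tokens

def facet_counts(messages, min_size=1, max_size=2):
    # Count each token once per message, like the per-document facet counts above.
    counts = Counter()
    for message in messages:
        counts.update(set(shingles(message, min_size, max_size)))
    return counts

counts = facet_counts(["abc xyz def abc xyz def", "abc xyz"])
print(counts["abc xyz"], counts["def abc"], counts["def"])
```

With tokens like these in the index, a term lookup for "abc xyz" would find an indexed term, and the facet would report a count for it.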
Cheers,
Ivan
On Thu, May 3, 2012 at 9:31 AM, Sumit Guptaa < [hidden email]> wrote:
|
|
Hi Ivan,
Could you give me the full implementation for this, so that I can run the facet query on a phrase? Please help me.
Thanx,
Sumit Gupta
|
|
hi
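I managed to get what I was looking for using the following. Thanks, Jagdeep. I am posting it in its entirety; it should work.
curl -XDELETE http://localhost:9200/testcompany/
curl -XPUT http://localhost:9200/testcompany/
curl -XPUT 'http://localhost:9200/testcompany/activity1/_mapping' -d '{
"activity" : {
  "properties" : {
    "organization" : {"analyzer": "keyword", "type": "string"}
  }
}}'
curl -XPUT http://localhost:9200/testcompany/activity1/1 -d '{
"title": "title1 of project",
"organization": "ABC company"
}'
curl -XPUT http://localhost:9200/testcompany/activity1/2 -d '{
"title": "title2 of project",
"organization": "XYZ company"
}'
curl -XPUT http://localhost:9200/testcompany/activity1/3 -d '{
"title": "title3 of project",
"organization": "XYZ company"
}'
curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d '{
"query" : {"match_all":{}},
"facets" : {"organization" : {"terms" : {"field": "organization"}}}
}'
gives
"facets" : {
  "organization" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 3,
    "other" : 0,
    "terms" : [ {
      "term" : "XYZ company",
      "count" : 2
    }, {
      "term" : "ABC company",
      "count" : 1
    } ]
  }
However, now I can't search for ABC in the organization field, as Sumit seems to be asking.
curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d '{
"query" : {"term":{"organization": "ABC"}},
"facets" : {"organization" : {"terms" : {"field": "organization"}}}
}'
gives 0 hits.
But
curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d '{
"query" : {"term":{"organization": "ABC company"}},
"facets" : {"organization" : {"terms" : {"field": "organization"}}}
}'
gives
"facets" : {
  "organization" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 1,
    "other" : 0,
    "terms" : [ {
      "term" : "ABC company",
      "count" : 1
    } ]
  }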
I think something is still missing and I can't figure it out. The search is case sensitive in this case: "abc company" gives no results. I don't fully understand the internals, notably tokens and analyzers. I would appreciate it if somebody could point me to the appropriate posts.
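One way to think about the missing piece, sketched in plain Python rather than against ElasticSearch: keep two views of the same value, an analyzed one for searching and an untouched one for faceting. ElasticSearch versions of this period had a multi_field mapping type for indexing one value several ways, but the exact mapping should be checked against the documentation; everything below is a hypothetical simulation.

```python
import re
from collections import Counter

docs = [
    {"title": "title1 of project", "organization": "ABC company"},
    {"title": "title2 of project", "organization": "XYZ company"},
    {"title": "title3 of project", "organization": "XYZ company"},
]

def analyzed(text):
    # Searchable view: lowercased words, roughly like the standard analyzer.
    return [token.lower() for token in re.findall(r"\w+", text)]

def search_and_facet(documents, word):
    # Match on the analyzed view (the query word is lowercased the same way),
    # then facet on the untouched value so whole names are counted.
    hits = [d for d in documents if word.lower() in analyzed(d["organization"])]
    facet = Counter(d["organization"] for d in hits)
    return len(hits), dict(facet)

print(search_and_facet(docs, "ABC"))
print(search_and_facet(docs, "abc"))
print(search_and_facet(docs, "company"))
```

In this sketch both "ABC" and "abc" find one document, and the facet still reports "ABC company" as a whole name.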
Best Anjesh
|
|
You either have to use the pattern analyzer with the case_insensitive flag, as
explained here:
http://www.elasticsearch.org/guide/reference/index-modules/analysis/pattern-analyzer.html
Or you need to use a regex with the case_insensitive flag in your query
string.
Regards
Jagdeep
On May 5, 10:05 pm, anjesh < [hidden email]> wrote:
> hi
>
> I managed to get what i am looking for using the followings. Thanks
> Jagdeep. I posted in entirety - that should work.
>
> curl -XDELETE http://localhost:9200/testcompany/
> curl -XPUT http://localhost:9200/testcompany/
> curl -XPUT 'http://localhost:9200/testcompany/activity1/_mapping' -d '{
> "activity" : {
> "properties" : {
> "organization" : {"analyzer": "keyword", "type": "string"}
> }
> }}'
>
> curl -XPUT http://localhost:9200/testcompany/activity1/1 -d '{
> "title": "title1 of project",
> "organization": "ABC company"}'
>
> curl -XPUT http://localhost:9200/testcompany/activity1/2 -d '{
> "title": "title2 of project",
> "organization": "XYZ company"}'
>
> curl -XPUT http://localhost:9200/testcompany/activity1/3 -d '{
> "title": "title3 of project",
> "organization": "XYZ company"}'
>
> curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d
> '{
> "query" : {"match_all":{}},
> "facets" : {"organization" : {"terms" : {"field": "organization"}}}
>
> }'
>
> gives
>
> "facets" : {
> "organization" : {
> "_type" : "terms",
> "missing" : 0,
> "total" : 3,
> "other" : 0,
> "terms" : [ {
> "term" : "XYZ company",
> "count" : 2
> }, {
> "term" : "ABC company",
> "count" : 1
> } ]
> }
>
> However now i can't search for ABC in organization field, as Sumit seems to
> be asking.
>
> curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d
> '{
> "query" : {"term":{"organization": "ABC"}},
> "facets" : {"organization" : {"terms" : {"field": "organization"}}}
>
> }'
>
> gives 0 hits.
>
> But
>
> curl -X POST http://localhost:9200/testcompany/activity1/_search?pretty=true -d
> '{
> "query" : {"term":{"organization": "ABC company"}},
> "facets" : {"organization" : {"terms" : {"field": "organization"}}}
>
> }'
>
> gives
>
> "facets" : {
> "organization" : {
> "_type" : "terms",
> "missing" : 0,
> "total" : 1,
> "other" : 0,
> "terms" : [ {
> "term" : "ABC company",
> "count" : 1
> } ]
> }
>
> I think something is still missing there and i can't seem to figure it out.
> The search is case sensitive in this case - "abc company" doesn't give
> results. I don't fully understand the internals - notably tokens,
> analyzers. I would appreciate if somebody could point to the appropriate
> posts.
>
> Best
> Anjesh
>
> On 5 May 2012 11:51, Sumit Guptaa < [hidden email]> wrote:
>
> > hi Ivan
>
> > can u give me the full implementation for this so that i am able to perform
> > the facet query on phrase....please help me...
>
> > Thanx,
> > Sumit Gupta
|
|