How to get the result based on custom sorting in elasticsearch?

santhosh
We have a use case that requires results from Elasticsearch in a custom sort order.
I am using Elasticsearch 5.1.2.

Mapping

client_obj.indices.create(
  :index => 'test',
  :body => {
    :mappings => {
      :texts => {
        :properties => {
          :number => { :type => "integer" },
          :text => { :type => "text", :term_vector => "with_positions_offsets_payloads" }
        }
      }
    }
  }
)

arr = [1,3,200,100,2,10 ...] # 1 million entries

Given the array (arr), I expect Elasticsearch to return the results in the same order as the numbers appear in the array. I used the API call below to get the results. It worked for a small set of numbers, but when the array has more than 500k entries, the functions block in the request grows accordingly and my ES server goes down.

# The from and size values vary based on the page number and page size
GET /test/_search
{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "query": {
        "constant_score": {
          "query": {
            "bool": {
              "must": [
                { "terms": { "number": [1, 3, 200, 100, 2] } },
                { "query_string": { "query": "#{keyword}", "default_field": "text" } }
              ]
            }
          }
        }
      },
      "functions": [
        { "filter": { "term": { "number": 1 } }, "weight": 4 },
        { "filter": { "term": { "number": 3 } }, "weight": 3 },
        { "filter": { "term": { "number": 200 } }, "weight": 2 },
        { "filter": { "term": { "number": 100 } }, "weight": 1 },
        { "filter": { "term": { "number": 2 } }, "weight": 0 }
      ]
    }
  },
  "_source": ["number"],
  "size": 3,
  "from": 0
}
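
To make the size of the request concrete, here is a minimal Ruby sketch of how a functions block like the one above could be built from the array. It assumes the weight is simply the reverse of each number's position in the array, as in the five-number example; build_weight_functions is a hypothetical helper name, not part of the Elasticsearch client.

```ruby
# Hypothetical helper: builds one function_score entry per number.
# Earlier positions in the array get a higher weight, so they rank first.
def build_weight_functions(numbers)
  last = numbers.length - 1
  numbers.each_with_index.map do |number, index|
    { filter: { term: { number: number } }, weight: last - index }
  end
end

arr = [1, 3, 200, 100, 2]
functions = build_weight_functions(arr)
```

The result contains one filter entry per number, so 500k numbers produce 500k filters in a single request.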

I get the error below when the same API is called with 500k numbers:

   [2017-03-30T09:01:08,898][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][169] overhead, spent [269ms] collecting in the last [1s]
[2017-03-30T09:01:14,604][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][172] overhead, spent [3s] collecting in the last [3.6s]
[2017-03-30T09:01:21,718][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][176] overhead, spent [3.3s] collecting in the last [3.8s]
[2017-03-30T09:01:30,561][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][old][179][6] duration [6.1s], collections [1]/[6.5s], total [6.1s]/[12.5s], memory [2.9gb]->[2.8gb]/[2.9gb], all_pools {[young] [266.2mb]->[214mb]/[266.2mb]}{[survivor] [21.2mb]->[0b]/[33.2mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-03-30T09:01:30,565][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][179] overhead, spent [6.1s] collecting in the last [6.5s]
[2017-03-30T09:01:37,033][INFO ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][old][180][7] duration [5s], collections [1]/[5.5s], total [5s]/[17.6s], memory [2.8gb]->[2.9gb]/[2.9gb], all_pools {[young] [214mb]->[266.2mb]/[266.2mb]}{[survivor] [0b]->[1.2mb]/[33.2mb]}{[old] [2.6gb]->[2.6gb]/[2.6gb]}
[2017-03-30T09:01:47,708][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][183] overhead, spent [3s] collecting in the last [3s]
[2017-03-30T09:01:49,939][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][184] overhead, spent [2.2s] collecting in the last [2.2s]
[2017-03-30T09:02:04,746][WARN ][o.e.m.j.JvmGcMonitorService] [BYOiXkA] [gc][185] overhead, spent [5.5s] collecting in the last [5.5s]
[2017-03-30T09:02:24,145][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [] fatal error in thread [elasticsearch[BYOiXkA][search][T#4]], exiting
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.FixedBitSet.<init>(FixedBitSet.java:115) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]
        at org.apache.lucene.util.DocIdSetBuilder.upgradeToBitSet(DocIdSetBuilder.java:235) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]
        at org.apache.lucene.util.DocIdSetBuilder.grow(DocIdSetBuilder.java:178) ~[lucene-core-6.3.0.jar:6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02 19:47:11]


I have the following questions:

Is there any other way to solve this problem?
Can we use a script in the ES API to solve this problem? If yes, how?
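
To illustrate what a script-based variant might look like, here is an untested Ruby sketch that builds a request body with a single script_score function and a lookup table passed as params, instead of one filter per number. Everything in it is an assumption rather than a confirmed solution: build_script_sorted_body is a hypothetical helper, the Painless expression has not been run against 5.1.2, and a table of 500k entries may still be too large for the request.

```ruby
require 'json'

# Hypothetical helper: ranks hits by their position in `numbers`
# using one script_score function (an assumption, not a verified fix).
def build_script_sorted_body(numbers, keyword, from: 0, size: 3)
  last = numbers.length - 1
  # JSON object keys are strings, so the lookup table is keyed by number.to_s.
  weights = numbers.each_with_index.map { |n, i| [n.to_s, last - i] }.to_h
  {
    query: {
      function_score: {
        boost_mode: 'replace',
        query: {
          bool: {
            must: [
              { terms: { number: numbers } },
              { query_string: { query: keyword, default_field: 'text' } }
            ]
          }
        },
        script_score: {
          script: {
            lang: 'painless',
            # Assumed Painless expression: look up the weight for this document's number.
            inline: "params.weights[String.valueOf(doc['number'].value)]",
            params: { weights: weights }
          }
        }
      }
    },
    _source: ['number'],
    size: size,
    from: from
  }
end

body = build_script_sorted_body([1, 3, 200, 100, 2], 'some keyword')
json = JSON.generate(body)
```

The generated body could then be passed as the :body of a search call, in the same way the mapping is passed to indices.create above.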

Please help me solve this problem.