Performance of term query with sorting

Ankit Jain
Hi All,

We are using Elasticsearch 0.20.2 on five nodes, each with 32 GB RAM and 8 cores.

We have indexed 60 million records (100 GB of data) into ES. We need to run a term query with sorting.

A term query without sorting returns results in 3 seconds, but the same term query with sorting takes around 150 seconds.
Could somebody please explain the mechanism Elasticsearch follows when executing the query, both with and without sorting?

We have already optimized the index via an external call.

Also, please suggest some optimization parameters.

Thanks,
Ankit

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Performance of term query with sorting

q42jaap
Ankit,

Could you please give some insight into the layout of your data and what the query looks like?
How big is the result set your query hits (total_hits)?

For more insight into how querying is done, have a look at:

I believe the default search type is query_then_fetch, however this is missing in the documentation.
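
For context, here is a rough sketch in plain Java (no Elasticsearch dependency) of how the two search types differ in the number of full documents that get loaded. It relies on the documented behaviour that with query_and_fetch every shard returns its own `size` hits, while with query_then_fetch the shards first return only ids and sort values and the gathering node then fetches the global top `size` documents. The class name, method names and the shard count of 15 are assumptions for illustration; the thread does not say how many shards the indices have.

```java
public class SearchTypeSketch {

    /** Upper bound on full documents loaded and returned with query_and_fetch. */
    static long docsFetchedQueryAndFetch(int shards, int size) {
        // Every shard loads and returns its own top `size` documents.
        return (long) shards * size;
    }

    /** Upper bound on full documents loaded with query_then_fetch. */
    static long docsFetchedQueryThenFetch(int shards, int size) {
        // Only the global top `size` documents are fetched in the second phase.
        return size;
    }

    /** Lightweight (id + sort value) entries merged on the gathering node. */
    static long sortEntriesMerged(int shards, int size) {
        return (long) shards * size;
    }

    public static void main(String[] args) {
        int shards = 15;   // assumption: 3 indices x 5 shards each
        int size = 10000;  // the size discussed in this thread

        System.out.println("query_and_fetch docs:  " + docsFetchedQueryAndFetch(shards, size));
        System.out.println("query_then_fetch docs: " + docsFetchedQueryThenFetch(shards, size));
        System.out.println("sort entries merged:   " + sortEntriesMerged(shards, size));
    }
}
```

Under these assumptions, query_and_fetch would load up to 150000 documents against 10000 for query_then_fetch, so the search type is worth checking when the requested size is large.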


Jaap Taal
 
[ Q42 BV | tel 070 44523 42 | direct 070 44523 65 | http://q42.nl | Waldorpstraat 17F, Den Haag | Vijzelstraat 72 unit 4.23, Amsterdam | KvK 30164662 ]

Re: Performance of term query with sorting

Ankit Jain

Hi Jaap,

We have indexed 60 million records; each record contains 31 columns (rowId, c0, c1, c2, c3, ..., c29).

Below is our index mapping:

{"ipdr":{"_source":{"enabled":false},"properties":{"c0":{"type":"long","store":"yes"},"c1":{"type":"string"},"c11":{"type":"string"},"c12":{"type":"string"},"c13":{"type":"string"},"c14":{"type":"string"},"c15":{"type":"string"},"c16":{"type":"string"},"c17":{"type":"string"},"c18":{"type":"string"},"c19":{"type":"string"},"c2":{"type":"string"},"c20":{"type":"string"},"c21":{"type":"string"},"c22":{"type":"string"},"c23":{"type":"string"},"c24":{"type":"string"},"c25":{"type":"string"},"c26":{"type":"string"},"c27":{"type":"string"},"c28":{"type":"string"},"c29":{"type":"string"},"c3":{"type":"string"},"c30":{"type":"string"},"c4":{"type":"string"},"c5":{"type":"string"},"c6":{"type":"string"},"c7":{"type":"string"},"c8":{"type":"string"},"c9":{"type":"string"},"rowId":{"type":"string","index":"no","store":"yes"}}}}


Below is a sample of the code we use to retrieve 10000 records from Elasticsearch.

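import static org.elasticsearch.index.query.QueryBuilders.termQuery;

import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.sort.SortOrder;

// "client" is an already connected org.elasticsearch.client.Client field.
public void SearchQuery() {
    // Timing start; assumed to be taken at method entry.
    long start = System.currentTimeMillis();

    // Match documents whose c29 field contains the term "udp".
    QueryBuilder queryBuilder = QueryBuilders.boolQuery()
            .must(termQuery("c29", "udp"));

    // Search three indices, asking for up to 10000 hits sorted by c0 descending.
    SearchRequestBuilder searchRequestBuilder = client
            .prepareSearch("89854", "89855", "89853")
            .setSearchType(SearchType.QUERY_AND_FETCH)
            .setQuery(queryBuilder)
            .setSize(10000);
    searchRequestBuilder.addSort("c0", SortOrder.DESC);

    SearchResponse response = searchRequestBuilder.execute().actionGet();
    SearchHits hits = response.getHits();
    System.out.println("Total Hits : " + hits.getTotalHits()); // output is 2

    int i = 0;
    for (SearchHit hit : hits) {
        System.out.println("id = " + hit.getId() + " " + i++); // prints the id and position of each hit
    }

    long diff = System.currentTimeMillis() - start;
    System.out.println("Time Taken in millisecs = " + diff);
    System.out.println("Done");
}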


Thanks & Regards,
Ankit Jain

Re: Performance of term query with sorting

dadoonet
If you need to extract that many records, prefer the scan & scroll feature.
You probably don't need to display so many results to a single user, do you?

That said, when sorting, ES has to load all values of your field c0 and sort them on each shard. Then the result set is sorted again on the gathering node.
That could explain things here if c0 has 60,000,000 different values!
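
To make the mechanism concrete, here is a minimal sketch in plain Java (no Elasticsearch dependency) of a per-shard sort followed by a merge on the gathering node. It is only a model of the idea described above, not Elasticsearch's actual implementation. The class name, the method names and the tiny sample arrays are assumptions for illustration. The last helper gives a rough lower bound on the memory needed to hold one 8-byte long value per document, ignoring any per-segment overhead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class ShardSortSketch {

    /** Top-n values of one shard in descending order, using a bounded min-heap. */
    static List<Long> topNOnShard(long[] shardValues, int n) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // smallest kept value sits on top
        for (long v : shardValues) {
            heap.offer(v);
            if (heap.size() > n) {
                heap.poll(); // drop the smallest so only the n largest remain
            }
        }
        List<Long> top = new ArrayList<>(heap);
        top.sort(Comparator.reverseOrder());
        return top;
    }

    /** Merge the per-shard results on the gathering node and keep the global top n. */
    static List<Long> mergeOnCoordinator(List<List<Long>> perShard, int n) {
        List<Long> all = new ArrayList<>();
        for (List<Long> shardTop : perShard) {
            all.addAll(shardTop);
        }
        all.sort(Comparator.reverseOrder());
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }

    /** Rough size in bytes of holding one long sort value per document. */
    static long fieldValuesBytes(long docCount) {
        return docCount * Long.BYTES;
    }

    public static void main(String[] args) {
        // Hypothetical c0 values spread over three shards.
        List<List<Long>> perShard = new ArrayList<>();
        perShard.add(topNOnShard(new long[] {5, 42, 17}, 2));
        perShard.add(topNOnShard(new long[] {99, 3, 8}, 2));
        perShard.add(topNOnShard(new long[] {23, 64, 1}, 2));

        System.out.println(mergeOnCoordinator(perShard, 2));             // prints [99, 64]
        System.out.println(fieldValuesBytes(60_000_000L) + " bytes");    // prints 480000000 bytes
    }
}
```

For 60 million documents this comes to roughly 480 MB of sort values across the cluster that must be loaded before the first sorted query can return, which may account for part of the gap between the sorted and unsorted timings.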

-- 
David Pilato | Technical Advocate | Elasticsearch.com

Re: Performance of term query with sorting

anki0808
This post has NOT been accepted by the mailing list yet.
In reply to this post by Ankit Jain
Hi guys,

I am having the same problem: a query sorted by date in descending order takes 4 seconds.
Earlier it used to take almost 200 ms.
I am also using routing, so it fetches data from a single shard only.
Can you help me find a solution?