Test Harness for ElasticSearch

Test Harness for ElasticSearch

pulkitsinghal
Is there any generic test harness available for configuring and
running different types of queries with concurrent users against our
own datasets? I want to check with the community at large before I
start rolling my own :)

Re: Test Harness for ElasticSearch

Ronak Patel
I built my own for this, using a local node to handle basic integration testing.
You can probably fire up your backend webapp (the one that talks to ES) and use something like Apache JMeter to handle load testing against your webapp.
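
For reference, a minimal sketch of that local-node approach, written against the 0.x-era Java API that was current at the time (NodeBuilder, ImmutableSettings); the index name, field, and settings here are placeholders, and the exact method names should be checked against your ES version:

import static org.elasticsearch.node.NodeBuilder.nodeBuilder;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.node.Node;

public class LocalNodeSmokeTest {
    public static void main(String[] args) {
        // Start an in-JVM node that does not join any external cluster.
        Node node = nodeBuilder()
                .local(true)
                .settings(ImmutableSettings.settingsBuilder()
                        .put("gateway.type", "none"))   // keep the test node stateless
                .node();
        Client client = node.client();

        // Index one document and make it visible to search immediately.
        client.prepareIndex("products", "product", "1")
                .setSource("{\"product_name\": \"red widget\"}")
                .execute().actionGet();
        client.admin().indices().prepareRefresh("products").execute().actionGet();

        // Run the query under test and check the result.
        SearchResponse response = client.prepareSearch("products")
                .setQuery(QueryBuilders.termQuery("product_name", "widget"))
                .execute().actionGet();
        System.out.println("hits: " + response.getHits().getTotalHits());

        node.close();
    }
}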

Re: Test Harness for ElasticSearch

Nick Dimiduk
In reply to this post by pulkitsinghal
I just threw together a simple multi-threaded load script. It makes assumptions about our deployment all over the place, but does the job. I looked briefly at JMeter and will likely move to that tool when I find a few minutes to study its use.

-n
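
For anyone curious, a bare-bones sketch of such a multi-threaded load script in plain Java; the URL, query body, thread count, and request count are placeholders, and it only reports a crude average latency:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class SimpleSearchLoad {
    private static final String SEARCH_URL = "http://localhost:9200/products/_search";
    private static final String QUERY = "{\"query\":{\"match_all\":{}}}";

    public static void main(String[] args) throws Exception {
        int threads = 10;
        final int requestsPerThread = 100;
        final AtomicLong totalMillis = new AtomicLong();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(new Runnable() {
                public void run() {
                    for (int i = 0; i < requestsPerThread; i++) {
                        long start = System.currentTimeMillis();
                        try {
                            // Fire one search request and wait for the response.
                            HttpURLConnection conn =
                                    (HttpURLConnection) new URL(SEARCH_URL).openConnection();
                            conn.setRequestMethod("POST");
                            conn.setDoOutput(true);
                            OutputStream out = conn.getOutputStream();
                            out.write(QUERY.getBytes("UTF-8"));
                            out.close();
                            conn.getResponseCode();
                            conn.getInputStream().close();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                        totalMillis.addAndGet(System.currentTimeMillis() - start);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);

        long requests = (long) threads * requestsPerThread;
        System.out.println("avg latency (ms): " + (totalMillis.get() / requests));
    }
}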

Re: Test Harness for ElasticSearch

Otis Gospodnetic
In reply to this post by pulkitsinghal
Hi,

Have you considered simply using JMeter?
We use it regularly when doing performance testing against
ElasticSearch or Solr.

Otis
--
Sematext is Hiring World-Wide -- http://sematext.com/about/jobs.html


Re: Test Harness for ElasticSearch

pulkitsinghal
Yup, JMeter seems to be the go-to solution; I've started on it. But if there is any other advice, please keep the comments coming :)

Re: Test Harness for ElasticSearch

Michael Sick
I've used SoapUI a good deal for creating SOAP-based test clients, and the same organization sponsors LoadUI, which leverages your base tests for load testing and reporting. They claim decent support for REST. I can't vouch for it, but it's probably worth a look.

Re: Test Harness for ElasticSearch

Jan Fiedler
In reply to this post by pulkitsinghal
I have been using XLT (http://www.xceptance.com/products/xlt/what-is-xlt.html) for all sorts of load testing. It's especially nice if your home (and preferred ES API) is Java, since you write your load scripts in Java as JUnit tests.
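
To give a flavour of that style, here is a plain JUnit 4 sketch of the kind of test one would write; the XLT-specific base classes and load-profile configuration are left out (I can't vouch for their exact API), and the URL and index name are placeholders. The load tool then runs many instances of such a test concurrently and collects the timings.

import static org.junit.Assert.assertEquals;

import java.net.HttpURLConnection;
import java.net.URL;

import org.junit.Test;

public class ProductSearchTest {

    @Test
    public void matchAllReturnsOk() throws Exception {
        // One user action: run a search and check that the cluster answers.
        URL url = new URL("http://localhost:9200/products/_search?q=*:*");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        assertEquals(200, conn.getResponseCode());
        conn.getInputStream().close();
    }
}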

Re: Test Harness for ElasticSearch

pulkitsinghal
I am aiming for the ability to reuse a template across specific datasets and user workflows, to find out the performance of each unique ecosystem (data + types of queries + # of parallel queries for each type + users + use-cases, etc.) and to come up with a formula so that one is ready to scale ES.

To that end, everyone's suggestions have been really useful. So far I count:
- JMeter
- SoapUI
- XLT

I will get started on a JMeter-based test-harness template and will post back here when I have a prototype. If someone does the same using the other tools, I would be thankful and welcome any sharing on your part too :)

Cheers!

Re: Test Harness for ElasticSearch

pulkitsinghal
One criterion for testing is to assume that a certain number of users searching against a particular field (for example, product_name) in parallel are all using unique terms.

Therefore, I would like to use my search index to gather all the unique terms seen during indexing for a particular field, and use that as a dictionary in a JMeter test: assign unique search words to different threads/users and see what performance looks like in a scenario where users don't get the benefit of searching for terms whose results have already been cached.

I think the Lucene toolkit already provides a SpellChecker module that does something similar, but I'm wondering:
1) Does ElasticSearch already have this capability baked in somewhere, so that it is as easy as making the right call? If so, please point me to it.
2) If I pointed Lucene's SpellChecker module at an ES-built index, can I expect to simply read it without any locking issues while ES is running?
3) Rather than locating the index, building a path, and feeding it in as a Directory to SpellChecker, would there happen to be a better integration point from which to leverage the ES-built indices in code?

Thanks!
- Pulkit

Re: Test Harness for ElasticSearch

Otis Gospodnetic
Hi,

If your goal is really to build a dictionary of unique terms, then I would not even think about the Lucene SpellChecker, because I can think of two simple ways of getting this dictionary:

1) Look for the "words" file on any Linux machine and use that. Here is output from my local machine:
$ tail -2 /usr/share/dict/words
étude's
études
$ wc -l /usr/share/dict/words
98569 /usr/share/dict/words

You could point JMeter to that.

2) Just run a *:*-style scan query against your ES cluster, return some text field, store it, and then parse it with something that puts one word per line to feed to JMeter (a rough sketch of this approach follows below).
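
A rough sketch of option 2 with the Java client of that era (SearchType.SCAN plus scrolling); the index name, field name, and output path are placeholders, and the exact getter names may differ between versions. The resulting one-word-per-line file can then be fed to JMeter's CSV Data Set Config so each thread picks up a different term.

import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.HashSet;
import java.util.Set;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

public class DictionaryBuilder {

    // Dumps every distinct word from the given field, one per line, for JMeter.
    public static void dump(Client client, String index, String field, String outFile) throws Exception {
        Set<String> words = new HashSet<String>();

        // Open a scan over all documents, returning only the field we care about.
        SearchResponse scan = client.prepareSearch(index)
                .setSearchType(SearchType.SCAN)
                .setScroll(TimeValue.timeValueMinutes(1))
                .setQuery(QueryBuilders.matchAllQuery())
                .addField(field)
                .setSize(100)                       // per shard, per scroll round-trip
                .execute().actionGet();

        while (true) {
            SearchResponse page = client.prepareSearchScroll(scan.getScrollId())
                    .setScroll(TimeValue.timeValueMinutes(1))
                    .execute().actionGet();
            if (page.getHits().getHits().length == 0) {
                break;                              // scrolled through everything
            }
            for (SearchHit hit : page.getHits()) {
                if (hit.field(field) == null) {
                    continue;                       // document has no value for this field
                }
                Object value = hit.field(field).getValue();
                if (value == null) {
                    continue;
                }
                for (String word : value.toString().toLowerCase().split("\\W+")) {
                    if (word.length() > 0) {
                        words.add(word);
                    }
                }
            }
            scan = page;                            // carry the scroll id forward
        }

        PrintWriter out = new PrintWriter(new FileWriter(outFile));
        for (String word : words) {
            out.println(word);
        }
        out.close();
    }
}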

Also, non-repeating queries are not common, so make sure this is really what would happen in your environment.
Plus, using only terms from your index may also not be realistic - people sometimes use words that do not exist in the index (and get 0 hits).

Otis
--
Hiring ElasticSearch Engineers World-Wide -- http://sematext.com/about/jobs.html#search