some ES stats (at yfrog.com)

jacque74
Hello all, I thought I would share some stats on how we are using
ElasticSearch.

At yfrog, we are indexing a corpus of tweets that users post in
conjunction with photos uploaded to Twitter from mobile and web.  Our
total index is about 950 million docs.  We run 500 shards across 12
servers; some are 8-core x 16GB boxes, and some are 12-core x 32GB.
Routing allows us to provide per-user timeline search, so that each
user's data always lives in a single shard.  Here is a screenshot, by the way:
http://a.yfrog.com/img703/8096/xil.png
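
To make the routing bit concrete: we pass the same routing value at
index time and at search time, so all of a user's docs land in (and
are searched from) a single shard.  A rough sketch; the index, type,
and field names here are illustrative, not our actual schema:

    # index a tweet, routed by user id
    curl -XPUT 'http://localhost:9200/tweets/tweet/1?routing=jack' -d '{
        "user" : "jack",
        "text" : "testing a photo upload"
    }'

    # pass the same routing value so the search hits only that user's shard
    curl -XGET 'http://localhost:9200/tweets/tweet/_search?routing=jack' -d '{
        "query" : { "query_string" : { "query" : "photo" } }
    }'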

Each shard is about 1GB, which puts the total data at about 1TB, half
of which is replica data (i.e. 1 replica).  The other interesting
part is that each server runs 4x2TB disks with ZFS over them.  So far
we have had no issues with ZFS on Linux, and the servers have seen a
definite performance improvement.
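
The ZFS setup itself is nothing exotic; roughly something like the
following, though the device names are illustrative and whether you
stripe or use raidz depends on your redundancy needs:

    # one pool across the four 2TB disks (device names illustrative)
    zpool create esdata /dev/sdb /dev/sdc /dev/sdd /dev/sde
    # skip access-time updates, a common tuning for search workloads
    zfs set atime=off esdata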

We are currently indexing 3 million new docs per day (new tweets).  So
far the main challenge has been keeping the number of open file
descriptors in check; we are basically running with ulimit -n 256000.
Merging each shard down to an appropriate number of segments is always
a must, but it significantly taxes the CPU, and it slows down bulk
indexing should we ever try to re-index the whole corpus.
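
For concreteness, the two knobs in question look roughly like this
(the index name and segment count are illustrative, not a
recommendation):

    # raise the open-files limit in the shell that launches the node
    ulimit -n 256000

    # merge each shard of an index down to fewer segments (CPU heavy!)
    curl -XPOST 'http://localhost:9200/tweets/_optimize?max_num_segments=5'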

On the other hand, as you can see from the screenshot, the read
(search) performance is just awesome.  You can try some searches
yourself at
http://yfrog.com/user/TechCrunch/profile

-Jack

Re: some ES stats (at yfrog.com)

Clinton Gormley-2
Hiya Jack

On Sun, 2012-02-19 at 21:58 -0800, Jack Levin wrote:
> Hello all, I thought I would share some stats on how we are using
> ElasticSearch.

Great post - thanks for sharing.

> The other interesting part is that each server runs 4x2TB disks with
> ZFS over them.  So far we have had no issues with ZFS on Linux, and
> the servers have seen a definite performance improvement.

That's very interesting - I hadn't realised that native ZFS was
available for Linux.  A link for the interested: http://zfsonlinux.org/

clint

Re: some ES stats (at yfrog.com)

dadoonet
In reply to this post by jacque74
Thanks for sharing this.

It would be nice to add it here: http://www.elasticsearch.org/users/

David ;-)
@dadoonet


Re: some ES stats (at yfrog.com)

kimchy
Administrator
In reply to this post by jacque74
Jack, thanks so much for sharing the data! It's really valuable for people to share this type of info with other Elasticsearch users.

Regarding the segments: by default, Elasticsearch uses the tiered merge policy (http://www.elasticsearch.org/guide/reference/index-modules/merge.html). The important thing to remember about it is that it puts an (estimated) maximum size bound on a segment. The setting is max_merged_segment, and it defaults to 5gb (the same as in Lucene). This can cause many segments to be created for a large index. I have been going back and forth on possibly raising this default, which would mean more merging work, but fewer resources used afterwards: fewer file descriptors, less memory, and faster searches.
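
For example, the bound can be changed on a live index through the update settings API; a sketch, where the index name and the 10gb value are purely illustrative:

    curl -XPUT 'http://localhost:9200/tweets/_settings' -d '{
        "index.merge.policy.max_merged_segment" : "10gb"
    }'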

One thing you can use to gauge this is the segments API, which returns detailed data on the segments used in each shard of an index. Lukas has been doing an amazing job with BigDesk, and I hope we will be able to improve on it and move it towards providing "index"-level stats on top of the node-level stats it provides today. One of the things we will do then is provide a nice visualization of the segments each shard has, so people can decide what values to set (all of them can be set in real time on a live index), or more easily judge when to issue optimize calls.
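
Checking the segments of an index is a single call (index name illustrative again):

    curl -XGET 'http://localhost:9200/tweets/_segments?pretty=true'

It returns, for each shard, every segment with its doc count, size on disk, and whether it is committed and searchable.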


Re: some ES stats (at yfrog.com)

Nick Dimiduk
In reply to this post by jacque74
Hi Jack,

Thanks for sharing! I'm curious: can you provide any throughput
numbers? I.e., how many concurrent user requests are you able to serve
from this cluster, and what are the rough latency figures? Also, do you
use the geospatial capabilities at all?

Thanks,
Nick
