Scaling to 150k/sec


Scaling to 150k/sec

Janet Sullivan

I currently have a proof of concept cluster handling about 12000 msgs/sec.  If a certain project kicks off, I would need to scale 10+ times.  Is anyone successfully running ES at over 150k/sec?  What kind of shard layout are you using, how many data nodes, etc?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7e31b4346a994d20939c349fb3c002fd%40BN1PR07MB039.namprd07.prod.outlook.com.
For more options, visit https://groups.google.com/d/optout.

Re: Scaling to 150k/sec

jplock
We've gotten a cluster up to 40K/sec using roughly 40 nodes.  We're going to switch over to using dedicated master nodes as well.  This is all in AWS.

-Justin


Re: Scaling to 150k/sec

Mark Walkom
We are getting ~15K/s with 12 data + 3 master nodes, running the latest versions of Java and ES.

Some things to try would be:
  • Java 8
  • G1GC
  • SSD storage
  • tweaking various index and pool caches
  • optimising your data inputs, mappings etc
We're trialling the first two of those on our dev cluster, but it doesn't do much traffic, so I can't yet comment empirically on its capabilities at the levels you're after.

What does your current setup (i.e. infrastructure) look like?


Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com



Re: Scaling to 150k/sec

John Arnold (GNS)
We haven't tried Java 8 or changing the garbage collector yet; I've heard mixed results on GC. We're using SSD storage on Azure and have a fairly tweaked-out config. I was thinking about turning off any kind of analysis in the template to get it to scale...


Re: Scaling to 150k/sec

Mark Walkom
There's a bunch of kernel/OS tweaks you can apply as well, e.g. noatime and nodiratime if you mount the ES dir on its own, or some of these: http://namhuy.net/1563/how-to-tweak-and-optimize-ssd-for-ubuntu-linux-mint.html
Then indices.store.throttle.max_bytes_per_sec might be worth looking at; you can increase that from the 20mb default.
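
For readers who want to try the throttle change, here is a minimal Python sketch of the request it would take on a 1.x-era cluster like the ones discussed in this thread. The endpoint `http://localhost:9200` and the `100mb` value are assumptions for illustration, not figures from the thread, and later Elasticsearch releases changed how merge throttling works, so check the documentation for your version. The script only builds the request; it does not send it.

```python
import json
import urllib.request

# Assumed endpoint; the thread does not say where any cluster listens.
ES_URL = "http://localhost:9200"


def throttle_request(max_bytes_per_sec, base_url=ES_URL):
    """Build (but do not send) a PUT /_cluster/settings request."""
    body = {
        # "transient" settings do not survive a full cluster restart.
        "transient": {
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec
        }
    }
    return urllib.request.Request(
        url=base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = throttle_request("100mb")  # illustrative value only
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply it against a live cluster: urllib.request.urlopen(req)
```

Using a transient setting keeps the experiment easy to undo while you measure whether the higher limit actually helps.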

I know there'd be a bunch of people that would be interested in your setup and the tweaks you've done if you're interested in putting up a blog post/gist with it.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com



Re: Scaling to 150k/sec

Soumitra Kumar
In reply to this post by John Arnold (GNS)
I got 25K per second using a 3-node 32G EC2 system.

I am also interested in something like 150k/sec of indexing speed, and need help.
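
To put the numbers reported so far side by side, the small Python sketch below normalizes them per node and extrapolates to 150k/sec. The linear extrapolation is a deliberately naive assumption: document size, mappings and hardware differ between these clusters, and the "roughly 40 nodes" figure may or may not include non-data nodes.

```python
import math

# (label, documents per second, node count) as reported in the thread.
# For the 12 data + 3 master cluster only the 12 data nodes are counted.
reports = [
    ("jplock, AWS", 40_000, 40),
    ("Mark Walkom", 15_000, 12),
    ("Soumitra Kumar, EC2", 25_000, 3),
]

TARGET = 150_000  # docs/sec asked about in the original post


def per_node(rate, nodes):
    """Average indexing rate contributed by one node."""
    return rate / nodes


def nodes_needed(rate, nodes, target=TARGET):
    """Naive linear extrapolation; real clusters rarely scale this cleanly."""
    return math.ceil(target * nodes / rate)


for label, rate, nodes in reports:
    print(f"{label}: {per_node(rate, nodes):,.0f} docs/sec/node, "
          f"about {nodes_needed(rate, nodes)} nodes for {TARGET:,}/sec")
```

The spread between the three per-node rates is the main takeaway: the node count needed for 150k/sec depends far more on the workload than on any single rule of thumb.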


Re: Scaling to 150k/sec

Mohit Anchlia
It would be helpful to see the document size along with the parameters.

Re: Scaling to 150k/sec

John Arnold (GNS)
We have a bunch of different things going into elasticsearch, mostly network-related telemetry.  The latest painful one is IPFIX (netflow).  The aggregation we're working on looks like this:

{
  "_index": "ipfix-2014.03.17",
  "_type": "logs",
  "_id": "oA5uhOc9QpCaH9iGPlY7AQ",
  "_score": null,
  "_source": {
    "peer_ip_src": "207.46.32.122",
    "ip_dst": "157.56.106.160",
    "as_src": 12670,
    "mask_dst": 27,
    "as_path": "",
    "ip_src": "92.102.0.0",
    "bytes": 80,
    "port_dst": 3544,
    "mask_src": 16,
    "stamp_inserted": "2014-03-17 09:18:00",
    "stamp_updated": "2014-03-17 10:31:28",
    "packets": 1,
    "@version": "1",
    "@timestamp": "2014-03-17T12:19:27.244Z",
    "source": "ipfix"
  },
  "sort": [
    1395058767244
  ]
}


There's just a really stupid high volume of really small messages that need little in the way of analysis (which we need to turn off).
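
One way to turn analysis off for documents like this is an index template that maps every string field as `not_analyzed`. The Python sketch below builds such a template body in Elasticsearch 1.x syntax (the version current at the time of this thread; later versions use the `keyword` type instead). The template name `ipfix_not_analyzed` and the `ipfix-*` pattern are assumptions based on the index name in the sample, not a configuration anyone in the thread posted.

```python
import json

# Hypothetical template name, used only for illustration.
TEMPLATE_NAME = "ipfix_not_analyzed"

# Index template body in Elasticsearch 1.x syntax.
ipfix_template = {
    # Assumed pattern; matches daily indices such as ipfix-2014.03.17.
    "template": "ipfix-*",
    "mappings": {
        # "_default_" applies to every document type in matching indices.
        "_default_": {
            "dynamic_templates": [
                {
                    "strings_not_analyzed": {
                        "match": "*",
                        "match_mapping_type": "string",
                        # Index the raw value as a single term, no analyzer.
                        "mapping": {"type": "string", "index": "not_analyzed"},
                    }
                }
            ]
        }
    },
}

# The body would be sent with: PUT /_template/<TEMPLATE_NAME>
print(f"PUT /_template/{TEMPLATE_NAME}")
print(json.dumps(ipfix_template, indent=2))
```

A template only affects indices created after it is installed, so with daily indices the change would take effect from the next day's index.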




Re: Scaling to 150k/sec

Otis Gospodnetic
Hi,

These are nice and small and require no analysis.  Turn off _all, tweak the merge rate, use a high refresh interval, give ES/Lucene a good indexing buffer, look at the transaction log (translog) flush settings, etc., and you should be able to get to 150K/sec without requiring dozens of servers.
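
As a rough illustration of where those knobs live, the Python sketch below collects them using Elasticsearch 1.x setting names. Every value is an illustrative guess rather than a recommendation from this thread, and the `logs` type name is taken from the sample document posted earlier.

```python
import json

# Body for creating one index (PUT /<index-name>), ES 1.x syntax.
create_body = {
    "settings": {
        # Refresh less often than the 1s default to cut segment churn.
        "index.refresh_interval": "30s",
        # Let the translog grow larger before a flush is forced.
        "index.translog.flush_threshold_size": "1gb",
    },
    "mappings": {
        # Disable the catch-all _all field for the "logs" type.
        "logs": {"_all": {"enabled": False}}
    },
}

# Node-level settings that would go in elasticsearch.yml.
node_settings = {
    # Share of heap reserved for the indexing buffer.
    "indices.memory.index_buffer_size": "30%",
    # Merge throttle; 20mb is the default mentioned earlier in the thread.
    "indices.store.throttle.max_bytes_per_sec": "100mb",
}

print(json.dumps(create_body, indent=2))
for key, value in node_settings.items():
    print(f"{key}: {value}")
```

Changing one setting at a time and measuring the indexing rate after each change makes it much easier to tell which of them matters for a given workload.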

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


