Problems with GrayLog2 + ES setup [long]

MichaelGlad
I'm having trouble with my GrayLog2 + ES 0.19.0 installation. It
seems I've hit a wall. Java is CPU bound and there's next to no disk
activity.
As I believe that the cure could be tuning ES or adding hardware
resources, I'm posting to this list.

The indexing capacity I would like is:

* Documents: long syslog messages (160 chars avg) from a busy mail
filter
* Message load exceeding 200 messages/second
* Capacity for storing 60 days' worth of log messages; that's about 1
billion.
* I currently have some 800 million messages on disk, using about 5x140
gigs of disk.
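
As a rough sanity check of the figures above, here is a small Python sketch. It only redoes the arithmetic from the list; treating "gigs" as 10**9 bytes and reading "5x140" as five shards of about 140 GB each are assumptions.

```python
# Back-of-the-envelope check of the capacity figures quoted in the list.
SECONDS_PER_DAY = 24 * 60 * 60

rate_per_sec = 200      # sustained messages per second (from the post)
retention_days = 60     # desired retention period (from the post)

messages_retained = rate_per_sec * SECONDS_PER_DAY * retention_days
print(f"{messages_retained:,} messages over {retention_days} days")

# Current footprint: 5 shards of roughly 140 GB for ~800 million messages.
# Assumption: 1 GB = 10**9 bytes.
disk_bytes = 5 * 140 * 10**9
messages_on_disk = 800_000_000
bytes_per_message = disk_bytes / messages_on_disk
print(f"~{bytes_per_message:.0f} bytes on disk per message")
```

So 200 messages/second for 60 days is indeed about 1 billion messages, and each 160-character message currently costs several times its raw size on disk once indexed.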


My setup (I'm using 64-bit Red Hat Linux):

* One VM running GrayLog2 server + a ES server with no local shards.
* Two VMs with a total of 5 shards distributed with 3 / 2 on each.

The two VMs with shards have 32 gigs of memory and 4 cores in a VMware
environment.

I've applied the following changes to the bin/elasticsearch script:

ulimit -n 60000 # fs.file-max = 131000
ulimit -l unlimited
export ES_HEAP_SIZE=16g
export JAVA_HOME=/usr/java/latest # SUN JRE 1.6.31
export JAVA_OPTS="-Xloggc:/tmp/gc"

and disabled the swap area to prevent swapping.

Java on the two VMs containing shards uses about 12 gigs of memory
and all CPU resources.
The first VM is not significantly loaded.

What could a solution be -- "Kiwi" (kill it with iron) or should I
rather change the ES configuration?

Regards, Michael

Re: Problems with GrayLog2 + ES setup [long]

Radu Gheorghe
Hi Michael,

So what exactly happened? Inserts work slowly, or queries? Or both?

I'm also using Elasticsearch for logging on VMs (although I'm not
using Graylog) and my "wall" was on inserts, and it was heavily
influenced by the storage speed. You kind of need that, since I can
bet your index size is much bigger than the total size of RAM.

Some of the stuff you can do is to increase the number of shards if
you have no replicas configured (although this implies recreating the
index AFAIK). And we've also done the following here:
- increased the refresh interval to 3
- compressed _source
- disabled _all (you need to check first if Graylog uses _all for
searching :D)
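
The three tweaks above could look roughly like the request bodies built below. This is a minimal sketch, not a tested recipe: the type name `message` is hypothetical, `"10s"` is only an example value, and the `_source` `compress` flag is written as I recall it from ES 0.19-era documentation, so check the reference for your version before using it. Nothing is sent to a cluster; the script only prints JSON.

```python
import json

# Hypothetical mapping type name; Graylog2's real type may differ.
TYPE_NAME = "message"

# Body for the update-settings API (PUT /<index>/_settings).
# "10s" is an example value, not a recommendation.
refresh_settings = {"index": {"refresh_interval": "10s"}}

# Mapping fragment that compresses _source and disables _all.
# Assumption: option names as in ES 0.19-era docs.
mapping = {
    TYPE_NAME: {
        "_source": {"compress": True},
        "_all": {"enabled": False},
    }
}

print(json.dumps(refresh_settings))
print(json.dumps(mapping, indent=2))
```

Disabling _all only makes sense if nothing queries it, which is why the check against Graylog's own queries comes first.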


Re: Problems with GrayLog2 + ES setup [long]

Michael Sick
Also, if your inserts are slow, are you using bulk inserts? http://www.elasticsearch.org/guide/reference/api/bulk.html
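
For reference, a bulk request body is newline-delimited JSON: one action line followed by one document line per message, with a trailing newline. The sketch below only builds such a body; the index name `graylog2`, the type name `message`, and the sample documents are hypothetical, and the function name `build_bulk_body` is made up for illustration.

```python
import json

def build_bulk_body(index, doc_type, docs):
    """Return a newline-delimited bulk request body for the given documents."""
    lines = []
    for doc in docs:
        # Action line: tells ES to index the document that follows.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

docs = [
    {"host": "mx1", "message": "example syslog line 1"},
    {"host": "mx1", "message": "example syslog line 2"},
]
body = build_bulk_body("graylog2", "message", docs)
print(body, end="")
```

Sending many documents per request this way avoids one HTTP round trip per log line.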




Re: Problems with GrayLog2 + ES setup [long]

MichaelGlad
In reply to this post by Radu Gheorghe
Hi Radu, inserts are fast enough to keep up with the ~200 incoming
messages/sec without piling up in the Graylog2 queue. The problems are
the serious CPU usage (2 x 3 cores) and very slow query response
(~40-50 seconds). The storage is almost idle.

Per your suggestions, I've now increased the refresh interval to 10 secs
and enabled compression. I'll have to consult the GL2 sources to see if
it uses _all. I've also increased the max Java heap size to 24 out of
the 32G RAM and restarted ES. After having spent 2 x 7 hours of CPU time
starting up, search performance is now acceptable (10 secs) and the two
ES data nodes only use some 10-20% CPU each.

So it seems I'm back to a useful state of the world. Faster query time
would be nice though.

 - Michael


Re: Problems with GrayLog2 + ES setup [long]

MichaelGlad
In reply to this post by Michael Sick
I've looked at the GL2 sources at GitHub, and it seems that the bulk
API is used.

  - Michael


Re: Problems with GrayLog2 + ES setup [long]

Radu Gheorghe
In reply to this post by MichaelGlad
Hi Michael,

Nice to hear things got better :)

Here is some other stuff you might want to try:
- increase the number of replicas (searches are done on replicas as
well, so it might help)
- at off-peak times, optimize your index. This should really help:
http://www.elasticsearch.org/guide/reference/api/admin-indices-optimize.html
- try to store shards on the server that has Graylog as well.
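
The first two suggestions map onto two REST calls, sketched below as (method, path, body) tuples. The index name `graylog2` and the helper names are hypothetical, `max_num_segments=1` is just an example value, and the script only prints the requests rather than sending them. Treat it as an illustration of the shape of the calls; the optimize documentation linked above is the authority on the parameters.

```python
import json
from urllib.parse import urlencode

INDEX = "graylog2"  # hypothetical index name

def replicas_request(index, replicas):
    """Describe an update-settings call that changes the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return ("PUT", f"/{index}/_settings", body)

def optimize_request(index, max_num_segments=1):
    """Describe an optimize call that merges each shard down to N segments."""
    query = urlencode({"max_num_segments": max_num_segments})
    return ("POST", f"/{index}/_optimize?{query}", "")

for method, path, body in (replicas_request(INDEX, 1), optimize_request(INDEX)):
    print(method, path, body)
```

Optimizing is I/O and CPU heavy on large shards, hence the advice to run it off-peak.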


Re: Problems with GrayLog2 + ES setup [long]

kimchy
Administrator
In reply to this post by MichaelGlad
What type of searches were being executed? It certainly might be that the ES process needed more memory to accommodate those (especially if sorting / facets are being used).


Re: Problems with GrayLog2 + ES setup [long]

MichaelGlad
In reply to this post by Radu Gheorghe
Hi, thank you for the suggestions. Could it be that the extreme
CPU usage I'm seeing for a couple of hours following each start-up is ES
doing optimization on its own initiative? My shards are 140 gigs each,
and as I understand it each shard is a Lucene index, so this might be
resource intensive.

  - Michael


Re: Problems with GrayLog2 + ES setup [long]

MichaelGlad
In reply to this post by kimchy
They are relatively simple queries like

 gotit AND glad AND viagra

to see if the filter has caught any viagra spam mails bound for me.



Re: Problems with GrayLog2 + ES setup [long]

Radu Gheorghe
In reply to this post by kimchy
AFAIK Graylog does a "match_all" and sorts by date each time you open
the interface, and then at a certain interval. I guess that's expected
for any "logging" solution.

How would that impact the memory requirements? I mean, I would expect
to need more memory, but by how much?

On Mar 28, 1:55 pm, Shay Banon <[hidden email]> wrote:

> What type of searches were being executed? It certainly might be that the
> ES process needed more memory to accommodate those (especially if sorting /
> facets are being used).
>
>
>
>
>
>
>
> On Tue, Mar 27, 2012 at 8:42 PM, MichaelGlad <[hidden email]> wrote:
> > Hi Raud, inserts are fast enough to keep up with the ~200 incoming
> > messages / sec without piling up
> > in the Graylog2 queue. The problem are the serious CPU usage ( 2 x 3
> > cores ) and very
> > slow query response (~40-50 seconds). The storage is almost idle.
>
> > I've now per your suggestions increased refresh interval to 10 secs
> > and enabled compression.
> > I'll have to consult the GL2 sources to see if it uses _all. I've also
> > increased the max Java heap size to 24 out
> > of the 32G RAM and restarted ES. After having spent 2 x 7 hours of CPU
> > time starting up, search performance
> > is now acceptable (10 secs) and the two ES data nodes only use some
> > 10-20% CPU each.
>
> > So it seems I'm back to a useful state of the world. Faster query time
> > would be nice though.
>
> >  - Michael
>
> > On 27 Mar., 11:52, Radu Gheorghe <[hidden email]> wrote:
> > > Hi Michael,
>
> > > So what exactly happened? Inserts work slowly, or queries? Or both?
>
> > > I'm also using Elasticsearch for logging on VMs (although I'm not
> > > using Graylog) and my "wall" was on inserts, and it was heavily
> > > influenced by the storage speed. You kind of need that, since I can
> > > bet your index size is much bigger than the total size of RAM.
>
> > > Some of the stuff you can do is to increase the number of shards if
> > > you have no replicas configured (although this implies recreating the
> > > index AFAIK). And we've also did here the following:
> > > - increasted the refresh interval to 3
> > > - compressed _source
> > > - disabled _all (you need to check first if Graylog uses _all for
> > > searching :D)
>

Re: Problems with GrayLog2 + ES setup [long]

kimchy
Administrator
Hard to say exactly how much. Basically, when sorting, the values of the sort field are loaded into memory. You can see the current usage in the node stats API, under field cache.
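
For a rough order of magnitude, here is a back-of-envelope sketch in Python. The 800 million messages and the 3 / 2 shard split come from this thread; the 8 bytes per sort value (one 64-bit long per document for a date field) and the equal shard sizes are assumptions, so treat the result as an estimate only and compare it with what the field cache stats actually report.

```python
# Rough estimate of the memory needed to sort by a date field.
# Assumptions (not from the thread): 8 bytes per value, shards of equal size.
BYTES_PER_VALUE = 8          # assumed: one 64-bit long per document
TOTAL_DOCS = 800_000_000     # from the thread: ~800 million messages
SHARDS_PER_NODE = [3, 2]     # from the thread: 5 shards split 3 / 2


def sort_field_bytes(docs, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to hold one sort value per document."""
    return docs * bytes_per_value


def per_node_bytes(total_docs, shards_per_node):
    """Split the estimate across nodes in proportion to their shard count."""
    total_shards = sum(shards_per_node)
    total = sort_field_bytes(total_docs)
    return [total * n // total_shards for n in shards_per_node]


if __name__ == "__main__":
    gib = 1024 ** 3
    print("cluster: %.1f GiB" % (sort_field_bytes(TOTAL_DOCS) / gib))
    for node, size in enumerate(per_node_bytes(TOTAL_DOCS, SHARDS_PER_NODE), 1):
        print("node %d: %.1f GiB" % (node, size / gib))
```

Under these assumptions the sort values alone come to roughly 6 GiB for the cluster, or about 3.6 GiB and 2.4 GiB on the two data nodes, before anything else is cached.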

On Thu, Mar 29, 2012 at 8:25 AM, Radu Gheorghe <[hidden email]> wrote:
AFAIK Graylog does a "match_all" and sorts by date each time you open
the interface, and then at a certain interval. I guess that's expected
for any "logging" solution.

How would that impact the memory requirements? I mean, I would expect
to need more memory, but by how much?



Re: Problems with GrayLog2 + ES setup [long]

Radu Gheorghe
I see. Thanks a lot!

On Mar 29, 5:15 pm, Shay Banon <[hidden email]> wrote:

> Hard to say "how much", basically, when sorting, the values for the field
> are loaded to memory to do it. You can see the current usage of it under
> the node stats API, under field cache.

Re: Problems with GrayLog2 + ES setup [long]

MichaelGlad
In reply to this post by MichaelGlad
Thank you for all your kind help. I've learned a lot about ES.
I've located one reason why my searches are slow: GL2 makes the
following Ruby call each time it returns
to the main screen, e.g. after doing a search:

search("*", :size => 0).total

With close to a billion messages, that takes some time :-)

I've now hacked the source to disable the check, and searches are
much faster. I'll contact the GL2
author and suggest adding a proper option to disable counting the
number of stored log entries
all the time.
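
An alternative to switching the count off completely would be to cache it for a while. The following is only a minimal sketch of that idea in Python, not GL2 code: `CachedCount`, `fetch_total` and `ttl_seconds` are hypothetical names, and `fetch_total` stands in for whatever call runs the slow count query.

```python
import time


class CachedCount:
    """Cache the result of an expensive count for ttl_seconds."""

    def __init__(self, fetch_total, ttl_seconds=300, clock=time.monotonic):
        self._fetch_total = fetch_total   # callable that runs the slow query
        self._ttl = ttl_seconds
        self._clock = clock               # injectable so the cache can be tested
        self._value = None
        self._expires_at = None

    def total(self):
        """Return the cached total, refreshing it once the TTL has passed."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch_total()
            self._expires_at = now + self._ttl
        return self._value
```

With a TTL of a few minutes, the main screen would show a slightly stale total, but the expensive query would run at most once per interval instead of on every page load.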

 - Michael

On 28 Mar., 20:29, MichaelGlad <[hidden email]> wrote:

> It is relatively simple queries like
>
>  gotit AND glad AND viagra
>
> to see if the filter has caught any viagra spammails bound for me.

Re: Problems with GrayLog2 + ES setup [long]

kimchy
Administrator
A faster way to get the total document count is to use the count API (or a search with search_type set to count) with a match_all query. A possibly even faster way is to use the index stats API with docs: http://www.elasticsearch.org/guide/reference/api/admin-indices-stats.html and read the num docs in the primaries stats (that's assuming no filtering based on type is done).
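
As an illustration of the index stats route, here is a small Python sketch that reads the primary document count out of a stats response. The sample response is made up, and its layout (`_all` → `primaries` → `docs` → `count`) is an assumption that should be checked against the output of the ES version actually in use.

```python
import json

# Hypothetical, trimmed index stats response. The field layout
# (_all -> primaries -> docs -> count) is an assumption, not taken
# from this thread.
SAMPLE_STATS = json.loads("""
{
  "_all": {
    "primaries": {"docs": {"count": 800000000, "deleted": 0}}
  }
}
""")


def primary_doc_count(stats):
    """Return the document count on primary shards, or None if absent."""
    try:
        return stats["_all"]["primaries"]["docs"]["count"]
    except (KeyError, TypeError):
        return None


if __name__ == "__main__":
    print(primary_doc_count(SAMPLE_STATS))
```

Reading the primaries rather than the total avoids counting each document once per replica.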

On Sat, Mar 31, 2012 at 12:03 AM, MichaelGlad <[hidden email]> wrote:
Thank you for all your kind help. I've learned a lot about ES.
I've located one reason why my searches are slow: GL2 makes the
following Ruby call each time it returns
to the main screen, e.g. after doing a search:

search("*", :size => 0).total

With close to a billion messages, that takes some time :-)

I've now hacked the source to disable the check, and searches are
much faster. I'll contact the GL2
author and suggest adding a proper option to disable counting the
number of stored log entries
all the time.

 - Michael



Re: Problems with GrayLog2 + ES setup [long]

Radu Gheorghe
We have a slightly different approach. Please let me know if you see
something wrong with it.

Currently, we use the "hits" total returned by every search. So when
you go to the main page, we have to do a "match_all" query sorted by
date to show the latest logs. The "hits" total for that query is the
total number of logs.

Then, when we do a specific search, we still get a number of hits and
we show that number, even though the interface itself will only show
a limited number of results.
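
To make that concrete, here is a minimal Python sketch of the main-page request and of reading the total from the response. The field name `created_at` and the page size of 50 are hypothetical, and the response shown in the usage note is a made-up sample, not real output.

```python
def latest_logs_request(date_field="created_at", page_size=50):
    """Build a search body: match everything, newest first, one page only."""
    return {
        "query": {"match_all": {}},
        "sort": [{date_field: {"order": "desc"}}],
        "size": page_size,
    }


def total_hits(response):
    """Return the total number of matches reported by a search response.

    Older ES versions report hits.total as a plain integer; newer ones
    wrap it in an object with a "value" key, so both shapes are handled.
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return total["value"]
    return total
```

For example, `total_hits({"hits": {"total": 12, "hits": []}})` gives 12 no matter how many documents were actually returned in the page, so the same response serves both the list of latest logs and the overall counter.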

On 1 apr., 00:09, Shay Banon <[hidden email]> wrote:

> A faster way to get the total document is using the count API (or search
> with search_type set to count) with match_all query, yet an even faster way
> is possibly to use index stats API with docs: http://www.elasticsearch.org/guide/reference/api/admin-indices-stats....
> use the num docs on primary stats (that's assuming no filtering based on
> type is done).