stuck thread problem?


stuck thread problem?

Patrick Proniewski
Hello,

I've been running an ELK install for a few months now, and a few weeks ago I noticed a strange behavior: ES had some kind of stuck thread consuming 20-70% of a CPU core. It went unnoticed for days. Then I restarted ES and everything went back to normal, until two weeks later it started consuming CPU doing nothing again. Another restart, and two weeks later, the same problem.

I'm running:

ES 1.1.0 on FreeBSD 9.2-RELEASE amd64.

/usr/local/openjdk7/bin/java -Des.pidfile=/var/run/elasticsearch.pid \
  -Djava.net.preferIPv4Stack=true -server -Xms1g -Xmx2g -Xss256k \
  -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC \
  -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly \
  -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch \
  -Des.config=/usr/local/etc/elasticsearch/elasticsearch.yml \
  -cp /usr/local/lib/elasticsearch/elasticsearch-1.1.0.jar:/usr/local/lib/elasticsearch/*:/usr/local/lib/elasticsearch/sigar/* \
  org.elasticsearch.bootstrap.Elasticsearch

# java -version
openjdk version "1.7.0_51"
OpenJDK Runtime Environment (build 1.7.0_51-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Here is a sample top output showing the faulty thread:

last pid: 37009;  load averages:  0.79,  0.73,  0.71               up 53+18:11:28  11:48:59
932 processes: 9 running, 901 sleeping, 22 waiting
CPU: 10.8% user,  0.0% nice,  0.2% system,  0.0% interrupt, 89.0% idle
Mem: 4139M Active, 1129M Inact, 8962M Wired, 1596M Free
ARC: 5116M Total, 2135M MFU, 2227M MRU, 12M Anon, 133M Header, 609M Other
Swap: 4096M Total, 4096M Free

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
       ../..
74417 elasticsearch  33    0  2791M  2058M uwait   5 153:33 23.97% java{java}    <--
74417 elasticsearch  27    0  2791M  2058M uwait   7  43:02  7.96% java{java}
74417 elasticsearch  27    0  2791M  2058M uwait   6  43:02  7.96% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   1   7:32  2.20% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   5   8:26  1.76% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   7   8:25  1.76% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   5   8:25  1.76% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   5   8:26  1.66% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   7   8:26  1.66% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   4   8:25  1.66% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   6   8:25  1.66% java{java}
74417 elasticsearch  22    0  2791M  2058M uwait   1   8:25  1.66% java{java}
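One way to find out what such a hot thread is doing is to match the OS thread id against the `nid=0x...` field in a `jstack` dump, which prints thread ids in hex. FreeBSD's top doesn't show per-thread ids in this view (`procstat -t <pid>` can list them, though how nid maps to the OS id may differ from Linux). A minimal sketch of the conversion, with a made-up thread id:

```shell
# Hypothetical OS thread id, e.g. from `procstat -t <pid>` or Linux `top -H`
TID=12345
# jstack reports the same number in hex as "nid=0x..."
printf 'nid=0x%x\n' "$TID"
# prints: nid=0x3039
```

Grepping the jstack output for that `nid` string then shows the hot thread's Java stack.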

Nothing to be found in the log files...

Any ideas?

Patrick

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7C4F5EC0-BA5D-4D8C-AB39-EEA4250956DC%40patpro.net.
For more options, visit https://groups.google.com/d/optout.

Re: stuck thread problem?

Mark Walkom
(You should really set Xms and Xmx to be the same.)

But it's not faulty; it's probably just GC, which should be visible in the logs. How much data do you have in your "cluster"?
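If GC were the culprit, GC logging would show it directly. A sketch of standard HotSpot 7 flags that could be added to the java command line (the log path here is a placeholder):

```
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/elasticsearch/gc.log
```

Long or frequent CMS collections in that log would line up with the CPU spikes; an empty log during a spike rules GC out.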

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com




Re: stuck thread problem?

Patrick Proniewski
On 25 August 2014, at 13:51, Mark Walkom <[hidden email]> wrote:

> (You should really set Xms and Xmx to be the same.)

Ok, I'll do this next time I restart.


> But it's not faulty, it's probably just GC which should be visible in the
> logs. How much data do you have in your "cluster"?


NAME                               USED  AVAIL  REFER  MOUNTPOINT
zdata/elasticsearch               4.07G  1.01T  4.07G  /zdata/elasticsearch

It's only daily indices (logstash).
Is GC supposed to last more than 3 days? And I can't find any reference to garbage collection in /var/log/elasticsearch/*.




Re: stuck thread problem?

maf
I see the same problem. We are running 1.1.1 on a 13-node cluster (3 master and 5+5 data). I see stuck threads on most of the data nodes, so I had a look around on one of them. top in thread mode shows:
top - 08:08:20 up 62 days, 18:49,  1 user,  load average: 9.18, 13.21, 12.67
Threads: 528 total,  14 running, 514 sleeping,   0 stopped,   0 zombie
%Cpu(s): 39.0 us,  1.5 sy,  0.0 ni, 59.0 id,  0.2 wa,  0.2 hi,  0.0 si,  0.1 st
KiB Mem:  62227892 total, 61933428 used,   294464 free,    65808 buffers
KiB Swap: 61865980 total,    19384 used, 61846596 free. 24645668 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND                                                                                                   
 3743 elastic+  20   0  1.151t 0.045t 0.013t S 93.4 78.1  17462:00 java                                                                                                      
 3748 elastic+  20   0  1.151t 0.045t 0.013t S 93.4 78.1  17457:55 java                                                                                                      
 3761 elastic+  20   0  1.151t 0.045t 0.013t S 93.1 78.1  17455:21 java                                                                                                      
 3744 elastic+  20   0  1.151t 0.045t 0.013t S 92.7 78.1  17456:55 java                                                                                                      
 1758 elastic+  20   0  1.151t 0.045t 0.013t R  5.9 78.1   3450:01 java                                                                                                      
 1755 elastic+  20   0  1.151t 0.045t 0.013t R  5.6 78.1   3450:05 java                 

So I have four threads consuming way more CPU than the others. The node is only doing a moderate amount of garbage collection. Running jstack, I find that all the stuck threads have a stack dump which looks like this:
Thread 3744: (state = IN_JAVA)
 - java.util.HashMap.getEntry(java.lang.Object) @bci=72, line=446 (Compiled frame; information may be imprecise)
 - java.util.HashMap.get(java.lang.Object) @bci=11, line=405 (Compiled frame)
 - org.elasticsearch.search.scan.ScanContext$ScanFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=8, line=156 (Compiled frame)
 - org.elasticsearch.common.lucene.search.ApplyAcceptedDocsFilter.getDocIdSet(org.apache.lucene.index.AtomicReaderContext, org.apache.lucene.util.Bits) @bci=6, line=45 (Compiled frame)
 - org.apache.lucene.search.FilteredQuery$1.scorer(org.apache.lucene.index.AtomicReaderContext, boolean, boolean, org.apache.lucene.util.Bits) @bci=34, line=130 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=68, line=618 (Compiled frame)
 - org.elasticsearch.search.internal.ContextIndexSearcher.search(java.util.List, org.apache.lucene.search.Weight, org.apache.lucene.search.Collector) @bci=225, line=173 (Compiled frame)
 - org.apache.lucene.search.IndexSearcher.search(org.apache.lucene.search.Query, org.apache.lucene.search.Collector) @bci=11, line=309 (Interpreted frame)
 - org.elasticsearch.search.scan.ScanContext.execute(org.elasticsearch.search.internal.SearchContext) @bci=54, line=52 (Interpreted frame)
 - org.elasticsearch.search.query.QueryPhase.execute(org.elasticsearch.search.internal.SearchContext) @bci=174, line=119 (Compiled frame)
 - org.elasticsearch.search.SearchService.executeScan(org.elasticsearch.search.internal.InternalScrollSearchRequest) @bci=49, line=233 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.search.internal.InternalScrollSearchRequest, org.elasticsearch.transport.TransportChannel) @bci=8, line=791 (Interpreted frame)
 - org.elasticsearch.search.action.SearchServiceTransportAction$SearchScanScrollTransportHandler.messageReceived(org.elasticsearch.transport.TransportRequest, org.elasticsearch.transport.TransportChannel) @bci=6, line=780 (Interpreted frame)
 - org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler.run() @bci=12, line=270 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame)
 - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)

The state varies between IN_JAVA and BLOCKED. I took two stack traces 10 minutes apart and they were identical for the suspect threads.
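That two-snapshot comparison can be scripted. A minimal sketch, with canned data standing in for real `jstack <pid>` output (in practice the two files would be jstack dumps taken minutes apart):

```shell
# Canned stand-in for the first jstack snapshot of one suspect thread;
# replace the heredoc with a real `jstack <pid> > snap1.txt` run.
cat > snap1.txt <<'EOF'
java.util.HashMap.getEntry line=446
java.util.HashMap.get line=405
EOF

# Second snapshot, "taken later"; identical here, as in the report above.
cp snap1.txt snap2.txt

# A stack that is byte-for-byte unchanged across snapshots suggests the
# thread is stuck, not just slow.
if diff -q snap1.txt snap2.txt >/dev/null; then
  echo "thread looks stuck"
else
  echo "thread is making progress"
fi
```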

I assume this could be a very long-running query, but I wonder if it isn't just stuck. Perhaps we are seeing this issue: http://stackoverflow.com/questions/17070184/hashmap-stuck-on-get


Re: stuck thread problem?

Mark Walkom
It'd be worth raising this as an issue on GitHub if you are concerned; at least then the ES devs will see it :)

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com



Re: stuck thread problem?

maf
I did report it: https://github.com/elasticsearch/elasticsearch/issues/7478


Re: stuck thread problem?

maf
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.



Re: stuck thread problem?

Patrick Proniewski
Thank you!


Re: stuck thread problem?

Martijn v Groningen
Hi Patrick,

Did you see the same stuck thread via jstack or the hot threads API that Martin reported? This can only happen if scan search was enabled (by setting search_type=scan in a search request).
If that isn't the case, then something else may be stuck.
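For reference, a scan-type search is what triggers the ScanContext code path seen in the stack traces above. A sketch of what such a request looks like in ES 1.x, with a hypothetical daily index name:

```shell
# Hypothetical logstash daily index; any client sending search_type=scan
# with a scroll timeout exercises the scan/scroll code path.
INDEX="logstash-2014.08.29"
URL="localhost:9200/${INDEX}/_search?search_type=scan&scroll=1m&size=50"
# Print the curl invocation one would run against a live cluster:
echo "curl '${URL}' -d '{\"query\":{\"match_all\":{}}}'"
```

If no client or tool issues scan searches against the cluster, a stuck thread with this stack would be unexpected.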

Martijn





--
Met vriendelijke groet,

Martijn van Groningen


Re: stuck thread problem?

Patrick Proniewski
Hi,

I don't know how to debug a Java process. I hadn't heard of jstack until it was mentioned in this thread.
All I know is what I posted in my first message.

I've restarted ES, and currently I have no stuck thread to investigate. In the meantime, you can teach me how I should use jstack, so next time it happens I'll be ready.



Re: stuck thread problem?

Martijn v Groningen
Hi Patrick,

If this problem happens again, you should execute the hot threads API:
curl localhost:9200/_nodes/hot_threads

Documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads

Just pick a node in your cluster and run that command. This is the equivalent of running jstack on all the nodes in your cluster.

Martijn







Re: stuck thread problem?

Patrick Proniewski
Hi,

Epic fail:

[2014-09-05 22:14:00,043][DEBUG][action.admin.cluster.node.hotthreads] [Bloodsport] failed to execute on node [fLwUGA_eSfmLApJ7SyIncA]
org.elasticsearch.ElasticsearchException: failed to detect hot threads
        at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:103)
        at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:43)
        at org.elasticsearch.action.support.nodes.TransportNodesOperationAction$AsyncAction$2.run(TransportNodesOperationAction.java:146)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalStateException: MBean doesn't support thread CPU Time
        at org.elasticsearch.monitor.jvm.HotThreads.innerDetect(HotThreads.java:90)
        at org.elasticsearch.monitor.jvm.HotThreads.detect(HotThreads.java:75)
        at org.elasticsearch.action.admin.cluster.node.hotthreads.TransportNodesHotThreadsAction.nodeOperation(TransportNodesHotThreadsAction.java:101)
        ... 5 more


On 29 août 2014, at 13:54, Martijn v Groningen wrote:

> Hi Patrick,
>
> I this problem happens again then you should execute the hot threads api:
> curl localhost:9200/_nodes/hot_threads
>
> Documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html#cluster-nodes-hot-threads
>
> Just pick a node in your cluster and run that command. This is the equivalent of running jstack on all the nodes in your cluster.
>
> Martijn
>
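(Editor's note: the endpoint also accepts sampling parameters; a hedged sketch, with parameter names taken from the ES 1.x reference (threads, interval, type) and localhost:9200 as a placeholder for any node in the cluster:

```shell
# Sample the 3 hottest threads per node over a 500ms interval, ranked
# by CPU time. Falls through with a message if no node is reachable.
ES_HOST="localhost:9200"
curl -s "http://${ES_HOST}/_nodes/hot_threads?threads=3&interval=500ms&type=cpu" \
  || echo "no node reachable at ${ES_HOST}"
```
)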


Re: stuck thread problem?

Hieu Nguyen
We have seen a similar issue in our cluster (CPU usage and search time slowly crept up on the master node over a period of one day, until we restarted). Is there an easy way to confirm that it's indeed the same issue mentioned here?

Below is the output of our hot threads on this node (version 0.90.12):

   85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.FilteredDocIdSet$1.get(FilteredDocIdSet.java:65)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:178)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   83.3% (832.6ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#49]'
     9/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     unique snapshot
       org.apache.lucene.search.DisjunctionMaxScorer.afterNext(DisjunctionMaxScorer.java:65)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:139)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:130)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:177)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   80.2% (802.2ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#41]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)



On Thursday, August 28, 2014 11:49:07 PM UTC-7, Martin Forssen wrote:
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
I did report it: https://github.com/elasticsearch/elasticsearch/issues/7478


Re: stuck thread problem?

Martijn v Groningen
Patrick: I have never seen this, but it means the OpenJDK on FreeBSD doesn't support CPU sampling of threads. Not sure, but maybe you can try with an Oracle JDK/JRE? Otherwise, jstack should be used in order to figure out which ES threads are stuck.
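(Editor's note: a sketch of the jstack route, assuming, as on Linux, that jstack's hex "nid" field matches the OS thread id; on FreeBSD, `procstat -t <pid>` lists the thread ids. The thread id below is a hypothetical placeholder:

```shell
# Map a CPU-hungry OS thread to its Java stack: convert the thread id
# to hex, then look for that "nid" in the jstack dump.
PID=74417          # ES java pid, as in the top output earlier
TID=101234         # hypothetical thread id from procstat -t / top -H
NID=$(printf '0x%x' "$TID")
echo "grep the jstack output for nid=$NID"
# jstack "$PID" | grep -B1 -A30 "nid=$NID"
```
)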

Hieu: This doesn't look like the issue that was fixed, since the threads are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an old version and I would recommend upgrading. Also, mvel is deprecated and will only be available via a plugin: https://github.com/elasticsearch/elasticsearch/pull/6571 and https://github.com/elasticsearch/elasticsearch/pull/6610


       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)



On Thursday, August 28, 2014 11:49:07 PM UTC-7, Martin Forssen wrote:
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8c847c08-3850-4821-aedd-04259d0990a2%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Met vriendelijke groet,

Martijn van Groningen


Re: stuck thread problem?

Patrick Proniewski
On 9 sept. 2014, at 09:03, Martijn v Groningen <[hidden email]> wrote:

> Patrick: I have never seen this, but it means the openjdk on FreeBSD
> doesn't support cpu sampling of threads. Not sure, but maybe you can try
> with an Oracle jdk/jre? Otherwise jstack should be used in order to
> figure out which ES threads get stuck.
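
The jstack route boils down to taking two thread dumps a few seconds apart and looking for threads that never move. A minimal bash sketch of that comparison, assuming ordinary jstack output (the helper names, temp paths, and pgrep pattern are illustrative, not taken from this setup):

```shell
# Compare two jstack dumps taken a few seconds apart and print the names
# of threads whose top stack frame did not change between dumps -- these
# are the candidates for a stuck/spinning thread.

# Print '"thread name"<TAB>top frame' pairs from one jstack dump.
extract_top_frames() {
  awk '/^"/    { if (match($0, /"[^"]*"/)) name = substr($0, RSTART, RLENGTH) }
       /^\tat/ { if (name != "") { print name "\t" $2; name = "" } }' "$1"
}

# Threads that show the same top frame in both dumps.
stuck_threads() {
  extract_top_frames "$1" | sort > /tmp/frames_a.txt
  extract_top_frames "$2" | sort > /tmp/frames_b.txt
  comm -12 /tmp/frames_a.txt /tmp/frames_b.txt | cut -f1
}

# Typical use against a live node (paths/pattern depend on the install):
#   ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch)
#   jstack "$ES_PID" > /tmp/dump1.txt; sleep 5; jstack "$ES_PID" > /tmp/dump2.txt
#   stuck_threads /tmp/dump1.txt /tmp/dump2.txt
```

A thread that is merely busy will show different frames on each sample; one that reports the same frame dump after dump is worth a closer look.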


Martijn, thanks for coming back on this. Last weekend I performed a full upgrade of this server (FreeBSD upgrade from 9.2 to 9.3, and recompilation of every installed port). In the process, ES was upgraded, as was OpenJDK.

I'll wait and see if the problem comes back or not, and keep you posted.

regards,
Patrick


Re: stuck thread problem?

Hieu Nguyen
In reply to this post by Martijn v Groningen
Thanks for the response, Martijn! We'll consider upgrading, but it'd be great to root cause the issue. 

On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
Patrick: I have never seen this, but it means the openjdk on FreeBSD doesn't support cpu sampling of threads. Not sure, but maybe you can try with an Oracle jdk/jre? Otherwise jstack should be used in order to figure out which ES threads get stuck.

Hieu: This doesn't look like the issue that was fixed, since the threads are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an old version and I would recommend upgrading. Also, mvel is deprecated and will only be available via a plugin: https://github.com/elasticsearch/elasticsearch/pull/6571 and https://github.com/elasticsearch/elasticsearch/pull/6610


On 6 September 2014 16:12, Hieu Nguyen <hi...@...> wrote:
We have seen a similar issue in our cluster (CPU usage and search time slowly went up for the master node over a period of one day, until we restarted). Is there an easy way to confirm that it's indeed the same issue mentioned here?

Below is the output of our hot threads on this node (version 0.90.12):

   85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.FilteredDocIdSet$1.get(FilteredDocIdSet.java:65)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:178)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   83.3% (832.6ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#49]'
     9/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     unique snapshot
       org.apache.lucene.search.DisjunctionMaxScorer.afterNext(DisjunctionMaxScorer.java:65)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:139)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:130)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:177)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   80.2% (802.2ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#41]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)



On Thursday, August 28, 2014 11:49:07 PM UTC-7, Martin Forssen wrote:
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.

On Wednesday, August 27, 2014 11:36:03 AM UTC+2, Martin Forssen wrote:
I did report it: https://github.com/elasticsearch/elasticsearch/issues/7478






Re: stuck thread problem?

Martijn v Groningen
Hieu: The stack trace you shared is from a search request with an mvel script in a function_score query. I don't think this is what causes threads to get stuck; it just shows up here because, in general, queries with scripts take more CPU time. You can easily verify this: if you stop search requests temporarily, the shared stack trace shouldn't appear in the hot threads output any more. If a node gets slower over time, there may be a different issue, for example the JVM slowly running out of memory, with garbage collections taking longer and longer.
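
That before/after check is easiest if you save one hot_threads dump with search traffic running and one with it paused, then compare just the per-thread summary lines. A small sketch, assuming the standard `_nodes/hot_threads` output format shown earlier in this thread (the function name, host, and file paths are placeholders):

```shell
# Keep only the per-thread summary lines from a saved hot_threads dump,
# e.g.  85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
# and sort them by CPU percentage, highest first.
hot_summary() {
  grep 'cpu usage by thread' "$1" | sed 's/^ *//' | sort -rn
}

# Typical use: capture a dump with and without search traffic, then compare.
#   curl -s 'http://localhost:9200/_nodes/hot_threads?threads=10' > /tmp/hot_with.txt
#   (pause the search clients, wait a moment)
#   curl -s 'http://localhost:9200/_nodes/hot_threads?threads=10' > /tmp/hot_without.txt
#   diff <(hot_summary /tmp/hot_with.txt) <(hot_summary /tmp/hot_without.txt)
```

If the mvel frames vanish from the second dump, the CPU was going to script scoring, not to a stuck thread.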


On 10 September 2014 05:26, Hieu Nguyen <[hidden email]> wrote:
Thanks for the response, Martijn! We'll consider upgrading, but it'd be great to root cause the issue. 

On Tuesday, September 9, 2014 12:03:57 AM UTC-7, Martijn v Groningen wrote:
Patrick: I have never seen this, but it means the openjdk on FreeBSD doesn't support cpu sampling of threads. Not sure, but maybe you can try with an Oracle jdk/jre? Otherwise jstack should be used in order to figure out which ES threads get stuck.

Hieu: This doesn't look like the issue that is fixed, since the threads are in mvel (scripting) code and not in scan/scroll code. 0.90.12 is an old version and I would recommend to upgrade. Also mvel is deprecated and will only be available via a plugin: https://github.com/elasticsearch/elasticsearch/pull/6571 and https://github.com/elasticsearch/elasticsearch/pull/6610


On 6 September 2014 16:12, Hieu Nguyen <[hidden email]> wrote:
We have seen a similar issue in our cluster (CPU usage and search time slowly went up for the master node over a period of one day, until we restarted). Is there an easy way to confirm that it's indeed the same issue mentioned here?

Below is the output of our hot threads on this node (version 0.90.12):

   85.8% (857.7ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#3]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.FilteredDocIdSet$1.get(FilteredDocIdSet.java:65)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:178)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   83.3% (832.6ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#49]'
     9/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     unique snapshot
       org.apache.lucene.search.DisjunctionMaxScorer.afterNext(DisjunctionMaxScorer.java:65)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:139)
       org.apache.lucene.search.DisjunctionScorer.nextDoc(DisjunctionScorer.java:130)
       org.apache.lucene.search.FilteredQuery$QueryFirstScorer.nextDoc(FilteredQuery.java:177)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.nextDoc(FunctionScoreQuery.java:169)
       org.apache.lucene.search.Scorer.score(Scorer.java:64)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)

   80.2% (802.2ms out of 1s) cpu usage by thread 'elasticsearch[cluster1][search][T#41]'
     8/10 snapshots sharing following 30 elements
       java.lang.ThreadLocal$ThreadLocalMap.set(ThreadLocal.java:429)
       java.lang.ThreadLocal$ThreadLocalMap.access$100(ThreadLocal.java:261)
       java.lang.ThreadLocal.set(ThreadLocal.java:183)
       org.elasticsearch.common.mvel2.optimizers.OptimizerFactory.clearThreadAccessorOptimizer(OptimizerFactory.java:114)
       org.elasticsearch.common.mvel2.MVELRuntime.execute(MVELRuntime.java:169)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getDirectValue(CompiledExpression.java:123)
       org.elasticsearch.common.mvel2.compiler.CompiledExpression.getValue(CompiledExpression.java:119)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.run(MvelScriptEngineService.java:191)
       org.elasticsearch.script.mvel.MvelScriptEngineService$MvelSearchScript.runAsDouble(MvelScriptEngineService.java:206)
       org.elasticsearch.common.lucene.search.function.ScriptScoreFunction.score(ScriptScoreFunction.java:54)
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)
     2/10 snapshots sharing following 20 elements
       org.elasticsearch.common.lucene.search.function.FunctionScoreQuery$CustomBoostFactorScorer.score(FunctionScoreQuery.java:175)
       org.apache.lucene.search.TopScoreDocCollector$OutOfOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:140)
       org.apache.lucene.search.TimeLimitingCollector.collect(TimeLimitingCollector.java:153)
       org.apache.lucene.search.Scorer.score(Scorer.java:65)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:162)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:117)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:244)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:202)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:744)



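A plausible reading of the traces above: in 9/10 snapshots the hot frames sit in `ThreadLocal.set` called from `OptimizerFactory.clearThreadAccessorOptimizer`, i.e. the MVEL scripting path is repeatedly writing into the per-thread `ThreadLocalMap` while scoring every document. As a minimal sketch of the general pattern (illustrative names only, not the actual MVEL source), clearing a `ThreadLocal` with `set(null)` still touches the map on every call, whereas `remove()` deletes the entry outright:

```java
// Minimal sketch, assuming the hot frames above (ThreadLocal.set inside
// OptimizerFactory.clearThreadAccessorOptimizer) reflect ThreadLocal churn.
// Class and method names here are illustrative, not taken from the MVEL source.
public class ThreadLocalClearDemo {
    private static final ThreadLocal<Object> SCRATCH = new ThreadLocal<Object>();

    // Clearing with set(null) nulls the value but leaves the entry in the
    // thread's ThreadLocalMap, so every clear still walks and updates the map.
    static void clearWithSet() {
        SCRATCH.set(null);
    }

    // remove() deletes the map entry outright, so repeated clear cycles do
    // not keep the per-thread map hot.
    static void clearWithRemove() {
        SCRATCH.remove();
    }

    public static void main(String[] args) {
        SCRATCH.set("payload");
        clearWithSet();
        System.out.println("after set(null): " + SCRATCH.get());  // prints null
        SCRATCH.set("payload");
        clearWithRemove();
        System.out.println("after remove(): " + SCRATCH.get());   // prints null
    }
}
```

Both calls leave `get()` returning null, but only `remove()` releases the map entry; when a clear like this runs once per scored document on a busy search thread, the map traffic alone can show up as the kind of CPU burn reported in this thread.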
On Thursday, August 28, 2014 11:49:07 PM UTC-7, Martin Forssen wrote:
FYI, this turned out to be a real bug. A fix has been committed and will be included in the next release.
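For anyone checking their own cluster for the same symptom, the report quoted above comes from the nodes hot threads API. A sketch, assuming the node answers HTTP on localhost:9200 (adjust host, port, and parameters to taste; this needs a running node, so it is not runnable standalone):

```shell
# Capture a hot-threads report like the one quoted above.
# threads, interval, and snapshots are standard parameters of this API.
curl -s "http://localhost:9200/_nodes/hot_threads?threads=10&interval=1s&snapshots=10"
```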


--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8c847c08-3850-4821-aedd-04259d0990a2%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Met vriendelijke groet,

Martijn van Groningen



