does snapshot restore lead to a memory leak?


does snapshot restore lead to a memory leak?

José de Zárate
We have a single-machine cluster with about 1k indices. It used to work flawlessly (under a high load, of course).

But since we started using the snapshot/restore feature heavily, its memory gets exhausted within 7 days of use. The cluster performs about 700 restore operations during the day. Are there memory considerations to keep in mind when using the restore feature?

--
uh, oh.



Re: does snapshot restore lead to a memory leak?

Igor Motov-3
Just to make sure I got it right, you really meant 700 restores (not just 700 snapshots), correct? What type of repository are you using? Could you add a bit more detail about your use case?


Re: does snapshot restore lead to a memory leak?

José de Zárate
Hey Igor, thanks for answering, and sorry for the delay; I didn't catch the update.

Let me explain:

   - We have a one-machine cluster that is only meant for serving search requests; the goal is not to index anything to it. It contains 1.7k indices, give or take.
   - Every day, those 1.7k indices are reindexed and snapshotted in pairs to an S3 repository (producing 850 snapshots).
   - Every day, the "read-only" cluster from the first point restores those 850 snapshots from that same S3 repository to "update" its 1.7k indices (a sketch of the API calls involved is shown after the list).
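Roughly, the daily cycle uses the snapshot and restore APIs like this (a minimal sketch; the repository name "backups", the bucket, the region, and the index and snapshot names are placeholders, and the "s3" repository type assumes the cloud-aws plugin is installed):

    # Register the S3 repository (in principle only needed once per cluster).
    curl -XPUT "localhost:9200/_snapshot/backups" -d '{
      "type": "s3",
      "settings": { "bucket": "my-bucket", "region": "us-east-1" }
    }'

    # On an indexing machine: snapshot a pair of freshly reindexed indices.
    curl -XPUT "localhost:9200/_snapshot/backups/pair_0001?wait_for_completion=true" -d '{
      "indices": "index_a,index_b"
    }'

    # On the search-only cluster: close the existing indices, then restore them from the snapshot.
    curl -XPOST "localhost:9200/index_a,index_b/_close"
    curl -XPOST "localhost:9200/_snapshot/backups/pair_0001/_restore?wait_for_completion=true"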

It works like a charm. Load has dropped dramatically, and we can set up a "farm" of temporary machines to do the indexing duties.

But memory consumption never stops growing.

We don't get any "out of memory" error or anything; in fact, there is nothing in the logs that shows any error. But after a few days to a week, the host has its memory almost exhausted and Elasticsearch stops responding. Memory consumption is, of course, way beyond HEAP_SIZE.
We have to restart it, and when we do we get the following error:

java.util.concurrent.RejectedExecutionException: Worker has already been shutdown
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.registerTask(AbstractNioSelector.java:120)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:72)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.executeInIoThread(AbstractNioWorker.java:56)
        at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.executeInIoThread(NioWorker.java:36)
        at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioChannelSink.execute(AbstractNioChannelSink.java:34)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.execute(DefaultChannelPipeline.java:636)
        at org.elasticsearch.common.netty.channel.Channels.fireExceptionCaughtLater(Channels.java:496)
        at org.elasticsearch.common.netty.channel.AbstractChannelSink.exceptionCaught(AbstractChannelSink.java:46)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.notifyHandlerException(DefaultChannelPipeline.java:658)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:781)
        at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:725)
        at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71)
        at org.elasticsearch.common.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
        at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
        at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:704)
        at org.elasticsearch.common.netty.channel.Channels.write(Channels.java:671)
        at org.elasticsearch.common.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
        at org.elasticsearch.http.netty.NettyHttpChannel.sendResponse(NettyHttpChannel.java:158)
        at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:106)
        at org.elasticsearch.rest.action.search.RestSearchAction$1.onResponse(RestSearchAction.java:98)
        at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.innerFinishHim(TransportSearchQueryAndFetchAction.java:94)
        at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.moveToSecondPhase(TransportSearchQueryAndFetchAction.java:77)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.innerMoveToSecondPhase(TransportSearchTypeAction.java:425)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.onFirstPhaseResult(TransportSearchTypeAction.java:243)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onResult(TransportSearchTypeAction.java:219)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$3.onResult(TransportSearchTypeAction.java:216)
        at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:305)
        at org.elasticsearch.action.search.type.TransportSearchQueryAndFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryAndFetchAction.java:71)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
        at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)




It looks like it shuts itself down for some reason...

The host on which Elasticsearch lives has nothing else installed besides a standard Ubuntu distribution; it's completely devoted to Elasticsearch.

Memory consumption grows by about 10% every 36 hours.




Re: does snapshot restore lead to a memory leak?

Igor Motov-3
So, your "search-only" machines are running out of memory, while your "index-only" machines are doing fine. Did I understand you correctly? Could you send me nodes stats (curl "localhost:9200/_nodes/stats?pretty") from the machine that runs out of memory, please run stats a few times with 1 hour interval. I would like to see how memory consumption is increasing over time. Please, also run nodes info ones (curl "localhost:9200/_nodes") and post here (or send me by email) the results. Thanks!


Re: does snapshot restore lead to a memory leak?

José de Zárate
Igor.
Yes, that's right. My "index-only" machines are booted just for the indexing/snapshotting task; once there are no more tasks in the queue, those machines are terminated. They only handle a few indices each time (their only purpose is to "snapshot").

I will do as you say. I'd better wait for the timeframe in which most of the restores occur, because that's when memory consumption grows the most, so expect those postings in 5 or 6 hours.


Re: does snapshot restore lead to a memory leak?

joergprante@gmail.com
This memory issue report might be related


Jörg



Re: does snapshot restore lead to a memory leak?

José de Zárate
Igor.
I'm posting a PDF document with some graphs I think are quite enlightening; the "JVM threads" one is particularly interesting.
The times are UTC-4, and the period in which the JVM figures grow is when most of the restore processes have been taking place.
Igor, I will send you the reports you asked for by email, since they contain filesystem data. Hope you don't mind.

The graphs contain data from two Elasticsearch clusters. ES1 is the one we've been talking about in this thread; ES4 is a cluster devoted to two indices, not very big but with a high search demand.
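The JVM thread count shown in those graphs can also be read straight from the node stats; a quick sketch (the grep just pulls the current and peak counts out of the pretty-printed JSON):

    # Show current and peak JVM thread counts for each node.
    curl -s "localhost:9200/_nodes/stats/jvm?pretty" | grep -A 2 '"threads"'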


txs!!!


Attachment: Elasticsearch Service.pdf (289K)

Re: does snapshot restore lead to a memory leak?

Igor Motov-3
So, you are running out of threads, not memory. Are you re-registering the repository every time you restore from it? If you are, you might be running into this issue: https://github.com/elasticsearch/elasticsearch/issues/6181
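If that is indeed the cause, a minimal guard along these lines keeps the repository from being re-registered on every run (a sketch; the repository name "backups" and the settings are placeholders):

    # Register the repository only if the cluster doesn't already know about it.
    if ! curl -sf "localhost:9200/_snapshot/backups" > /dev/null; then
        curl -XPUT "localhost:9200/_snapshot/backups" -d '{
          "type": "s3",
          "settings": { "bucket": "my-bucket", "region": "us-east-1" }
        }'
    fi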


Re: does snapshot restore lead to a memory leak?

José de Zárate
Precisely! I re-issue the repository PUT command every time I do a restore. I know it's not the smartest thing in the world, but I wanted to make sure the repos would always be available without worrying about whether the Elasticsearch cluster was newly created or not.

I'll look into that.





--
uh, oh.

