Bulk indexing creates a lot of disk read OPS


Bulk indexing creates a lot of disk read OPS

eranid
Hello,

I've created an index I use for logging.

This means there are mostly writes, and some searches once in a while.
During the initial load phase, I'm using several clients to index documents concurrently using the bulk API.

At first, indexing takes 200 ms for a bulk of 5000 documents.
As time goes by, the indexing time increases, and gets to 1000-4500 ms.

I am using an EC2 c3.8xlarge machine with 32 cores and 60 GB of memory, with a provisioned IOPS volume set to 7000 IOPS.

Looking at the metrics, I see that the CPU and memory are fine, the write IOPS are at 300, but the read IOPS have slowly gone up and got to 7000.

How come I'm only indexing, but most of the IOPS are read?

I am attaching some screen captures from the BigDesk plugin that show the two states of the index. At about 20% into the graphs is the point in time where I stopped the clients, so you can see the load drop off.

My settings are:

threadpool.bulk.type: fixed
threadpool.bulk.size: 32                 # availableProcessors
threadpool.bulk.queue_size: 1000

# Indices settings
indices.memory.index_buffer_size: 50%
indices.cache.filter.expire: 6h

bootstrap.mlockall: true


and I've changed the index settings to:

{"index":{"refresh_interval":"60m","translog":{"flush_threshold_size":"1gb","flush_threshold_ops":"50000"}}}
I also tried "refresh_interval":"-1"
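
For reference, I apply these with a dynamic settings update and then read them back, roughly like this (a sketch only; "logs" stands in for my index name):

curl -XPUT 'localhost:9200/logs/_settings' -d '{
  "index": {
    "refresh_interval": "60m",
    "translog": {
      "flush_threshold_size": "1gb",
      "flush_threshold_ops": "50000"
    }
  }
}'

# read back what the index actually picked up
curl 'localhost:9200/logs/_settings?pretty'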


Please let me know what else you need me to provide (settings, logs, metrics).


Re: Bulk indexing creates a lot of disk read OPS

eranid
Attaching the screenshots here.


Attachment: Screen Shot 2015-04-24 at 09.39.20.png (1M)
Attachment: Screen Shot 2015-04-24 at 09.38.59.png (180K)

Re: Bulk indexing creates a lot of disk read OPS

eranid
Forgot some stats:

I have 10 shards, no replicas, all on the same machine.
At the moment, there are some 1.5 billion records in the index.



Re: Bulk indexing creates a lot of disk read OPS

dadoonet
Could segment merging be the cause here?

David


Re: Bulk indexing creates a lot of disk read OPS

eranid
Hey David,

I suspect it indeed might be the cause, but I'm kind of a newbie here. 
What metric do I need to monitor, what would be a problematic value, and basically, how can I play with merge settings to test if I can improve this?
Some rules of thumb for a newbie would be appreciated.

I installed the SegmentSpy plugin; a screenshot is attached, if that helps.

Eran


Attachment: Screen Shot 2015-04-24 at 11.42.16.png (111K)

Re: Bulk indexing creates a lot of disk read OPS

Jason Wee
The merging graph you shared looks normal to me.

We have an ES setup with 10 shards too, and I monitor the segments using SegmentSpy; the segment graph in your attachment looks pretty much the same as ours.
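
If you want a rough check outside a plugin, you can also look at segment counts and merge activity from the command line, something like this (a sketch; "logs" is a placeholder index name and the exact output columns depend on the version):

# one row per Lucene segment, with sizes and doc counts
curl 'localhost:9200/_cat/segments/logs?v'

# merge and segment stats for the index (current merges, total merge time, segment memory)
curl 'localhost:9200/logs/_stats/merge,segments?pretty'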

jason


Re: Bulk indexing creates a lot of disk read OPS

dadoonet
In reply to this post by eranid
That's normal. I was just pointing out that even if you think you are only writing data while indexing, you are also reading data behind the scenes to merge Lucene segments.
You can potentially try to play with index.translog.flush_threshold_size:
http://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-translog.html

And increase the transaction log size?

It might help reduce the number of segments generated, but that said, you will always have READ operations.
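
For example, something along these lines (just a sketch; "logs" is a placeholder index name and 2gb is an arbitrary value to try, the right size depends on your heap and ingest rate):

curl -XPUT 'localhost:9200/logs/_settings' -d '{
  "index.translog.flush_threshold_size": "2gb"
}'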

Actually, is it an issue for you? If not, keeping all the default values might be good.

Best


-- 
David Pilato - Developer | Evangelist
elastic.co
@dadoonet | @elasticsearchfr | @scrutmydocs


Re: Bulk indexing creates a lot of disk read OPS

eranid
It is an issue, as I am hitting 7000 read operations per second (the IOPS limit of my volume).

As the index grows larger the problem worsens; where I was once able to index with 10 clients concurrently, now I can barely use one client.

Also, I used the _optimize endpoint to merge all segments, and even then the read operations spike immediately on the first indexing operation (I'm using BigDesk to follow this). So I don't think it is a merge effect, as my intuition is that a merge should only happen every once in a while?
Maybe this is actually a result of me not using "doc values"? Could that be it?
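
(For reference, the kind of optimize call I mean is something like this, forcing a merge down to one segment per shard; "logs" stands in for my index name and I may have used different parameters:)

curl -XPOST 'localhost:9200/logs/_optimize?max_num_segments=1'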


Re: Bulk indexing creates a lot of disk read OPS

cdahlqvist
In reply to this post by eranid
Hi Eran,

Which version of Elasticsearch are you using?

Are you assigning your own document IDs or letting Elasticsearch assign them automatically?

Best regards,

Christian




Re: Bulk indexing creates a lot of disk read OPS

eranid
I'm using the newest version, 1.5.1
I'm assigning my own ID using path:

"_id": {
"path": "msg_id"
},

msg_id is a self-generated, hashed identifier (it's actually somewhat like a cookie ID).
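
For context, that snippet sits in the type mapping roughly like this (the type name "logs" and the msg_id field definition are just placeholders for what I actually use):

{
  "mappings": {
    "logs": {
      "_id": {
        "path": "msg_id"
      },
      "properties": {
        "msg_id": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}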


Re: Bulk indexing creates a lot of disk read OPS

cdahlqvist
In reply to this post by eranid
Hi Eran,

If you are assigning your own ID, Elasticsearch needs to search and check whether the document already exists before writing it. This could explain why the bulk insert performance goes down as the size of the index grows. If you are not going to update the documents, I would therefore recommend allowing Elasticsearch to assign the document ID automatically.
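
For example, in a bulk request you would leave the _id out of the action line (and drop the _id path from the mapping so it is not pulled from msg_id), and Elasticsearch will generate one for you. A minimal sketch, with placeholder index and type names:

curl -XPOST 'localhost:9200/logs/_bulk' -d '{ "index": { "_type": "logs" } }
{ "msg_id": "abc123", "message": "first log line" }
{ "index": { "_type": "logs" } }
{ "msg_id": "def456", "message": "second log line" }
'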

Best regards,

Christian




Re: Bulk indexing creates a lot of disk read OPS

eranid
Wow, awesome. I'll try that, thanks!
