ES Index performance

Oren Mazor
Hi all,

We've deployed elasticsearch in production and we're incredibly
happy with search performance. However, we're seeing occasional issues
where ES seems to return an older version of a record. In some cases
it can take up to half an hour before the proper (latest) version of a
record shows up. We have two nodes with 10 shards each with one
replica, and the index is about 30m records and 25gb in size, so it's
not the smallest :)

This is pretty hard to reproduce, so it's relatively hard to test. But
I'd love to hear ideas.

thanks!

Re: ES Index performance

kimchy
Administrator
Does this happen with search requests, where you see the old data? By default, elasticsearch refreshes an index every second so that newly indexed docs (or deletes) become visible. Can you use the index stats API to see if there was a bump in how long refreshes took (refresh stats are included there)?
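
To make the "bump" check concrete, here is a minimal Python sketch. It assumes the refresh section of the index stats response exposes a cumulative operation count and a cumulative time in milliseconds; the key names `total` and `total_time_in_millis` and the sample numbers are assumptions for illustration and may differ between versions. Because the counters are cumulative, comparing two snapshots gives the average cost of the refreshes that ran in between.

```python
def avg_refresh_millis(before, after):
    """Average time per refresh between two stats snapshots.

    Each snapshot is a dict shaped like the (assumed) refresh section of
    the index stats response: {"total": int, "total_time_in_millis": int}.
    Returns None if no refresh ran between the snapshots.
    """
    ops = after["total"] - before["total"]
    if ops <= 0:
        return None
    elapsed = after["total_time_in_millis"] - before["total_time_in_millis"]
    return elapsed / ops


# Hypothetical snapshots taken one minute apart.
earlier = {"total": 1000, "total_time_in_millis": 50000}
later = {"total": 1060, "total_time_in_millis": 56000}
print(avg_refresh_millis(earlier, later))  # 6000 ms over 60 refreshes -> 100.0
```

A sudden jump in this average from one interval to the next would be the kind of bump worth investigating.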

Re: ES Index performance

Oren Mazor
yup. I can see an insertion request going into ES (but not the
response, now that I think of it), but running my query shows no
record is available for that item.

all of our records are virtually the same size (about 1kb), and the
most insertions we'd be seeing is 10-20 per second. occasionally that
might go up to 50.

how often does refresh happen by default, and how long does it take?

I'm wondering if 10 shards is not enough for the size of our index.

Re: ES Index performance

Oren Mazor
also, it's probably worth sharing my frontend's query:

{
  "filter" : {
    "and" : [
      {
        "term": {
          "SID": $num
        }
      },
      {
        "query": {
          "query_string" : {
            "default_operator" : "AND",
            "fields": ["X","Y"],
            "query" : "$QUERY"
          }
        }
      }
    ]
  },
  "sort" : [
    {
      "Y" : {
        "order" : "desc"
      }
    }
  ],
  "size" : 1
}

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)

Re: ES Index performance

kimchy
Administrator
It makes little sense to use query_string as a filter; I suggest you don't do that. But even when using it as a filter, you should still see changes. Can you verify it's not the query? I.e., just search for a recently added document and see if you get it back.
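
One way to follow this suggestion, sketched in Python: keep the term restriction as a filter and move query_string into the query part of a `filtered` query, which was the wrapper for this in elasticsearch releases of that era (later versions express the same idea with a `bool` query). The function name `build_search_body` and its arguments are hypothetical; the field names SID, X and Y are taken from the query posted earlier in the thread.

```python
import json


def build_search_body(sid, text):
    """Build a search body with query_string as the query and SID as a filter."""
    return {
        "query": {
            "filtered": {
                # Full-text part: belongs in the query.
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "fields": ["X", "Y"],
                        "query": text,
                    }
                },
                # Exact-match part: a yes/no check, so it stays a filter.
                "filter": {"term": {"SID": sid}},
            }
        },
        "sort": [{"Y": {"order": "desc"}}],
        "size": 1,
    }


print(json.dumps(build_search_body(42, "some words"), indent=2))
```

Building the body as a dict and serializing it also avoids splicing `$num` and `$QUERY` into a JSON string by hand.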

Re: ES Index performance

Oren Mazor
Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

Re: ES Index performance

kimchy
Administrator
Hard to tell if it's GC. You can monitor it using bigdesk to see changes and how memory is behaving. Though you say you have a 30 minute "pause", which is strange. Did you check the refresh stats? Also, when this happens, can you simply get the relevant new / modified document by id?
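
The reasoning behind the get-by-id check can be written down as a small decision helper. It relies on get by id being realtime by default, while search only sees a document after a refresh. The function name and its messages are illustrative only and not part of elasticsearch.

```python
def diagnose(found_by_get, found_by_search):
    """Interpret a get-by-id result against a search result for the same doc."""
    if found_by_get and found_by_search:
        return "visible: nothing wrong for this document"
    if found_by_get:
        # Indexed, but search does not see it: suspect refresh lag or the query.
        return "indexed but not searchable: check refresh stats and the query"
    if found_by_search:
        return "unexpected: search hit without a get hit, check id and routing"
    # Not even a realtime get finds it: the write probably never succeeded.
    return "not indexed: check the response of the index request"


print(diagnose(True, False))
```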

Re: ES Index performance

Oren Mazor
Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the interval.

so far I'm not seeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!
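
For readers who want to try the same change, a rough Python sketch of batching documents into bulk request bodies follows. The bulk API takes newline-delimited JSON: an action line followed by the document source, with the body ending in a newline. The index name `records`, the type name `record`, the `id` field, and the batch size are made-up placeholders; the `_type` field reflects older releases and is not used in recent ones.

```python
import json


def chunks(items, size):
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_body(docs, index="records", doc_type="record"):
    """Serialize docs as newline-delimited JSON for one bulk request."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [{"id": i, "SID": 7} for i in range(250)]
bodies = [bulk_body(batch) for batch in chunks(docs, 100)]
print(len(bodies))  # 3 batches: 100 + 100 + 50 documents
```

Capping the batch size, rather than flushing purely on a timer, keeps a burst of inserts from turning into one oversized request.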

Re: ES Index performance

kimchy
Administrator
Great, thanks for the update!

Re: ES Index performance

Oren Mazor
so, I'm starting to see these again under really heavy load (let's
say around 10k insertions a minute)

I'm still having some difficulty wrapping my head around the algorithm
in the bottom end. the refresh total_time is 17h and merges is 14.9h.
this seems pretty ambiguous. I'm guessing it's the total time spent
executing these actions rather than the time since, right?

are there some hardware settings I can make to make lucene go faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start with
and learning more there.

Re: ES Index performance

kimchy
Administrator
On Saturday, February 4, 2012 at 3:18 AM, Oren Mazor wrote:
> so, I'm starting to see these again under really heavy load (let's
> say around 10k insertions a minute)

What's the behavior of elasticsearch in this case? Is memory usage OK? When you say 10k inserts per minute, is that using the bulk API? How many clients are indexing the data?

> I'm still having some difficulty wrapping my head around the algorithm
> in the bottom end. the refresh total_time is 17h and merges is 14.9h.
> this seems pretty ambiguous. I'm guessing it's the total time spent
> executing these actions rather than the time since, right?

Yes, that's the total time that was spent doing it.
are there some hardware settings I can make to make lucene go faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start with
and learning more there.

On Jan 25, 9:37 am, Shay Banon <kim...@gmail.com> wrote:
Great, thanks for the update!







On Wednesday, January 25, 2012 at 8:54 AM, Oren Mazor wrote:
Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the frequency.

so far I'm not seeeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!

On Jan 20, 2:26 pm, Shay Banon <[hidden email] (http://gmail.com)> wrote:
Hard to tell if its GC, you can monitor it using bigdesk to see changes,
see how memory is behaving. Though you way you have a 30 minute "pause",
which is strange. Did you check the refresh stats? Also, when this happens,
can you simply get by id the relevant new / modified document?

On Fri, Jan 20, 2012 at 5:58 PM, Oren Mazor <[hidden email] (http://gmail.com)> wrote:
Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

On Jan 20, 10:06 am, Shay Banon <[hidden email] (http://gmail.com)> wrote:
It makes little sense to use query_string as a filter, I suggest you

don't
do that. But, even when using it as a filter, you should still see

changes.
Can you verify its not the query? i.e. just search for a document

recently
added and see if you get it back?

On Fri, Jan 20, 2012 at 8:07 AM, Oren Mazor <[hidden email] (http://gmail.com)>
wrote:
also, its probably worth sharing my frontend's query:

{
 "filter" : {
   "and" : [
     {
       "term": {
         "SID": $num
       }
     },
     {
       "query": {
         "query_string" : {
           "default_operator" : "AND",
           "fields": ["X","Y"],
           "query" : "$QUERY"
         }
       }
     }
   ]
 },
 "sort" : [
   {
     "Y" : {
       "order" : "desc"
     }
   }
 ],
 "size" : 1
}

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)
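
Following the suggestion earlier in this thread not to use query_string as a filter, the request above could be restructured so the text search runs as a query and only the exact SID match acts as a filter. This is a sketch under assumptions: it uses the `bool` query with `must` and `filter` clauses as found in recent Elasticsearch versions, which is newer than the release discussed here, and `build_search_body` is a hypothetical helper. The field names SID, X and Y are taken from the query above.

```python
import json


def build_search_body(sid: int, text: str) -> dict:
    """Build the search request: query_string as the query, term as the filter."""
    return {
        "query": {
            "bool": {
                # Scored full-text part.
                "must": [
                    {
                        "query_string": {
                            "default_operator": "AND",
                            "fields": ["X", "Y"],
                            "query": text,
                        }
                    }
                ],
                # Exact match that only narrows the result set.
                "filter": [{"term": {"SID": sid}}],
            }
        },
        "sort": [{"Y": {"order": "desc"}}],
        "size": 1,
    }


if __name__ == "__main__":
    # json.dumps takes care of escaping quotes inside the user's search text.
    print(json.dumps(build_search_body(42, 'foo "bar"'), indent=1))
```

Building the body as a dictionary and serializing it also avoids pasting `$QUERY` into a JSON string by hand, where a stray quote in the search text would break the request.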


Re: ES Index performance

K.B.
In reply to this post by Oren Mazor
Hello Oren,

I'm having a similar problem: ES is nearly unresponsive while
indexing a large batch insert. In my case it's about 1300 to 1600
items per batch insert, following a whole index drop and create
cycle. Can you please tell me what the best batch size was so you
didn't encounter any delays on the system?

Best

KB


Re: ES Index performance

kimchy
Administrator
The optimal batch size is really dependent on what you index. Indexing 100 items with 1mb size is different than indexing 100 items with 1k size. Also, it depends on how many concurrent clients are issuing the bulk requests.
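
Since the right batch size depends on document size as much as on document count, one option is to cap batches on both. The sketch below is only an illustration: `chunk_documents`, `max_docs` and `max_bytes` are hypothetical names, the default limits are placeholders to be tuned by measurement, and the byte count is an estimate based on the JSON-encoded document.

```python
import json
from typing import Dict, Iterable, Iterator, List


def chunk_documents(docs: Iterable[Dict], max_docs: int = 500,
                    max_bytes: int = 1_000_000) -> Iterator[List[Dict]]:
    """Yield batches that stay under both a document count and a byte budget."""
    batch: List[Dict] = []
    batch_bytes = 0
    for doc in docs:
        size = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch before it would exceed either limit.
        if batch and (len(batch) >= max_docs or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(doc)
        batch_bytes += size
    if batch:
        yield batch
```

With 1k documents the count limit is usually what ends a batch, and with 1mb documents the byte limit does. A single document larger than `max_bytes` still goes out, alone in its own batch.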



Re: ES Index performance

Oren Mazor
In reply to this post by kimchy
hi Shay,

we use server density to keep track of ES and I'm not seeing any
spikes in resource use at all. I'm suspecting we're just pushing it
more than most people do?

I do use the bulk API to send perhaps 1000 insertions every 10 seconds
on average. in some cases these are new records, and sometimes they
are new versions of existing records. I also send some amount of
deletes every minute, but these are not really using the bulk api.
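
As an aside on the deletes: the bulk request format also accepts delete actions, so they could travel in the same request as the inserts. The sketch below builds such a body as newline-delimited JSON. It is an illustration only: `build_bulk_body` and the index name in the usage note are hypothetical, and older releases such as the one in this thread also expect a `_type` in each action line, which is left out here.

```python
import json
from typing import Dict, Iterable, Tuple


def build_bulk_body(index: str, upserts: Iterable[Tuple[str, Dict]],
                    deletes: Iterable[str]) -> str:
    """Build a newline-delimited bulk body with index and delete actions."""
    lines = []
    for doc_id, source in upserts:
        # An index action is followed by the document source on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    for doc_id in deletes:
        # A delete action has no source line.
        lines.append(json.dumps({"delete": {"_index": index, "_id": doc_id}}))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""
```

For example, `build_bulk_body("records", [("1", {"name": "a"})], ["2"])` produces three lines: the index action, its source, and the delete action.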

could you recommend a more low level primer on ES? I'd love to have a
more low level understanding of how/why things work. it'll make it
easier for me to tune my algorithms. I can go through the source, but
if there're some papers out there I could read, that'd be better :)

thanks!
Oren

PS. I noticed there is now a mongodb river in development. I'm
wondering whether my efforts might be better spent helping it to
production status rather than trying to tune my own code..


Re: ES Index performance

Michael Sick
In reply to this post by kimchy
Shay,

Is there a max in data size or data size / node? All valid rules of thumb are welcome. 




Re: ES Index performance

kimchy
Administrator
In reply to this post by Oren Mazor
Going back to your question, do you see that issuing a Get (which is realtime) does not return the correct version of the data? It would be helpful to understand where the stalling is coming from. If a "get" does not return the version of the data you expect, it means that it didn't get indexed, so you will need to look at the indexer code and see if maybe something is stalling on the bulk API execution.
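
The reasoning above can be written down as a small decision helper. This is a sketch with hypothetical names: `expected_version` is the version the indexer believes it wrote, `get_version` is the version returned by a get by id (or None if the document is missing), and `search_version` is the version seen on a search hit (or None if the search did not return it).

```python
from typing import Optional


def diagnose(expected_version: int, get_version: Optional[int],
             search_version: Optional[int]) -> str:
    """Classify where a stale document is stuck, based on get vs. search."""
    if get_version is None or get_version < expected_version:
        # Get is realtime, so a stale get means the write never landed.
        return "not indexed: check the indexer and the bulk responses"
    if search_version is None or search_version < expected_version:
        # The write landed but search has not picked it up yet.
        return "indexed but not visible to search: check refresh"
    return "ok"
```

In the case reported in this thread, a stale get points at the first branch, which is why the indexer side is the place to look.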

The stalling can be for many reasons, starting with slow IO, not enough resources on the machine (CPU/Mem), overloading the machines you have in the cluster, GC, and so on.
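
On the indexer side, one cheap check is to inspect every bulk response for items that failed, rather than assuming the whole request succeeded. The sketch below assumes the response is already parsed JSON with an `items` list in which each entry is keyed by its action name and failed entries carry an `error` field; `failed_items` is a hypothetical helper, and the exact shape of the `error` value differs between releases.

```python
from typing import Any, Dict, List, Tuple


def failed_items(bulk_response: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (action, document id, error) for each failed bulk item."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a one-key object such as {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"), result["error"]))
    return failures
```

Logging these failures together with how long each bulk call took would help tell rejected documents apart from requests that are simply slow.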

Which JVM version are you using? Are you running on EC2? If so, which instances / os version? How many shards do you have in the index?



Re: ES Index performance

Oren Mazor
Issuing a Get does not return the correct version.

we're using OpenJDK java version 1.6.0_18, not on EC2, with debian. we
have 1 replica and 10 shards.

I'm suspecting an IO issue, to be honest, given that if there were
indexing performance issues somebody may have seen them already, what
with "select isn't broken" :)

that said, we do have a pretty high indexing rate, so maybe at scale
some issue pops up…?

>
> > > > > On Wednesday, January 25, 2012 at 8:54 AM, Oren Mazor wrote:
> > > > > > Hi Shay,
>
> > > > > > just a follow up (because I hate it when there is no closure).
>
> > > > > > I modified my import script to use bulk imports, so instead of 10
> > > > > > insertions a second, I now end up doing one bulk insertion every ten
> > > > > > seconds. I had it up to a minute, but I think inserting 600-800
> > > > > > records in one bulk request was causing some problems, so I shortened
> > > > > > the frequency.
>
> > > > > > > so far I'm not seeing any serious delays in testing this week, but
> > > > > > tomorrow I'll do some bigger load testing with our big index. it seems
> > > > > > promising at the moment!
>
> > > > > > On Jan 20, 2:26 pm, Shay Banon <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > Hard to tell if it's GC; you can monitor it using bigdesk to see changes
> > > > > > > > and see how memory is behaving. Though you say you have a 30 minute "pause",
> > > > > > > > which is strange. Did you check the refresh stats? Also, when this happens,
> > > > > > > > can you simply get by id the relevant new / modified document?
>
> > > > > > > On Fri, Jan 20, 2012 at 5:58 PM, Oren Mazor <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > Yup. I've done direct queries for a document that should be there, and
> > > > > > > > even 30 minutes later, it is still not available.
>
> > > > > > > > based on the semi-regular pattern of these delays, I'm wondering if
> > > > > > > > there's some kind of memory or gc issue playing up?
>
> > > > > > > > we have two nodes with 16gb/32 on the first, and 10/24 on the second.
>
> > > > > > > > On Jan 20, 10:06 am, Shay Banon <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > > > It makes little sense to use query_string as a filter, I suggest you don't
> > > > > > > > > > do that. But, even when using it as a filter, you should still see changes.
> > > > > > > > > > Can you verify its not the query? i.e. just search for a document recently
> > > > > > > > > > added and see if you get it back?
>
> > > > > > > > > > On Fri, Jan 20, 2012 at 8:07 AM, Oren Mazor <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > > > also, its probably worth sharing my frontend's query:
>
> > > > > > > > > > {
> > > > > > > > > >  "filter" : {
> > > > > > > > > >    "and" : [
> > > > > > > > > >      {
> > > > > > > > > >        "term": {
> > > > > > > > > >          "SID": $num
> > > > > > > > > >        }
> > > > > > > > > >      },
> > > > > > > > > >      {
> > > > > > > > > >        "query": {
> > > > > > > > > >          "query_string" : {
> > > > > > > > > >            "default_operator" : "AND",
> > > > > > > > > >            "fields": ["X","Y"],
> > > > > > > > > >            "query" : "$QUERY"
> > > > > > > > > >          }
> > > > > > > > > >        }
> > > > > > > > > >      }
> > > > > > > > > >    ]
> > > > > > > > > >  },
> > > > > > > > > >  "sort" : [
> > > > > > > > > >    {
> > > > > > > > > >      "Y" : {
> > > > > > > > > >        "order" : "desc"
> > > > > > > > > >      }
> > > > > > > > > >    }
> > > > > > > > > >  ],
> > > > > > > > > >  "size" : 1
> > > > > > > > > > }'
>
> > > > > > > > > > I understand that there is no caching involved with the AND filter,
> > > > > > > > > > but sort is a different matter (Y is a date)
>
> > > > > > > > > > On Jan 20, 12:21 am, Oren Mazor <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > > > > yup. I can see an insertion request going into ES (but not the
> > > > > > > > > > > response. now that I think of it), but running my query shows no
> > > > > > > > > > > record is available for that item.
>
> > > > > > > > > > > all of our records are virtually the same size (about 1kb), and the
> > > > > > > > > > > most insertions we'd be seeing is 10-20 per second. occasionally that
> > > > > > > > > > > might go up to 50.
>
> > > > > > > > > > > how often does refresh happen by default, and how long does it take?
>
> > > > > > > > > > > I'm wondering if 10 shards is not enough for the size of our index.
>
> > > > > > > > > > > On Jan 18, 4:13 pm, Shay Banon <[hidden email] (http://gmail.com)> wrote:
>
> > > > > > > > > > > > > Does this happen with search request, where you see the old data? By
> > > > > > > > > > > > > default, elasticsearch will refresh an index to see newly indexed docs (or
> > > > > > > > > > > > > deletes) every second. Can you use the index stats API to see if there was
> > > > > > > > > > > > > a bump in how long it took to refresh (there are refresh stats there).
>
> > > > > > > > > > > > > On Wed, Jan 18, 2012 at 8:15 AM, Oren Mazor <[hidden email] (http://gmail.com)> wrote:
> > > > > > > > > > > > > Hi all,
>
> > > > > > > > > > > > > > We've deployed elasticsearch in our production and we're incredibly
> > > > > > > > > > > > > > happy with search performance. However, we're seeing occasional
>
> ...

Re: ES Index performance

medcl.net
Hi Oren,
did you check the response of the bulk operation? Were all the items indexed
successfully?
Also check your translog status.
You could also manually refresh the index and see whether you then get the current version.
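
The manual check suggested here could be scripted roughly as follows. This is a minimal sketch under assumptions: the node address, the index name `records`, the type `record`, and the id `42` are all hypothetical, and the URL layout (`/{index}/_refresh` and `/{index}/{type}/{id}`) is the one used by Elasticsearch releases of this era, so verify it against the version you run.

```python
import json
import urllib.request

BASE = "http://localhost:9200"  # assumption: local node on the default HTTP port


def refresh_url(index):
    """URL of the refresh endpoint for one index."""
    return f"{BASE}/{index}/_refresh"


def get_url(index, doc_type, doc_id):
    """URL of the get-by-id endpoint for one document."""
    return f"{BASE}/{index}/{doc_type}/{doc_id}"


def refresh_then_get(index, doc_type, doc_id):
    """Force a refresh, then fetch the document by id and return the parsed body."""
    req = urllib.request.Request(refresh_url(index), data=b"", method="POST")
    urllib.request.urlopen(req).close()
    # urlopen raises urllib.error.HTTPError if the document does not exist (404).
    with urllib.request.urlopen(get_url(index, doc_type, doc_id)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URLs; call refresh_then_get() against a live node yourself.
    print(refresh_url("records"))
    print(get_url("records", "record", "42"))
```

If the document returned after a manual refresh is still the old version, the refresh interval is probably not the cause and the write itself deserves a closer look.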


---------------------
Medcl
http://log.medcl.net

Re: ES Index performance

Oren Mazor
Yup. the bulk operations are all okay, at least as far as the http
response is concerned.

I'm almost certain that my problem is just that we're hitting some
resource limit for the size of our index (40gb), but I can't figure out
where to find the blockage. I'm watching the stats on the cluster and
seeing nothing other than flat/healthy usage.

I am seeing a higher than normal read/write activity over the past 24
hours (huge number of documents added)
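
One way to make cumulative totals such as the 17h of refresh time mentioned earlier in the thread easier to read is to divide them by the operation count. The sketch below assumes the refresh and merge sections of the index stats output expose `total` and `total_time_in_millis` fields; check the field names against your version. The sample numbers are hypothetical, chosen so that the total time equals 17 hours.

```python
def average_ms(stats_section):
    """Mean duration in ms of one operation, or None if none have run."""
    total = stats_section.get("total", 0)
    if total == 0:
        return None
    return stats_section["total_time_in_millis"] / total


# Hypothetical numbers shaped like the "refresh" section of the stats output:
# 61,200,000 ms is 17 hours of cumulative refresh time.
sample = {"total": 612000, "total_time_in_millis": 61200000}
print(average_ms(sample))  # 100.0
```

Sampling this value periodically and comparing successive readings would show whether refreshes slow down around the time the stale results appear, which a single cumulative total cannot.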

On Feb 9, 9:15 pm, <[hidden email]> wrote:

> Hi Oren,
> did you check the response of the bulk operation? Were all the items indexed
> successfully?
> Also check your translog status.
> You could also manually refresh the index and see whether you then get the current version.

Re: ES Index performance

kimchy
Administrator
The bulk response contains an entry per item saying whether that item succeeded (and, if it failed, the failure itself), so you need to check the actual response body, not just the HTTP status. Also, can you try a newer Java version? The one you are using is pretty old.
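
Checking the body item by item might look like the sketch below. It assumes the parsed response has an `items` list in which each entry is a single-key object named after the action (`index`, `create`, `delete`) and that a failed entry carries an `error` field; the exact shape varies between Elasticsearch versions, and the sample response is hypothetical.

```python
def failed_items(bulk_response):
    """Return (position, action, error) for every bulk item that carries an error."""
    failures = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key object such as {"index": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((position, action, result["error"]))
    return failures


# Hypothetical response body, already parsed from JSON.
sample = {
    "items": [
        {"index": {"_id": "1", "ok": True}},
        {"index": {"_id": "2", "error": "hypothetical failure message"}},
    ]
}
print(failed_items(sample))  # [(1, 'index', 'hypothetical failure message')]
```

Because the position matches the order of actions in the request, a failed entry can be mapped back to the record that was sent and retried or logged.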

On Saturday, February 11, 2012 at 11:47 PM, Oren Mazor wrote:

> Yup. the bulk operations are all okay, at least as far as the http
> response is concerned.
>
> I'm almost certain that my problem is just that we're hitting some
> resource limit for the size of our index (40gb), but I can't figure out
> where to find the blockage. I'm watching the stats on the cluster and
> seeing nothing other than flat/healthy usage.
>
> I am seeing a higher than normal read/write activity over the past 24
> hours (huge number of documents added)

On Feb 9, 9:15 pm, <medcl2...@gmail.com> wrote:
hi,OrenMazor
did you checked the response of the bulk  operation,are they all successful
indexed?
and also check your translog status.
also manually refresh the index and to see if can get the current version







-----Original Message-----
From:OrenMazor
Sent: Friday, February 10, 2012 2:25 AM
To: elasticsearch
Subject: Re: ES Index performance

Issuing a Get does not return the correct version.

we're using OpenJDK java version 1.6.0_18, not on EC2, with debian. we
have 1 replica and 10 shards.

I'm suspecting an IO issue, to be honest, given that if there were
indexing performance issues somebody may have seen them already, what
with "select isn't broken" :)

that said, we do have a pretty high rate of indexes, so maybe at scale
some issue pops up ?

On Feb 7, 2:00 pm, Shay Banon <kim...@gmail.com> wrote:
Going back to your question, do you see that issuing a Get (which is
realtime) does not return the correct version of the data? I would be
helpful to understand where the stalling is coming from. If a "get" does
not return your expect version of the data, it means that it didn't get
indexed, so you will need to look at the indexer code and see if maybe
something is stalling on the bulk API execution.

The stalling can be for many reasons, starting with slow IO, not enough
resources on the machine CPU/Mem, overloading the machines you have in the
cluster, GC .

Which JVM version are you using? Are you running on EC2? If so, which
instances / os version? How many shards do you have in the index?

On Tuesday, February 7, 2012 at 5:04 PM,OrenMazor wrote:
hi Shay,

we use server density to keep track of ES and I'm not seeing any
spikes in resource use at all. I'm suspecting we're just pushing it
more than most people do?

I do use the bulk API to send perhaps 1000 insertions every 10 seconds
on average. in some cases these are new records, and sometimes they
are new versions of existing records. I also send some amount of
deletes every minute, but these are not really using the bulk api.

could you recommend a more low level primer on ES? I'd love to have a
more low level understanding of how/why things work. it'll make it
easier for me to tune my algorithms. I can go through the source, but
if there're some papers out there I could read, that'd be better :)

thanks!
Oren

PS. I noticed there is now a mongodb river in development. I'm
wondering whether my efforts might be better spent helping it to
production status rather than trying to tune my own code..

On Feb 5, 4:31 pm, Shay Banon <[hidden email] (http://gmail.com)>
wrote:
On Saturday, February 4, 2012 at 3:18 AM,OrenMazor wrote:
so, I'm starting to see these again on heavy really heavy load (lets
say around 10k insertions a minute)

Whats the behavior of elasticsearch in this case? Memory usage ok?
When you say 10k inserts per minute, is that using the bulk API?  How
many clients are indexing the data?

I'm still having some difficulty wrapping my head around the
algorithm
in the bottom end. the refresh total_time is 17h and merges is
14.9h.
this seems pretty ambiguous. I'm guessing its the total time spent
executing these actions rather than the time since, right?

Yes, thats the total time that was spent doing it.

are there some hardware settings I can make to make lucene go
faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start
with
and learning more there.

On Jan 25, 9:37 am, Shay Banon <[hidden email] (http://gmail.com)>
wrote:
Great, thanks for the update!

On Wednesday, January 25, 2012 at 8:54 AM,OrenMazor wrote:
Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the interval.

so far I'm not seeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!

On Jan 20, 2:26 pm, Shay Banon <kim...@gmail.com
Hard to tell if it's GC, you can monitor it using bigdesk to see
changes, see how memory is behaving. Though you say you have a 30
minute "pause", which is strange. Did you check the refresh stats?
Also, when this happens, can you simply get by id the relevant new /
modified document?

On Fri, Jan 20, 2012 at 5:58 PM, Oren Mazor
Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

On Jan 20, 10:06 am, Shay Banon <kim...@gmail.com
It makes little sense to use query_string as a filter, I suggest you
don't do that. But, even when using it as a filter, you should still
see changes. Can you verify it's not the query? i.e. just search for a
document recently added and see if you get it back?

On Fri, Jan 20, 2012 at 8:07 AM, Oren Mazor wrote:
also, it's probably worth sharing my frontend's query:

{
 "filter" : {
   "and" : [
     {
       "term": {
         "SID": $num
       }
     },
     {
       "query": {
         "query_string" : {
           "default_operator" : "AND",
           "fields": ["X","Y"],
           "query" : "$QUERY"
         }
       }
     }
   ]
 },
 "sort" : [
   {
     "Y" : {
       "order" : "desc"
     }
   }
 ],
 "size" : 1
}

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)
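
One way the advice in this thread (don't use query_string as a filter)
might be applied to the query above is to move query_string into the
query part and keep only the SID term as a filter. This is a sketch,
not a tested fix: it assumes the "filtered" query type offered by
elasticsearch releases of this period (later releases replaced it with
a bool query that has a filter clause), and the function name is made
up. Building the body as a dict also means the user's text is escaped
by the JSON encoder instead of being pasted into a template:

```python
import json


def build_search(sid, query_text, fields=("X", "Y"), sort_field="Y"):
    """Build a search body: query_string as the query, SID as the filter.

    Field names X, Y and SID come from the query above; the "filtered"
    wrapper is an assumption about the server version in use.
    """
    return {
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "fields": list(fields),
                        "query": query_text,
                    }
                },
                "filter": {"term": {"SID": sid}},
            }
        },
        # Newest first, and only the single latest record is returned.
        "sort": [{sort_field: {"order": "desc"}}],
        "size": 1,
    }


if __name__ == "__main__":
    print(json.dumps(build_search(42, "some words"), indent=1))
```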

On Jan 20, 12:21 am, Oren Mazor <oren.ma...@gmail.com
yup. I can see an insertion request going into ES (but not the
response. now that I think of it), but running my query shows no
record is available for that item.

all of our records are virtually the same size (about 1kb), and the
most insertions we'd be seeing is 10-20 per second. occasionally that
might go up to 50.

how often does refresh happen by default, and how long does it take?

I'm wondering if 10 shards is not enough for the size of our index.

On Jan 18, 4:13 pm, Shay Banon <kim...@gmail.com

Does this happen with search requests, where you see the old data? By
default, elasticsearch will refresh an index to see newly indexed docs
(or deletes) every second. Can you use the index stats API to see if
there was a bump in how long it took to refresh (there are refresh
stats there).

On Wed, Jan 18, 2012 at 8:15 AM, Oren Mazor wrote:
Hi all,

We've deployed elasticsearch in our production and we're incredibly
happy with search performance. However, we're seeing occasional

...
