Difference between db and elasticsearch

Mohit Anchlia
Could someone shed some light on this? I can't think of a reason why
elasticsearch shouldn't simply be used as a real-time database. With a
DB you write data and then, for the most part, run queries on it. So if
elasticsearch is scalable and distributed, why can't it be used as the
DB itself? I am trying to understand the disadvantages of treating
elasticsearch as a database.

Re: Difference between db and elasticsearch

Otis Gospodnetic
Hi Mohit,

You are not the first person to wonder about this.  Check the mailing
list archives.  Yes, some people use ES and similar systems in place of
technologies with "database" in their names.

Otis


Re: Difference between db and elasticsearch

Douglas Muth
In reply to this post by Mohit Anchlia
On Wed, Jan 4, 2012 at 8:00 PM, Mohit Anchlia <[hidden email]> wrote:
> Please throw some light on this because I am unable to think of
> reasons why elasticsearch shouldn't just be used as real time
> database. Since with DB you need to write and then for most part run
> queries on them. So if elasticsearch is scalable and distributed then
> why can't elasticsearch be used as db itself? I am trying to
> understand the disadvantages of thinking elasticsearch as database.

In my experience, some things are much easier to do in a database
than in ES, such as GROUP BY and especially ORDER BY operations.

If you're looking for a database that does sharding/replication, check
out Cassandra.

-- Doug

Re: Difference between db and elasticsearch

nurikabe
We are experimenting with ES as a database of sorts for documents.  I wouldn't want to use it for complex relational data, but it's great for documents and corresponding metadata.

Re: Difference between db and elasticsearch

Ivan Brusic
It all depends on what your requirements are.

Despite all the advances that Lucene has made (especially with the
latest 3.5 release), Lucene (and therefore ElasticSearch) is still not
realtime. Not everyone needs access to the latest commit, so
ElasticSearch works great in those scenarios.

Just like most other NoSQL systems, you lose many RDBMS benefits such
as transactions and joining between tables (types/docs/whatever). The
loss of atomic updates is a big hurdle for many. Working with
documents, however, can be very liberating after being stuck in the
relational world for years.

--
Ivan


Re: Difference between db and elasticsearch

Stanislas Polu
Can someone elaborate on the realtime limitations of elastic search?

My understanding was that it was pretty good at it. And that's been verified for me so far.

Cheers

-stan


Re: Difference between db and elasticsearch

Ivan Brusic
There is a reason why Lucene is calling it "near real time" and not
real time. :)

Lucene is good at it, but there are certain scenarios where a commit
needs to be propagated immediately, which Lucene cannot handle.

http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html
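
To make the "near real time" point concrete, here is a minimal toy model in Python. It is not Lucene or ElasticSearch code; the class and method names are made up for illustration. It only shows the general idea that a write sits in a buffer and becomes searchable after the next refresh, not at the moment it is indexed.

```python
class NearRealTimeIndex:
    """Toy model: writes are buffered and only become visible to
    searches after refresh() is called (hypothetical names)."""

    def __init__(self):
        self._searchable = {}  # documents visible to search
        self._buffer = {}      # documents indexed but not yet refreshed

    def index(self, doc_id, text):
        # The write is accepted, but it is not searchable yet.
        self._buffer[doc_id] = text

    def refresh(self):
        # Make everything indexed so far visible to searches.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self, word):
        # Return ids of visible documents containing the word.
        return sorted(
            doc_id
            for doc_id, text in self._searchable.items()
            if word in text.lower().split()
        )


idx = NearRealTimeIndex()
idx.index(1, "The quick fox")
before = idx.search("fox")  # [] because no refresh has happened yet
idx.refresh()
after = idx.search("fox")   # [1]
```

If an application needs a search issued right after a write to see that write, this gap between `index` and `refresh` is the limitation being described.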

--
Ivan


Re: Difference between db and elasticsearch

Karussell
In reply to this post by Douglas Muth
Here are some links (it all depends on your requirements):

http://stackoverflow.com/questions/6636508/elastic-search-as-a-database

> It's been my experience much easier to do some things in databases
> that you can't easily in ES, such GROUP BY and especially ORDER BY
> operations.


What is the problem with ORDER BY and ElasticSearch?

Peter.

Re: Difference between db and elasticsearch

Douglas Muth
On Thu, Jan 5, 2012 at 4:43 PM, Karussell <[hidden email]> wrote:
>
> What is the problem with ORDER BY and ElasticSearch?
>

Sorting by an integer?  Not a problem.  Sorting by something that's a
single word?  Not a problem.  Sorting by a field that has multiple
words?  Problem.

As I understand it, if you index the following string: "the quick fox
jumps over the lazy cheetah", Elastic Search analyzes it and stores it
like this:

["the", "quick", "fox", "jumps", "over", "the", "lazy", "cheetah"]

Kinda hard to sort an array, no? :-)
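
A rough Python sketch of why this gets awkward. The `analyze` function below is only a crude stand-in for a real analyzer (it ignores stop words, stemming and so on), and the venue names are invented. Which token a real engine would use as the sort key is not something this sketch claims to know; picking the smallest token is just one arbitrary choice, used to show that the result differs from sorting on the whole value.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


tokens = analyze("the quick fox jumps over the lazy cheetah")
# ['the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'cheetah']

# Hypothetical venue names, for illustration only.
venues = ["The Trocadero", "Electric Factory", "World Cafe Live"]

# Sorting on the whole, unanalyzed value gives the familiar alphabetical order.
by_whole_value = sorted(venues, key=str.lower)
# ['Electric Factory', 'The Trocadero', 'World Cafe Live']

# Sorting on a single token (here, arbitrarily, the smallest one) does not.
by_smallest_token = sorted(venues, key=lambda v: min(analyze(v)))
# ['World Cafe Live', 'Electric Factory', 'The Trocadero']
```

Once a value has been broken into several tokens there is no single obvious sort key, which is why the workarounds all involve keeping the original value around in some form.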

Now, there are some workarounds, such as not analyzing that field, or
storing an "untouched" version of that field alongside the analyzed
version.  However, the former method means you can only match the
field's exact value rather than search for words within it, and the
latter method means more disk space is used up.

At least, I'm about 90% sure that this is how it works, as I just
dealt with this issue for the first time uh, 2 days ago.  If I'm
horribly wrong, someone please correct me. :-P

-- Doug
http://twitter.com/dmuth

Re: Difference between db and elasticsearch

Lukáš Vlček
Hi,

Did you check script based sorting? That means, analyze the field, but sort based on the original _source value of that field.


Regards,
Lukas


Re: Difference between db and elasticsearch

plaflamme
In reply to this post by Douglas Muth
And how would you expect a database to sort rows with a column containing "the quick brown fox..."?

Isn't this what "scoring" results is about? Using term frequencies and other goodies...?

Philippe


Re: Difference between db and elasticsearch

Douglas Muth
On Thu, Jan 5, 2012 at 7:38 PM, Philippe Laflamme
<[hidden email]> wrote:
> And how would you expect a database to sort rows with a column containing
> "the quick brown fox..."?
>

Alphabetically, of course.

The issue I ran into the other day was trying to sort results by the
name of a venue, ignoring the score.  Easily enough done in a
traditional SQL database, but a little more difficult in Elastic
Search.  (Of course, this meant completely disregarding the scoring
of the results...)
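
For comparison, this is roughly what the SQL side of that looks like, sketched with Python's built-in sqlite3 module. The table, columns and venue names are invented for the example; the point is only that alphabetical ordering on a multi-word text column is a single ORDER BY clause.

```python
import sqlite3

# In-memory database with a hypothetical venues table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE venues (name TEXT)")
conn.executemany(
    "INSERT INTO venues (name) VALUES (?)",
    [("World Cafe Live",), ("Electric Factory",), ("The Trocadero",)],
)

# Sort alphabetically on the whole column value, ignoring case.
rows = conn.execute(
    "SELECT name FROM venues ORDER BY name COLLATE NOCASE"
).fetchall()
names = [row[0] for row in rows]
# ['Electric Factory', 'The Trocadero', 'World Cafe Live']
conn.close()
```

The database compares the whole stored string, so there is no question of which word to sort on.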

-- Doug

Re: Difference between db and elasticsearch

medcl.net
Has anybody done a performance test with script-based sorting?
I am wondering whether sorting a large dataset with a script will be very slow.

---------------------
Medcl
http://log.medcl.net

Re: Difference between db and elasticsearch

kimchy
Administrator
In reply to this post by Douglas Muth
You can always mark the field as not analyzed in the mapping (or use multi field mapping for analyzed and not analyzed versions) and then sort based on it.
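
As a rough sketch of what such a mapping and sort request might look like, here are the JSON bodies built as Python dicts. The field name `name` and the sub-field names `untouched` and `raw` are made up for the example, and the exact mapping syntax depends on the ElasticSearch version: the first form follows the `multi_field` style in use around the time of this thread, the second the `text`/`keyword` style of later releases. Treat both as illustrations to check against the documentation for your version, not as a definitive recipe.

```python
import json

# Older style (around the time of this thread): a multi_field with an
# analyzed version for searching and a not_analyzed version for sorting.
mapping_multi_field = {
    "properties": {
        "name": {
            "type": "multi_field",
            "fields": {
                "name": {"type": "string", "index": "analyzed"},
                "untouched": {"type": "string", "index": "not_analyzed"},
            },
        }
    }
}

# Later style: an analyzed text field with a keyword sub-field.
mapping_keyword_subfield = {
    "properties": {
        "name": {
            "type": "text",
            "fields": {"raw": {"type": "keyword"}},
        }
    }
}

# Search body that ignores relevance order and sorts on the unanalyzed copy.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"name.untouched": "asc"}],
}

print(json.dumps(search_body, indent=2))
```

With either mapping, full-text queries go against `name` while the sort uses the sub-field that holds the original, untokenized value, at the cost of indexing the value twice.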
