Using ES as a primary datastore.

P Suman
Hello,

We are planning to use ES as a primary datastore.

Here is my use case:

We receive a million transactions per day (all are inserts).
Each transaction is around 500 KB in size and has 10 fields, and we should be able to search on all 10 fields.
We want to keep around 1 year's worth of data, which comes to around 180 TB.
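
As a quick sanity check on the 180 TB figure, here is a minimal Python sketch of the arithmetic. It assumes decimal units and 365 days of retention, and it ignores replicas and index overhead, none of which are specified in the post.

```python
# Back-of-the-envelope check of the raw data volume described above.
# Assumptions (not from the post): decimal units, 365-day year,
# no replicas and no indexing overhead.
TRANSACTIONS_PER_DAY = 1_000_000
TRANSACTION_SIZE_KB = 500
RETENTION_DAYS = 365


def raw_volume_tb(per_day, size_kb, days):
    """Return the raw (unreplicated) volume in terabytes."""
    total_kb = per_day * size_kb * days
    return total_kb / 10**9  # 1 TB = 10**9 KB in decimal units


daily_gb = TRANSACTIONS_PER_DAY * TRANSACTION_SIZE_KB / 10**6
yearly_tb = raw_volume_tb(TRANSACTIONS_PER_DAY, TRANSACTION_SIZE_KB, RETENTION_DAYS)
print(f"{daily_gb:.0f} GB/day, {yearly_tb:.1f} TB/year")
```

That is roughly 500 GB of new data per day before any replica copies are counted, so the stored volume on disk would be a multiple of the raw figure.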

Can you please let me know of any problems that might arise if I use Elasticsearch as the primary datastore?



Regards,
Suman




--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/f52b61e2-0955-4e79-8bb8-61c9428c67d1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Using ES as a primary datastore.

Mark Walkom
That's a lot of data. Do you have a big budget, automation, and monitoring in place?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com

On 17 September 2014 20:41, P Suman <[hidden email]> wrote:
Hello,

We are planning to use ES as a primary datastore.

Here is my use case:

We receive a million transactions per day (all are inserts).
Each transaction is around 500 KB in size and has 10 fields, and we should be able to search on all 10 fields.
We want to keep around 1 year's worth of data, which comes to around 180 TB.

Can you please let me know of any problems that might arise if I use Elasticsearch as the primary datastore?



Regards,
Suman





Re: Using ES as a primary datastore.

Thomas Bolis
In reply to this post by P Suman
Hi,

You first have to work out how much data you can keep in one shard, then divide your total volume into the number of shards you will maintain, and then scale out to a matching number of nodes. At the very least, you should grow your cluster as your volume grows.
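
To make those sizing steps concrete, here is a rough Python sketch. The 50 GB shard limit, the single replica, and the 20 shard copies per node are placeholder assumptions, not recommendations; the real per-shard capacity has to be measured with your own data and queries.

```python
import math


def plan_cluster(total_gb, max_shard_gb, replicas, shard_copies_per_node):
    """Estimate primary shards, total shard copies, and node count.

    All inputs are assumptions to be replaced with measured values.
    """
    primaries = math.ceil(total_gb / max_shard_gb)
    copies = primaries * (1 + replicas)  # primaries plus their replicas
    nodes = math.ceil(copies / shard_copies_per_node)
    return primaries, copies, nodes


# 182,500 GB is one year of raw data at 500 GB/day (decimal units).
primaries, copies, nodes = plan_cluster(
    total_gb=182_500,
    max_shard_gb=50,           # hypothetical per-shard limit
    replicas=1,                # hypothetical replica count
    shard_copies_per_node=20,  # hypothetical node capacity
)
print(primaries, copies, nodes)
```

Changing any one of the three placeholders moves the node count a lot, which is why the per-shard test comes first.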

It is difficult to predict what problems may arise because your case is too generic. What will the cluster be used for? What queries will you run? Will you mostly index and only occasionally query, or will you query your data intensively?

Most importantly, you need to think about how you will partition your data: will you have one index, or multiple indices as in the Logstash approach?
Maybe check here: https://www.found.no/foundation/sizing-elasticsearch/

What will you do with data older than a year, delete it? Can you afford to lose data? Will you keep backups?
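
One way to picture the Logstash-style approach and the one-year retention question is a daily index per date, with whole indices dropped once they age out. The sketch below only computes index names and picks the expired ones. The `transactions` prefix and both helper functions are hypothetical; the date pattern mirrors Logstash's default `logstash-YYYY.MM.dd` naming, and actually deleting an index is left to whatever client or tooling you use.

```python
from datetime import date, datetime, timedelta

PREFIX = "transactions"  # hypothetical index prefix


def index_for(day, prefix=PREFIX):
    """Name of the daily index that a transaction from `day` goes into."""
    return f"{prefix}-{day:%Y.%m.%d}"


def expired_indices(existing, today, retention_days=365, prefix=PREFIX):
    """Return the daily indices whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue  # not one of our daily indices
        day = datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()
        if day < cutoff:
            expired.append(name)
    return expired


today = date(2014, 9, 17)
existing = [index_for(today - timedelta(days=n)) for n in (0, 365, 366)]
print(index_for(today))
print(expired_indices(existing, today))
```

With daily indices, expiring old data becomes a matter of dropping whole indices rather than deleting individual documents.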

IMHO, these are some of the questions you must answer in order to see whether such an approach suits your needs. It comes down to hardware, structure, and the partitioning of your data.

Thomas


Re: Using ES as a primary datastore.

Alex Kamil
ES is a fantastic search engine, but there is some risk of data loss, and a few other potential disadvantages which might or might not be relevant to you. You can always combine ES, via the JDBC river, with a stable, secure database, e.g. MySQL or HBase. Since you have lots of data, HBase might be a better option.


On Wed, Sep 17, 2014 at 8:04 AM, Thomas <[hidden email]> wrote:
Hi,

You first have to work out how much data you can keep in one shard, then divide your total volume into the number of shards you will maintain, and then scale out to a matching number of nodes. At the very least, you should grow your cluster as your volume grows.

It is difficult to predict what problems may arise because your case is too generic. What will the cluster be used for? What queries will you run? Will you mostly index and only occasionally query, or will you query your data intensively?

Most importantly, you need to think about how you will partition your data: will you have one index, or multiple indices as in the Logstash approach?

What will you do with data older than a year, delete it? Can you afford to lose data? Will you keep backups?

IMHO, these are some of the questions you must answer in order to see whether such an approach suits your needs. It comes down to hardware, structure, and the partitioning of your data.

Thomas


RE: Using ES as a primary datastore.

Doug Turnbull
In reply to this post by P Suman
I'd also suggest checking out DataStax Enterprise, a commercial flavor of Cassandra. It's Cassandra, so update rates and volume are its strong suit. It's intended as a primary data store. It has a Solr (another search engine) instance on each node that indexes the local data on that node, enabling full-text search. Solr is not nearly as user-friendly as Elasticsearch, but otherwise there are a lot of comparable features, depending on your search needs.

Doug

From: [hidden email]
Sent: 9/17/2014 8:48 AM
To: [hidden email]
Subject: Re: Using ES as a primary datastore.

ES is a fantastic search engine, but there is some risk of data loss, and a few other potential disadvantages which might or might not be relevant to you. You can always combine ES, via the JDBC river, with a stable, secure database, e.g. MySQL or HBase. Since you have lots of data, HBase might be a better option.



Re: Using ES as a primary datastore.

Elvar Böðvarsson
In reply to this post by P Suman
As others have mentioned, there is a risk of data loss at the moment, but Elasticsearch is working on making it better and better.

I watched this video about Couchbase and Elasticsearch, https://www.youtube.com/watch?v=rpwtxpmuDb0

After watching that, I highly recommend researching Couchbase as an option.


Re: Using ES as a primary datastore.

Alex Kamil
In the spirit of this thread, let me plug my favorite database here: Connecting Hbase to Elasticsearch in 10 min or less

On Mon, Nov 24, 2014 at 2:25 PM, Elvar Böðvarsson <[hidden email]> wrote:
As others have mentioned, there is a risk of data loss at the moment, but Elasticsearch is working on making it better and better.

I watched this video about Couchbase and Elasticsearch, https://www.youtube.com/watch?v=rpwtxpmuDb0

After watching that, I highly recommend researching Couchbase as an option.
