Design help: What do I gain with Redis?

Design help: What do I gain with Redis?

thealy
Running Elasticsearch version 0.90.3 on 10 nodes. Two indexes in use with replication 4, shards 5. Just over 200 million records total so far. Records are email and HTTP proxy log information, sent to dedicated Java socket-based apps directly from syslog-ng.

I've been using a TCP feed from a central logging machine to home-grown apps that parse and combine records before inserting them into ES via the Java API. I'm about to add several new feeds using the same plan. But I'm wondering about the role that Redis played in the initial test installation. Should I be feeding the log stream via Redis? What do I gain? Buffering? And if I add it into the flow, should it go before or after my parsing application?

Thanks for any help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

Re: Design help: What do I gain with Redis?

joergprante@gmail.com
Not sure about your requirements. And good answers about Redis are rare on an ES list. From my understanding, Redis is not really a buffer; it is a persistent queue, and it supports a Babylonian plenitude of clients. So if you want a (central) store with transactional data flow, available for post-processing besides ES, with your devs and sysops loving polyglot architectures, Redis may be the answer.

If you just want to collect and index log messages for yourself, much simpler setups can be imagined. Solutions like logstash or rsyslog are available. As you also like to grow your own apps at home, there is nothing to prevent you from writing a plugin for ES that starts a syslog daemon, parses the log messages, and indexes them ;-)

Jörg


Re: Design help: What do I gain with Redis?

Mark Walkom
Redis isn't persistent; see http://en.wikipedia.org/wiki/Redis#Persistence for details on what persistence means in Redis terms, but the default behaviour of Redis is not to store anything outside of memory.

It's used primarily as an in-memory, temporary key/value store that provides very fast access. Logstash's recommended setup uses Redis as a queuing service between its various instances; however, it used to use an AMQP system, RabbitMQ, which provides actual persistence. I'm not sure why they swapped though.
If you are taking syslog-ng data and then parsing it in a custom Java app, then Logstash might be a viable alternative, as it would save you a bunch of work.

All that aside, can you clarify what you mean by "I'm wondering about the role that Redis played in the initial test installation", as that isn't really explained. Why did you install redis originally?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: [hidden email]
web: www.campaignmonitor.com


On 30 October 2013 08:03, [hidden email] <[hidden email]> wrote:
> Not sure about your requirements. And good answers to Redis are rare on an ES list. From my understanding, Redis is not really a buffer, it is a persistent queue, and supports a babylonian plentitude of clients. So if you want a (central) store with transactional data flow, available for post processing beside ES, with your devs and sysops loving polyglot architectures, Redis may be the answer.
>
> If you want just collect and index log messages for yourself, much simpler setups can be imagined. Solutions like logstash or rsyslog are available. As you are also like to grow your own apps at home, there is nothing to prevent you from writing a plugin for ES that starts a syslog daemon, parse the log messages, and index them ;-)
>
> Jörg

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|

Re: Design help: What do I gain with Redis?

joergprante@gmail.com
To clarify, I'm a Redis user, so Redis is persistent of course. From the config file: "By default Redis asynchronously dumps the dataset on disk." You can even choose between different persistence modes if you want to trade durability against performance. And yes, when I kill the server or reboot, all the data is still there after the restart.

Redis can be configured for higher throughput than RabbitMQ; this might be the reason logstash recommends Redis.
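
For readers unfamiliar with the persistence modes mentioned here, the fragment below is a rough sketch of the redis.conf directives involved. The thresholds are illustrative values only, not a recommendation from this thread; check the redis.conf shipped with your Redis version for its actual defaults.

```
# Snapshotting (RDB): dump the dataset to disk if at least <changes> keys
# changed within <seconds>. Format: save <seconds> <changes>
# (illustrative thresholds)
save 900 1
save 300 10
save 60 10000

# Append-only file (AOF): log every write operation so less data is lost
# on a crash, at some cost in throughput.
appendonly yes
# fsync the AOF once per second (alternatives: always, no)
appendfsync everysec
```

With only snapshotting enabled, writes made after the last snapshot can be lost on a crash; enabling the append-only file narrows that window.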

Jörg


Re: Design help: What do I gain with Redis?

shadyabhi
Yes, it performs better than RabbitMQ. Most people using logstash use
Redis as a queue. It helps in building the infrastructure around a
multiple-producer, multiple-consumer model. So, if the task of creating
the JSON is CPU intensive and one machine can't handle it, you can start
doing it on 2 machines.
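
The multiple-producer, multiple-consumer idea can be sketched in a few lines of Python. This is only an in-process illustration under stated assumptions: `queue.Queue` stands in for a Redis list used as a work queue, threads stand in for separate machines, and the log line format and the names `producer`, `parse_worker`, and `run` are all hypothetical.

```python
import json
import queue
import threading

STOP = None  # sentinel telling a worker to exit


def producer(raw_lines, feed_name, count):
    """Simulate one log feed pushing raw lines onto the shared queue."""
    for i in range(count):
        # Hypothetical log format: "<feed> <method> <path> <status>"
        raw_lines.put(f"{feed_name} GET /page/{i} 200")


def parse_worker(raw_lines, parsed):
    """Pop raw lines, turn each into a JSON document, pass it on for indexing."""
    while True:
        line = raw_lines.get()
        if line is STOP:
            break
        feed, method, path, status = line.split()
        doc = {"feed": feed, "method": method, "path": path, "status": int(status)}
        parsed.put(json.dumps(doc))


def run(num_producers=2, num_workers=2, lines_per_feed=50):
    raw_lines = queue.Queue()  # stand-in for the Redis list
    parsed = queue.Queue()     # stand-in for the bulk-indexing step
    workers = [
        threading.Thread(target=parse_worker, args=(raw_lines, parsed))
        for _ in range(num_workers)
    ]
    producers = [
        threading.Thread(target=producer, args=(raw_lines, f"feed{n}", lines_per_feed))
        for n in range(num_producers)
    ]
    for t in workers + producers:
        t.start()
    for t in producers:
        t.join()
    for _ in workers:
        raw_lines.put(STOP)  # one sentinel per worker
    for t in workers:
        t.join()
    docs = []
    while not parsed.empty():
        docs.append(json.loads(parsed.get()))
    return docs


if __name__ == "__main__":
    print(len(run(num_producers=2, num_workers=3)), "documents parsed")
```

Adding parsing capacity here only means raising `num_workers`; the producers do not need to know how many consumers exist, which is the property the queue buys you.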

Hope it helps.

On Wed, Oct 30, 2013 at 1:56 PM, [hidden email]
<[hidden email]> wrote:

> To clarify, I'm a Redis user, so Redis is persistent of course, from the
> config file: "By default Redis asynchronously dumps the dataset on disk."
> You can even choose between different persistence modes if you want trade
> performance. And yes, I do kill the server, or on reboot, all data is there
> between restarts.
>
> Redis can be configured for higher throughput than RabbitMQ, this might be
> the reason to recommend Redis by logstash.
>
> Jörg



--
Regards,
Abhijeet Rastogi (shadyabhi)


Re: Design help: What do I gain with Redis?

thealy
Thank you all for your feedback about using Redis with ES. It seems that
for my application it will be appropriate to add it in when I either
have multiple consumers or need to apply multiple servers to parse a
given log stream for insertion as JSON.

On 10/30/2013 05:25 AM, Abhijeet Rastogi wrote:

> Yes, it's better performing than RabbitMQ. Most people in logstash use
> Redis as a queue. It helps in building the infra with multiple
> producer, multiple consumer model. So, if the task of creating a JSON
> is CPU intensive and one machine can't handle, you can start doing it
> on 2 machines.
>
> Hope it helps.
