We have been running several production sites using ES as our primary data store on EC2 and using the S3 gateway for full cluster backup and recovery.
The deprecation of the S3 gateway and the move toward local storage do not work for us. We run EC2 instances with no (or minuscule) local file system, high memory, and high IO. EBS does not work for us because we let Amazon Elastic Beanstalk control the creation and destruction of all our instances based on cluster load. We do not manage EBS volumes.
The S3 gateway has been working perfectly for us, so I ask: why deprecate the shared gateway solutions? Are there technical problems with their implementations?
Thanks for the link. I guess my problem is the deprecation of a feature like this because of a use case where performance takes priority over infrastructure. While that may suit many (most?) users of ES, we don't have any issues with the performance of ES when using an S3 gateway, and the convenience of using it as a backup is unrivaled compared to an API.
So, there is no technical reason for deprecating the S3 gateway, just that the necessary IO to an external storage system carries with it an implication of performance degradation that may impact some use cases?
I'm thankful that the shared gateway should be able to be maintained in a separate project, but isn't the option of an integrated data backup solution that survives cluster shutdown important enough to keep for those use cases where ES is the main source of truth?
Perhaps kimchy's assertion that using the shared gateway as a backup solution is a misuse of the feature is what I don't fully understand. We embed ES as our only data persistence layer in a running web application. We run on EC2 instances that are under the control of Elastic Beanstalk. There are no EBS volumes to snapshot. We use the S3 gateway to recover after full cluster shutdowns (version upgrades to our software). I agree that this is an unusual way of using ES, as most people deploy ES as a server unto itself, but the advantage it gives us in dynamically shrinking and growing our cluster capacity is unrivaled because everything is self-contained in the web app. I can run 1 server or 100 servers with the exact same WAR file deployed, and Amazon decides when to tear down or start up new servers solely by monitoring metrics. It may seem a bit strange, but it is also a beautiful thing to watch.
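For anyone following along who has not used the S3 gateway, here is a minimal sketch of the kind of node settings involved, assuming the cloud-aws plugin is installed. The bucket name and credentials are placeholders, not our actual values, and the exact keys may differ between ES versions.

```yaml
# Sketch only: placeholder values, assumes the cloud-aws plugin is installed.
cloud:
    aws:
        access_key: YOUR_ACCESS_KEY   # placeholder
        secret_key: YOUR_SECRET_KEY   # placeholder

gateway:
    type: s3                          # persist cluster state and indices to S3
    s3:
        bucket: your-bucket-name      # placeholder bucket
```

With settings along these lines, a freshly started node with an empty local disk can recover the cluster state and indices from the bucket after a full cluster shutdown, which is the behavior we depend on.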