Handle SIGTERM gracefully #48
This is a good idea. On the one hand we can avoid this by rolling in new instances, but on the other, it's much easier to just run a deploy command in OpsWorks. It might be worth considering pushing the scope of this problem outward into an OpsWorks tool that rolls in the deploy for us. That way it's solved for any service in an OpsWorks layer.

Or, maybe there's a way to not require rolling deploys, but still handle this mostly outside the actual process. I wonder if we can deregister the instance from the ELB, wait until it's deregistered, and then re-register it once it's restarted. I'm assuming the wait step here handles the connection draining for us, and that OpsWorks wouldn't fight us and try to re-register the instance in the interim because it's still in the layer.
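For reference, here's a minimal sketch of that deregister / wait / re-register sequence using boto3 against a Classic ELB. The load balancer name and instance ID are placeholders, and this assumes connection draining is enabled on the ELB, so that the deregistration waiter only returns once in-flight requests have drained:

```python
import boto3

# Hypothetical names -- substitute the real ELB name and instance ID.
ELB_NAME = "tileserver-elb"
INSTANCE_ID = "i-0123456789abcdef0"

elb = boto3.client("elb")  # Classic ELB API

# Pull the instance out of rotation. With connection draining enabled,
# the ELB stops routing new requests but lets in-flight ones finish.
elb.deregister_instances_from_load_balancer(
    LoadBalancerName=ELB_NAME,
    Instances=[{"InstanceId": INSTANCE_ID}],
)

# Block until the ELB reports the instance fully deregistered,
# i.e. draining is complete and it's safe to restart the process.
elb.get_waiter("instance_deregistered").wait(
    LoadBalancerName=ELB_NAME,
    Instances=[{"InstanceId": INSTANCE_ID}],
)

# ... restart tileserver here ...

# Put the instance back into rotation and wait until it passes
# health checks and is serving traffic again.
elb.register_instances_with_load_balancer(
    LoadBalancerName=ELB_NAME,
    Instances=[{"InstanceId": INSTANCE_ID}],
)
elb.get_waiter("instance_in_service").wait(
    LoadBalancerName=ELB_NAME,
    Instances=[{"InstanceId": INSTANCE_ID}],
)
```

Whether OpsWorks would race this and re-register the instance itself is the open question above; that part would need to be verified.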
I think both mechanisms would be good to have. Rolling the deploy requires outside tooling, which is great for anything compatible with that, but I wouldn't be confident it covers 100% of the cases in which the service could be stopped. Handling SIGTERM internally is then a safety net for those (hopefully rare) cases where tileserver is stopped outside of a rolling deploy.
Just curious, what kinds of cases would those be?
Upon receiving SIGTERM, tileserver should:

- stop accepting new connections
- finish serving any in-flight requests
- then exit cleanly
This allows tileserver to shut down behind ELB connection draining or HAProxy without dropping any requests. If this is combined with staggering shutdowns / upgrades so that only part of the cluster is down at any one time, then no requests are lost.
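As a minimal sketch of that shutdown sequence (using a plain stdlib HTTP server for illustration, not tileserver's actual serving stack):

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")


server = ThreadingHTTPServer(("0.0.0.0", 8080), Handler)
server.daemon_threads = False  # with block_on_close (the default), server_close()
                               # waits for in-flight request threads to finish


def handle_sigterm(signum, frame):
    # shutdown() stops accepting new connections and makes serve_forever()
    # return. It must run in a separate thread: the signal handler runs in
    # the main thread, and shutdown() blocks until the serve loop exits,
    # so calling it directly here would deadlock.
    threading.Thread(target=server.shutdown).start()


signal.signal(signal.SIGTERM, handle_sigterm)
server.serve_forever()  # returns once shutdown() has been called
server.server_close()   # close the listening socket, drain in-flight requests
```

The same shape applies regardless of the server library: register a SIGTERM handler, stop accepting, drain, exit.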