Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Lambda warming on Tilegarden function #712

Merged
merged 3 commits into from
Feb 15, 2019

Conversation

KlaasH
Copy link
Contributor

@KlaasH KlaasH commented Feb 14, 2019

Overview

Since the Tilegarden Lambda functions are inside a VPC (which they need to be, since they talk to the database), their cold start times are bad. This adds the lambda-warmer package and configures the Lambda function's entrypoint to handle the type of invocations that trigger it.

I initially did something much more complicated (see branch), involving a warming endpoint that could be triggered from outside and would in turn trigger a lambda-warmer event. This was based on the fact that

  • to be effective, we'll need to have a decent number of workers warmed up and
  • the site doesn't get continuous usage, so keeping the Lambdas warm 24/7 would waste a lot of invocations.

I changed to this simpler strategy when I realized

  • the triggering endpoint would itself incur a cold start delay, so the time from sending the triggering request to the tiler being ready would be double the cold start time, which would be much too long
  • keeping a small fleet of Lambda functions warm is really not expensive. Using this calculator, with the 512MB memory size and assuming triggering every 15 minutes and the cumulative execution time of all the warming invocations per event adding up to 10 seconds (which I think is reasonable to conservative), it would cost 25 cents per month.

This adds a section to the deployment README about setting up the scheduled warming event. I'm not sure how to do it in Terraform, and though it doesn't look super complicated, it would create another circular dependency between the Terraform config and the resources produced by the Tilegarden deployment script. And resolving that seems somewhat complicated. So I'm adding instructions to do it manually and an issue to hopefully automate it later.

It also adds a section to the deployment README about manually setting permissions on the executor role, which is issue #710.

Demo

image

Notes

A note on warming interval: There are no guarantees or even stated norms for how long a Lambda instance will sit idle before getting destroyed. People have done experiments and come up with "usually about 45 minutes, but sometimes longer and sometimes shorter". Using a timeout of 15 minutes seems reasonable to me--conservative but not excessively so.

Testing Instructions

I don't think this can really be tested locally. At the moment, it has been pushed to staging via a test branch and is running there.

Resolves #701

@@ -46,6 +46,7 @@
},
"dependencies": {
"claudia-api-builder": "^4.1.0",
"lambda-warmer": "^1.0.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest release is 1.1.0.

Copy link
Contributor Author

@KlaasH KlaasH Feb 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird. I thought I had added that via yarn add just a few days ago. Will fix. Thanks for checking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. It was actually installing 1.1.0 already, since ^ only pins major version...

Copy link
Contributor

@jeancochrane jeancochrane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice solution here to a tricky problem. I tested by confirming that tiles continue to load properly on staging (first impression is that they feel faster than last time I looked at staging) and checking the CloudWatch logs to confirm that the warmer is running every 15 minutes or so.

It's definitely not great that the circular dependencies between Tilegarden and Claudia continue to deepen, but I think you did a good job encapsulating possible architecture improvements in future issues, and for now the extra documentation seems like it'll give us the context we need to move forward.


In the [CloudWatch Rules console](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#rules:),
add a scheduled event to generate warming invocations to prevent users from being subjected to cold starts.
See [issue #714](https://github.com/azavea/pfb-network-connectivity/issues/714).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you set up either the permissions or the rule in production yet? If not, we should make sure that the issue for the next production deploy captures this work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me realize that the resources already exist and having the permission attached before the code to use it is deployed won't do any harm, so I just added the permission. The CloudWatch rule should happen after the deploy, since the code that's up now won't know what to do with the events, so it would just generate errors. I made an issue for the deploy (#717) and added a note to it.

@KlaasH KlaasH mentioned this pull request Feb 15, 2019
@KlaasH KlaasH force-pushed the feature/kjh/lambda-warming branch from ef20aa2 to ea6b9b9 Compare February 15, 2019 17:16
Since 'lambda-warmer' invocations don't expect to be going through API
Builder (and API builder wouldn't know what to do with them), adds a
wrapper around the API Builder entrypoint to intercept and handle
'lambda-warmer' invocations.
There are a few pieces of Tilegarden that still need to be brought into
the Terraform fold. In the meantime, this documents how to do them by hand.
@KlaasH KlaasH force-pushed the feature/kjh/lambda-warming branch from ea6b9b9 to 36bc3d3 Compare February 15, 2019 17:43
@KlaasH
Copy link
Contributor Author

KlaasH commented Feb 15, 2019

I rebased this onto develop and pushed it to the test/ branch to get it back on staging.

Copy link
Contributor

@flibbertigibbet flibbertigibbet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; I don't have anything to add to this discussion.

@KlaasH KlaasH merged commit 1b86986 into develop Feb 15, 2019
@KlaasH KlaasH deleted the feature/kjh/lambda-warming branch February 15, 2019 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants