Metric Push API: Only alarm on 2xx errors #2655
Take the total count and subtract 4xx and 5xx.
How does it know whether to return 2xx or something else?
Good question! It doesn't (yet). I'm hoping to vary the response based on a header using the integration request template.
Value: !FindInMap [StageMap, !Ref Stage, ApiName]
Metrics:
- Id: mtotal2xx
Expression: 'mtotalcount - (m5xxcount + m4xxcount)'
It does seem sparse that we get so few metrics to look at compared with what load balancers have.
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-metrics-and-dimensions.html
Yeah I know! It's surprising we have to derive this ourselves.
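For reference, the derivation discussed in this thread could be sketched as a complete metric math alarm along these lines. This is a rough sketch, not the template from this PR: the logical ID `Derived2xxAlarm`, the threshold, the period and the missing-data handling are all assumptions, and only the `mtotal2xx` expression and the `ApiName` lookup come from the diff above. `Count`, `4XXError` and `5XXError` are the metrics API Gateway publishes in the `AWS/ApiGateway` namespace.

```yaml
# Illustrative sketch only; names and numbers marked "assumed" are not from the PR.
Resources:
  Derived2xxAlarm:                      # assumed logical ID
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Derived count of 2xx responses from the metric push API
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0                      # assumed placeholder, the real threshold is not shown
      EvaluationPeriods: 5              # assumed: five 1-minute periods
      DatapointsToAlarm: 3              # assumed: 3 breaching datapoints out of 5
      TreatMissingData: notBreaching    # assumed
      Metrics:
        # The derived metric: everything that was neither a 4xx nor a 5xx.
        - Id: mtotal2xx
          Expression: 'mtotalcount - (m5xxcount + m4xxcount)'
          Label: Derived 2xx responses
        # Inputs to the expression; only the expression itself returns data.
        - Id: mtotalcount
          ReturnData: false
          MetricStat:
            Period: 60
            Stat: Sum
            Metric:
              Namespace: AWS/ApiGateway
              MetricName: Count
              Dimensions:
                - Name: ApiName
                  Value: !FindInMap [StageMap, !Ref Stage, ApiName]
        - Id: m4xxcount
          ReturnData: false
          MetricStat:
            Period: 60
            Stat: Sum
            Metric:
              Namespace: AWS/ApiGateway
              MetricName: 4XXError
              Dimensions:
                - Name: ApiName
                  Value: !FindInMap [StageMap, !Ref Stage, ApiName]
        - Id: m5xxcount
          ReturnData: false
          MetricStat:
            Period: 60
            Stat: Sum
            Metric:
              Namespace: AWS/ApiGateway
              MetricName: 5XXError
              Dimensions:
                - Name: ApiName
                  Value: !FindInMap [StageMap, !Ref Stage, ApiName]
```

Exactly one entry in `Metrics` may return data for an alarm, which is why the three input metrics set `ReturnData: false`.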
Conditions:
  CreateProdMonitoring: !Equals [ !Ref Stage, PROD ]
I guess this is to update it to have the alarm in PROD and CODE, but we just don't trigger in CODE.
I personally think that makes it easier to test so I approve 👍 even with the extra clutter in the AWS console.
looks good to me, good idea
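One way the reviewer's reading (alarm present in both stages, but only triggering in PROD) could be wired up is sketched below. This is a guess at the shape rather than the PR's actual change: the diff only shows the `CreateProdMonitoring` condition, so the logical ID `Derived2xxAlarm`, the `AlarmTopic` reference and the use of `ActionsEnabled` are all assumptions, and the other required alarm properties are left out of this fragment.

```yaml
# Illustrative fragment only; how the PR really uses the condition is not shown.
Conditions:
  CreateProdMonitoring: !Equals [ !Ref Stage, PROD ]

Resources:
  Derived2xxAlarm:                      # assumed logical ID
    Type: AWS::CloudWatch::Alarm
    Properties:
      # The alarm is created in every stage, but its actions only fire in PROD.
      ActionsEnabled: !If [CreateProdMonitoring, true, false]
      AlarmActions:
        - !Ref AlarmTopic               # assumed SNS topic defined elsewhere
      # ComparisonOperator, EvaluationPeriods, Metrics etc. omitted here
```

With this shape the alarm still changes state in CODE, so it can be tested there without notifying anyone.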
What does this change?
I'm attempting to make the metric push API alarm less noisy. The strategy will be to return a non-2xx response code for clients we don't care about. In order to make this work we'll need to be able to alarm only on genuine 2xx responses.
How to test
I've deployed to CODE and made several requests to the metric push API. 3 returned a 2xx, 1 returned a 4xx. This can be seen on the following graph:
Our new alarm metric m2xxcount, which we'll alarm on, reports a sum of 3 for the same period (i.e. a total of 4, minus 1 4xx error):
The alarm went into an alarm state as expected after 3 datapoints within 5 minutes were above the threshold: