Alarm on single non-503 5xx error #645

Open
3 tasks
chris13524 opened this issue Apr 24, 2024 · 1 comment
Labels
accepted The issue has been accepted into the project

Comments

chris13524 (Member) commented Apr 24, 2024

Chains marked as guaranteed availability in SUPPORTED_CHAINS.md should never return a 5xx error, so we should alarm on every 5xx error for a guaranteed chain.

  • Adjust the ELB 5xx alarm to ignore 503 errors, because at the ELB we cannot differentiate between chains. Keep the ELB alarm for the other 5xx errors, since it catches more potential 5xx errors than a Prometheus alarm would.
  • Lower the ELB 5xx threshold to 1.
  • Add a new Prometheus metric for 503 errors, and a separate alarm that fires on a single 503 error for a guaranteed chain.
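
The split between the two alarms in the tasks above can be sketched as a small decision function. This is only an illustration of the proposed routing, not the project's implementation: the chain IDs, the `GUARANTEED_CHAINS` set, and the alarm names are hypothetical placeholders, and the real list of guaranteed chains lives in SUPPORTED_CHAINS.md.

```python
from typing import FrozenSet, Optional

# Hypothetical example set; the authoritative list is SUPPORTED_CHAINS.md.
GUARANTEED_CHAINS: FrozenSet[str] = frozenset({"eip155:1", "eip155:137"})


def alarm_for(
    status: int,
    chain_id: str,
    guaranteed: FrozenSet[str] = GUARANTEED_CHAINS,
) -> Optional[str]:
    """Return the name of the alarm a single response should trip, or None."""
    if not 500 <= status <= 599:
        # Not a server error: no alarm.
        return None
    if status != 503:
        # Any non-503 5xx trips the ELB alarm (threshold 1), whatever the chain.
        return "elb_5xx_non_503"
    if chain_id in guaranteed:
        # A 503 only alarms when the chain is guaranteed; this is the case
        # the separate Prometheus metric would cover.
        return "prometheus_503_guaranteed_chain"
    # A 503 on a non-guaranteed chain does not alarm.
    return None
```

Under these assumptions, a 500 on any chain trips the ELB alarm, while a 503 alarms only when the chain is in the guaranteed set.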

Slack conversation


linear bot commented Apr 24, 2024

arein added the "accepted" label Apr 24, 2024
chris13524 changed the title from "Alarm on single 5xx error that is under SLA" to "Alarm on single non-503 5xx error" Apr 24, 2024