vimc-4530: Add monitoring of buildkite agents #29

Merged 4 commits on Mar 3, 2022

Conversation

@r-ash (Contributor) commented Feb 17, 2022

This uses https://github.com/buildkite/buildkite-agent-metrics

For now this only checks that at least 10 agents are running (the alert fires when the count drops below 10), but it gives us the ability to check more detailed metrics; see https://github.com/buildkite/buildkite-agent-metrics#metrics

As part of this I have added a Docker build from the buildkite-agent-metrics Dockerfile, which lives at https://github.com/reside-ic/buildkite-agent-metrics. It looks like they don't build an official image (buildkite/buildkite-agent-metrics#51) and the other ones are out of date. I think I should probably set this to run on a schedule to keep it up to date, perhaps every month? I feel this is a bit fragile to the structure of that repo changing, so I would be keen to hear alternatives.

I've deployed this branch and seen it working; you can see an alert here: https://teams.microsoft.com/l/message/19:[email protected]/1645118508085?tenantId=2b897507-ee8c-4575-830b-4f8267c3d307&groupId=ba231111-1572-42ae-981e-c8bc7aa681ef&parentMessageId=1645118508085&teamName=DIDE%20-%20WP&channelName=reside-monitor&createdTime=1645118508085

        annotations:
          error: "buildkite-agent-metrics is down"
      - alert: AgentsDown
        expr: buildkite_total_total_agent_count{job="buildkite-metrics"} < 10
Review comment (Member):

Please make this queue-specific.
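For illustration, a queue-specific version of the rule might look like the sketch below. This is not taken from the merged change. It assumes the exporter's Prometheus backend publishes a per-queue gauge named buildkite_queues_total_agent_count with a queue label, which should be confirmed against the exporter's /metrics output. The group name, the alert name, the queue name "default" and the threshold of 10 are all placeholders.

```yaml
# Hypothetical queue-specific variant of the AgentsDown rule.
# Assumption: the exporter exposes buildkite_queues_total_agent_count
# with a "queue" label; verify against the deployed /metrics endpoint.
groups:
  - name: buildkite            # placeholder group name
    rules:
      - alert: AgentsDownDefaultQueue   # placeholder alert name
        # "default" and the threshold of 10 are placeholders
        expr: buildkite_queues_total_agent_count{job="buildkite-metrics", queue="default"} < 10
        annotations:
          error: "Fewer than 10 buildkite agents connected on the default queue"
```

One rule per queue keeps each threshold independent, at the cost of some repetition if there are many queues.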

@r-ash r-ash requested a review from richfitz March 3, 2022 09:42
@richfitz richfitz merged commit 8b78a55 into master Mar 3, 2022
2 participants