[consul] add maintenance metric to Consul catalog. #1267
This PR is a dependency to DataDog/integrations-core#1267.
Currently, if a node in Consul is put into maintenance state, it is reported to Datadog metrics as a node in critical state. This makes it hard to tell from the metric alone whether a node is actually failing or is undergoing a planned intervention. A user opened a PR against Datadog some time ago, but it was not merged because of Datadog code organization changes and was then forgotten (DataDog/dd-agent#2496). I'm pushing the change forward. I've tested it using dd-agent version 5.8.0. This PR also depends on DataDog/dd-agent#3708.
So, the error being reported is the following.
As this PR has a dependency on DataDog/dd-agent#3708, this error is kinda expected. So, either I hard code the int (4) in the What would be your suggestion, @masci?
Hi @diogokiss We discussed the opportunity to merge DataDog/dd-agent#3708 but we won't proceed for the following reasons:
Is there any chance you can use
This issue has been automatically marked as stale because it has not had activity in the last 30 days. Note that the issue will not be automatically closed, but this notification will remind us to investigate why there's been inactivity. Thank you for participating in the Datadog open source community.
Closing for lack of activity
What does this PR do?
It adds two metrics, consul.catalog.services_maintenance and consul.catalog.nodes_maintenance, to the Datadog Consul integration catalog.
The expected behavior is explained below.
All nodes in healthy states
One (1) node in faulty state
Consul is failing for the faulty node
Metrics per node are:
The faulty node is put in maintenance
Consul is failing for the faulty node and showing it also in maintenance
Metrics per node are:
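The scenarios above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `summarize_nodes` and the input shape (a list of dicts with `Node`, `CheckID` and `Status` keys, similar to what Consul's health endpoints return) are assumptions made for the example. It relies on the fact that Consul registers a critical check with the ID `_node_maintenance` when a node is put into maintenance mode, and it assumes a node in maintenance is still counted as critical in addition to being counted in maintenance.

```python
from collections import Counter

# Check ID that Consul registers on a node when maintenance mode is enabled.
NODE_MAINTENANCE_ID = "_node_maintenance"

# Ordering used to pick the worst status seen for a node.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


def summarize_nodes(checks):
    """Count nodes by worst check status, plus nodes in maintenance.

    Hypothetical helper for illustration only; `checks` is a list of
    dicts with "Node", "CheckID" and "Status" keys.
    """
    worst = {}
    in_maintenance = set()
    for check in checks:
        node = check["Node"]
        worst.setdefault(node, "passing")
        if check.get("CheckID") == NODE_MAINTENANCE_ID:
            in_maintenance.add(node)
        status = check.get("Status")
        if status not in SEVERITY:
            continue  # ignore statuses this sketch does not model
        worst[node] = max(worst[node], status, key=SEVERITY.__getitem__)

    counts = Counter(worst.values())
    return {
        "passing": counts["passing"],
        "warning": counts["warning"],
        "critical": counts["critical"],
        # Assumption: reported alongside "critical", not instead of it.
        "maintenance": len(in_maintenance),
    }


checks = [
    {"Node": "a", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "b", "CheckID": "serfHealth", "Status": "passing"},
    {"Node": "b", "CheckID": "_node_maintenance", "Status": "critical"},
    {"Node": "c", "CheckID": "serfHealth", "Status": "critical"},
]
summary = summarize_nodes(checks)
```

With a separate maintenance count, a dashboard can subtract nodes in maintenance from the critical total to see only unplanned failures.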
Motivation
What inspired you to submit this pull request?
Currently, if a node in Consul is put into maintenance state, it is reported
to Datadog metrics as a node in critical state. This makes it hard to tell
from the metric alone whether a node is actually failing or is undergoing a
planned intervention.
A user made a PR to Datadog some time ago, but it was not merged due to
Datadog code organization changes and got forgotten (DataDog/dd-agent#2496).
I'm pushing the change forward.
Testing Guidelines
An overview on testing is available in our contribution guidelines.
Versioning
manifest.json
datadog_checks/{integration}/__init__.py
CHANGELOG.md. Please use Unreleased as the date in the title for the new section.
New maintenance metrics for Consul integration documentation#2187
Additional Notes
This PR also depends on DataDog/dd-agent#3708.
The screenshots and tests were done using dd-agent version 5.8.0.
I've ported the code to the latest version available here, but didn't have the opportunity to actually test it against that version. Anyway, I added tests, which I believe cover the functionality. Please let me know how to improve it.