Use Telemetry Controller Under Test #307
Labels
area/monitoring
Monitoring (including availability monitoring and alerting) related
area/quality
Output qualification (tests, checks, scans, automation in general, etc.) related
kind/enhancement
Enhancement, improvement, extension
lifecycle/rotten
Nobody worked on this for 12 months (final aging stage)
topology/shoot
Affects Shoot clusters
What would you like to be added:
@dkistner has implemented a "telemetry controller" that keeps track of control plane availability. It would make sense to have it observe the state of clusters under reconciliation, maintenance, or test and report this metric, so that we can alert on poor shoot cluster control plane availability and eventually break the release/transport if KPIs are not met. Or should this be part of the specific Gardener tests instead?
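To make the "break the release/transport if KPIs are not met" idea concrete, here is a minimal sketch of such a gate. It does not use the telemetry controller's actual API or metric format: `ProbeSample`, `availability`, `meets_kpi`, and the 0.999 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ProbeSample:
    """One availability probe result (hypothetical shape, not the controller's API)."""
    timestamp: float  # seconds since epoch
    available: bool   # True if the control plane answered the probe


def availability(samples: Sequence[ProbeSample]) -> float:
    """Return the fraction of probes for which the control plane was available."""
    if not samples:
        raise ValueError("no probe samples collected")
    return sum(1 for s in samples if s.available) / len(samples)


def meets_kpi(samples: Sequence[ProbeSample], threshold: float = 0.999) -> bool:
    """Return True if observed availability reaches the assumed KPI threshold."""
    return availability(samples) >= threshold
```

A pipeline step could collect samples while a cluster is being reconciled or tested and fail the release/transport when `meets_kpi` returns `False`. Where the samples come from and which threshold applies would have to be decided as part of this issue.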
Why is this needed:
We sometimes miss issues here and lack recurring test results for this most important metric (it is the only metric relevant to our SLO).