Chained Alert Behaviour Changes #1079
Codecov Report
@@ Coverage Diff @@
## main #1079 +/- ##
============================================
- Coverage 67.72% 67.65% -0.08%
Complexity 105 105
============================================
Files 160 160
Lines 10343 10439 +96
Branches 1522 1545 +23
============================================
+ Hits 7005 7062 +57
- Misses 2672 2705 +33
- Partials 666 672 +6
else Alert.State.ERROR
Alert(
    startTime = Instant.now(),
    lastNotificationTime = Instant.now(),
can we use the same variable we defined in line 286?
ack
val queryBuilder = QueryBuilders.boolQuery()
    .must(QueryBuilders.termQuery(Alert.WORKFLOW_ID_FIELD, workflowId))
    .must(QueryBuilders.termQuery(Alert.MONITOR_ID_FIELD, ""))
Can there be a doc which has both the workflowId and monitorId fields set with their respective ids?
An alert created from a composite monitor trigger will have an empty monitorId.
The underlying alerts created in AUDIT state will have both monitorId and workflowId.
) {

constructor(
    workflow: Workflow,
    workflowRunResult: WorkflowRunResult,
    trigger: ChainedAlertTrigger,
    alertGeneratingMonitors: Set<String>,
-   monitorIdToAlertIdsMap: Map<String, Set<String>>
+   monitorIdToAlertIdsMap: Map<String, Set<String>>,
Can we make periodStart in the default constructor nullable and remove this constructor? There are a lot of similar fields between this one and the default one.
472bf28 to e1a7de7
.groupBy { it.triggerId }
foundAlerts.values.forEach { alerts ->
    if (alerts.size > 1) {
        logger.warn("Found multiple alerts for same trigger: $alerts")
should we try to resolve the older alert if this occurs?
this can be a follow up. But if this is not expected, we can ignore this unless we see issues for this.
startTime = Instant.now(),
lastNotificationTime = currentTime,
state = Alert.State.ACTIVE,
errorMessage = null, schemaVersion = -1,
Should the schema version be IndexUtils.alertIndexSchemaVersion instead of -1?
@@ -758,6 +845,37 @@ class AlertService(
return searchResponse
}

/**
* Searches for Alerts in the monitor's alertIndex.
Let's call out here that this will not retrieve audit alerts.
constructor(
    workflow: Workflow,
    workflowRunResult: WorkflowRunResult,
    trigger: ChainedAlertTrigger,
    alertGeneratingMonitors: Set<String>,
    monitorIdToAlertIdsMap: Map<String, Set<String>>
) :
    this(
        workflow,
        workflowRunResult,
        workflowRunResult.executionStartTime,
        workflowRunResult.executionEndTime,
        workflowRunResult.error,
        trigger,
        alertGeneratingMonitors,
        monitorIdToAlertIdsMap
    )
We don't need the constructor anymore?
the data class itself gives the default constructor
#1079 (comment)
    return workflowRunResult.copy(error = e)
}
try {
    monitorCtx.alertIndices!!.createOrUpdateAlertIndex(dataSources)
Do we need this since we do this on line 154?
It's probably not necessary, but I think this keeps the practice consistent with all other monitors, where we ensure the alertIndex is present before we fetch the current alert.
* chained alert behavior changes Signed-off-by: Surya Sashank Nistala <[email protected]> * address comments Signed-off-by: Surya Sashank Nistala <[email protected]> * update comment for search alerts method Signed-off-by: Surya Sashank Nistala <[email protected]> --------- Signed-off-by: Surya Sashank Nistala <[email protected]>
* chained alert behavior changes Signed-off-by: Surya Sashank Nistala <[email protected]> * address comments Signed-off-by: Surya Sashank Nistala <[email protected]> * update comment for search alerts method Signed-off-by: Surya Sashank Nistala <[email protected]> --------- Signed-off-by: Surya Sashank Nistala <[email protected]> (cherry picked from commit 0f2dec7) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/alerting/backport-2.9 2.9
# Navigate to the new working tree
pushd ../.worktrees/alerting/backport-2.9
# Create a new branch
git switch --create backport-1079-to-2.9
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0f2dec7cb660f232e05846a754d871246851e94e
# Push it to GitHub
git push --set-upstream origin backport-1079-to-2.9
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/alerting/backport-2.9
Then, create a pull request where the
* chained alert behavior changes * address comments * update comment for search alerts method --------- (cherry picked from commit 0f2dec7) Signed-off-by: Surya Sashank Nistala <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Context:
Workflow/composite monitors support triggers whose conditions are Boolean expressions that use the AND, OR, and NOT operators with monitors as variables. For example, the trigger condition M1 && M2 is matched if both monitors M1 and M2 create alerts (in AUDIT state) in the current execution. If the workflow trigger condition is matched, we create a chained alert.
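To illustrate how a condition such as M1 && M2 can be evaluated against the set of monitors that generated alerts in the current execution, here is a minimal Kotlin sketch. The types `Condition`, `MonitorRef`, `And`, `Or`, and `Not` are hypothetical names invented for this example; they are not the plugin's actual trigger parser or classes.

```kotlin
// Hypothetical expression tree for a chained alert trigger condition.
// Illustrative only; the plugin's real condition parsing may differ.
sealed interface Condition {
    // alertGeneratingMonitors: ids of monitors that created alerts in this execution.
    fun evaluate(alertGeneratingMonitors: Set<String>): Boolean
}

// A monitor used as a variable: true if that monitor generated an alert.
data class MonitorRef(val monitorId: String) : Condition {
    override fun evaluate(alertGeneratingMonitors: Set<String>): Boolean =
        monitorId in alertGeneratingMonitors
}

data class And(val left: Condition, val right: Condition) : Condition {
    override fun evaluate(alertGeneratingMonitors: Set<String>): Boolean =
        left.evaluate(alertGeneratingMonitors) && right.evaluate(alertGeneratingMonitors)
}

data class Or(val left: Condition, val right: Condition) : Condition {
    override fun evaluate(alertGeneratingMonitors: Set<String>): Boolean =
        left.evaluate(alertGeneratingMonitors) || right.evaluate(alertGeneratingMonitors)
}

data class Not(val operand: Condition) : Condition {
    override fun evaluate(alertGeneratingMonitors: Set<String>): Boolean =
        !operand.evaluate(alertGeneratingMonitors)
}
```

With these definitions, `And(MonitorRef("M1"), MonitorRef("M2")).evaluate(setOf("M1", "M2"))` is true, while the same condition evaluated against `setOf("M1")` is false because M2 did not generate an alert.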
Current behavior
Alerts are created in ACTIVE state.
When alerts are acknowledged, they move to ACKNOWLEDGED state.
New behavior introduced with this PR
- If no ACTIVE/ACKNOWLEDGED state alert already exists for the composite monitor trigger, we create a new ACTIVE state alert.
- If an ACTIVE/ACKNOWLEDGED state alert already exists, we update that alert’s lastNotificationTime and do NOT create a new alert. The pre-existing ACTIVE state alert remains in ACTIVE state.
- When the composite monitor trigger condition is NOT matched: if an ACTIVE/ACKNOWLEDGED state alert already exists, we mark it as COMPLETED.
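The rules above can be sketched as a small state-transition function. This is a simplified illustration, not the code in this PR: `State`, `ChainedAlert`, and `nextAlert` are hypothetical names, and the sketch assumes an ACKNOWLEDGED alert keeps its state when only lastNotificationTime is refreshed.

```kotlin
import java.time.Instant

// Hypothetical, simplified model of a chained alert (not the plugin's Alert class).
enum class State { ACTIVE, ACKNOWLEDGED, COMPLETED }

data class ChainedAlert(
    val triggerId: String,
    val state: State,
    val startTime: Instant,
    val lastNotificationTime: Instant
)

// Returns the alert to persist after one workflow execution, or null if there is nothing to write.
fun nextAlert(existing: ChainedAlert?, triggerId: String, triggered: Boolean, now: Instant): ChainedAlert? {
    // Only ACTIVE or ACKNOWLEDGED alerts count as pre-existing alerts.
    val open = existing?.takeIf { it.state == State.ACTIVE || it.state == State.ACKNOWLEDGED }
    if (open == null) {
        // No open alert: create a new ACTIVE alert only when the trigger condition matched.
        return if (triggered) ChainedAlert(triggerId, State.ACTIVE, now, now) else null
    }
    return if (triggered) {
        // Condition matched again: refresh lastNotificationTime, do not create a new alert.
        // Assumption: the existing state (ACTIVE or ACKNOWLEDGED) is left unchanged.
        open.copy(lastNotificationTime = now)
    } else {
        // Condition no longer matched: mark the open alert as COMPLETED.
        open.copy(state = State.COMPLETED)
    }
}
```

Using one timestamp `now` for both startTime and lastNotificationTime on creation follows the review suggestion to reuse a single time variable rather than calling Instant.now() twice.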