Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chained Alert Behaviour Changes #1079

Merged
merged 3 commits into from
Aug 25, 2023
Merged

Conversation

eirsep
Copy link
Member

@eirsep eirsep commented Aug 4, 2023

Context:

Workflow/Composite monitor supports triggers whose conditions are configured as Boolean conditions with AND, OR, NOT
Operators and monitors as variables. For ex. Trigger condition M1 && M2 means trigger condition is matched if both monitors M1 and M2 create alerts (in AUDIT state) in the current execution. If workflow trigger condition is matched we create Chained Alert.

Current behavior

Alerts are created in ACTIVE state.
When alert are acknowledged they move to ACKNOWLEDGED state.

New behavior introduced with this PR

- If no ACTIVE/ACKNOWLEDGEDstate alert already exists for composite monitor trigger, we create a new ACTIVE state alert.
- If ACTIVE/ACKNOWLEDGED state alert already exists, we update that alert’s lastNotificationTime itself. But we do NOT create a new Alert. The pre-existing ACTIVE state alert will remain in ACTIVE state.
- When composite monitor trigger condition is NOT matched:
If ACTIVE/ACKNOWLEDGED state alert already exists, we would mark it as COMPLETED.

@codecov
Copy link

codecov bot commented Aug 4, 2023

Codecov Report

Merging #1079 (e1a7de7) into main (778e7ce) will decrease coverage by 0.08%.
Report is 1 commits behind head on main.
The diff coverage is 68.62%.

❗ Current head e1a7de7 differs from pull request most recent head b41a77e. Consider uploading reports for the commit b41a77e to get more accurate results

@@             Coverage Diff              @@
##               main    #1079      +/-   ##
============================================
- Coverage     67.72%   67.65%   -0.08%     
  Complexity      105      105              
============================================
  Files           160      160              
  Lines         10343    10439      +96     
  Branches       1522     1545      +23     
============================================
+ Hits           7005     7062      +57     
- Misses         2672     2705      +33     
- Partials        666      672       +6     
Files Changed Coverage Δ
...org/opensearch/alerting/workflow/WorkflowRunner.kt 1.13% <0.00%> (-0.03%) ⬇️
...n/kotlin/org/opensearch/alerting/TriggerService.kt 81.57% <50.00%> (-0.86%) ⬇️
...ting/transport/TransportGetWorkflowAlertsAction.kt 67.42% <66.66%> (-0.55%) ⬇️
...earch/alerting/workflow/CompositeWorkflowRunner.kt 72.01% <67.24%> (+1.24%) ⬆️
...ain/kotlin/org/opensearch/alerting/AlertService.kt 79.56% <71.95%> (-2.02%) ⬇️
...ting/script/ChainedAlertTriggerExecutionContext.kt 58.82% <100.00%> (-14.26%) ⬇️

... and 3 files with indirect coverage changes

else Alert.State.ERROR
Alert(
startTime = Instant.now(),
lastNotificationTime = Instant.now(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use the same variable we defined in line 286?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack


val queryBuilder = QueryBuilders.boolQuery()
.must(QueryBuilders.termQuery(Alert.WORKFLOW_ID_FIELD, workflowId))
.must(QueryBuilders.termQuery(Alert.MONITOR_ID_FIELD, ""))
Copy link
Collaborator

@sbcd90 sbcd90 Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can there be a doc which has both workflowId & monitorId field set with respective ids?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alert created from composite monitor trigger will have only empty monitorId.

The underlying alerts created in AUDIT state will have both monitorId and worklfowId.

) {

constructor(
workflow: Workflow,
workflowRunResult: WorkflowRunResult,
trigger: ChainedAlertTrigger,
alertGeneratingMonitors: Set<String>,
monitorIdToAlertIdsMap: Map<String, Set<String>>
monitorIdToAlertIdsMap: Map<String, Set<String>>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make periodStart in the default constructor nullable & remove this constructor? Lot of similar fields between this one & the default one.

@eirsep eirsep force-pushed the currentmain branch 2 times, most recently from 472bf28 to e1a7de7 Compare August 7, 2023 22:27
.groupBy { it.triggerId }
foundAlerts.values.forEach { alerts ->
if (alerts.size > 1) {
logger.warn("Found multiple alerts for same trigger: $alerts")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we try to resolve the older alert if this occurs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be a follow up. But if this is not expected, we can ignore this unless we see issues for this.

startTime = Instant.now(),
lastNotificationTime = currentTime,
state = Alert.State.ACTIVE,
errorMessage = null, schemaVersion = -1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should schema version be: IndexUtils.alertIndexSchemaVersion instead of -1?

@@ -758,6 +845,37 @@ class AlertService(
return searchResponse
}

/**
* Searches for Alerts in the monitor's alertIndex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets call out here that this will not retrieve audit alerts.

Comment on lines -24 to -40
constructor(
workflow: Workflow,
workflowRunResult: WorkflowRunResult,
trigger: ChainedAlertTrigger,
alertGeneratingMonitors: Set<String>,
monitorIdToAlertIdsMap: Map<String, Set<String>>
) :
this(
workflow,
workflowRunResult,
workflowRunResult.executionStartTime,
workflowRunResult.executionEndTime,
workflowRunResult.error,
trigger,
alertGeneratingMonitors,
monitorIdToAlertIdsMap
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont need the constructor anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the data class itself gives the default constructor
#1079 (comment)

return workflowRunResult.copy(error = e)
}
try {
monitorCtx.alertIndices!!.createOrUpdateAlertIndex(dataSources)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this since we do this on line 154?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessary prolly, but I think this would be consistent practice to all other monitors where we ensure alertIndex is present before we fetch current alert.

eirsep added a commit that referenced this pull request Aug 16, 2023
* chained alert behavior changes

Signed-off-by: Surya Sashank Nistala <[email protected]>

* address comments

Signed-off-by: Surya Sashank Nistala <[email protected]>

* update comment for search alerts method

Signed-off-by: Surya Sashank Nistala <[email protected]>

---------

Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
@eirsep eirsep merged commit 0f2dec7 into opensearch-project:main Aug 25, 2023
13 of 14 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 25, 2023
* chained alert behavior changes

Signed-off-by: Surya Sashank Nistala <[email protected]>

* address comments

Signed-off-by: Surya Sashank Nistala <[email protected]>

* update comment for search alerts method

Signed-off-by: Surya Sashank Nistala <[email protected]>

---------

Signed-off-by: Surya Sashank Nistala <[email protected]>
(cherry picked from commit 0f2dec7)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.9 failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/alerting/backport-2.9 2.9
# Navigate to the new working tree
pushd ../.worktrees/alerting/backport-2.9
# Create a new branch
git switch --create backport-1079-to-2.9
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0f2dec7cb660f232e05846a754d871246851e94e
# Push it to GitHub
git push --set-upstream origin backport-1079-to-2.9
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/alerting/backport-2.9

Then, create a pull request where the base branch is 2.9 and the compare/head branch is backport-1079-to-2.9.

@eirsep eirsep removed backport 2.9 backports PRs to 2.9 failed backport labels Aug 25, 2023
eirsep pushed a commit that referenced this pull request Sep 7, 2023
* chained alert behavior changes



* address comments



* update comment for search alerts method



---------


(cherry picked from commit 0f2dec7)

Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants