[Request] Documentation for Manual Run of Security rules #5264

Closed
15 tasks done
Tracked by #184500 ...
nkhristinin opened this issue May 28, 2024 · 10 comments
Labels
Docset: ESS, Docset: Serverless, Effort: Large, Feature: Rules, Priority: High, Team: Detection Engine, v8.16.0

Comments

@nkhristinin
Contributor

nkhristinin commented May 28, 2024

Description

What: We are introducing manual rule runs for the Security solution.
Why: Users will be able to run a rule over a past date range that they specify.

Use cases we cover:

  1. From the rule details page, users start a manual rule run and specify a time range.
  2. They can go to the Executions tab and see the backfill group, with information about how many tasks are scheduled, in progress, or pending.
  3. They can stop the whole group for a manual run.
  4. They can see the results of rule executions in the execution log and filter by run type.

There is an additional issue for UX copy: #5265

Some technical background on how it works, which should help with better naming:

Let's say we have a rule with a 5m interval.
Rule execution log - represents the result of a single rule execution. It can be running/succeeded/failed.
When the user starts a manual rule run (14:00-16:00), it creates a backfill group (we probably need to come up with better naming).
Backfill group - contains the start and end of the date range, the status of the whole group, and rule info.
A backfill group also has scheduled entries - a list of tasks for potential rule executions.
When the task manager is free, it starts to schedule those tasks, which execute the rule, and the result of each execution then appears in the rule execution log.
Scheduled entry - can be pending/running/error/complete.
The whole backfill group can also be pending/running/error, depending on the status of its scheduled entries.
After all scheduled entries are complete, the backfill group is deleted.
We can delete/stop only the whole backfill group, not individual scheduled entries.
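
To make the naming discussion concrete, here is a minimal sketch of the model described above. The class names, the one-entry-per-interval split, and the rule used to derive the group status are all assumptions for illustration; this is not the Kibana implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class ScheduledEntry:
    """One potential rule execution covering the window [start, end)."""
    start: datetime
    end: datetime
    status: str = "pending"  # pending / running / error / complete


@dataclass
class BackfillGroup:
    """Hypothetical model of the backfill group created by a manual rule run."""
    start: datetime
    end: datetime
    interval: timedelta
    entries: List[ScheduledEntry] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: one scheduled entry per rule interval in the range.
        cursor = self.start
        while cursor < self.end:
            window_end = min(cursor + self.interval, self.end)
            self.entries.append(ScheduledEntry(cursor, window_end))
            cursor = window_end

    def status(self) -> str:
        # Assumption: the group status is derived from its entries like this.
        statuses = {entry.status for entry in self.entries}
        if "error" in statuses:
            return "error"
        if statuses == {"pending"}:
            return "pending"
        if statuses == {"complete"}:
            return "complete"  # per the description, the group is deleted here
        return "running"


# A manual rule run from 14:00 to 16:00 for a rule with a 5m interval.
group = BackfillGroup(
    start=datetime(2024, 5, 28, 14, 0),
    end=datetime(2024, 5, 28, 16, 0),
    interval=timedelta(minutes=5),
)
print(len(group.entries), group.status())
```

Under these assumptions, the 14:00-16:00 run yields 24 scheduled entries, all initially pending.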

Background & resources

Which documentation set does this change impact?

ESS and serverless

ESS release

8.15 8.16

Serverless release

Monday, July 29, 2024 Tuesday, October 14, 2024

Feature differences

None

API docs impact

Prerequisites, privileges, feature flags

None

Doc plan

  • Update ESS and Serverless UI docs:
    • Base functionality
    • Limitations and best practices: What we discussed in Slack:
      • Manual runs execute with low priority and limited concurrency, so they might take longer to complete, especially if there are many rules to backfill. Scheduled rules have higher priority and a higher concurrency limit.
      • Rule parameters are fixed at the time a manual rule run is scheduled, so any later updates to the rule are not reflected in the manual rule run. For example:
        1. We create a rule - Name - "Rule 1", interval - 5 min, query: "*"
        2. We start a manual rule run, and let's say it's a long process that takes hours
        3. Immediately after that, we change the rule name to "Rule 2", the interval to 10 hours, and the query to "host.name:2"
        4. All rule executions that are still planned as part of the manual rule run will use the original parameters - Name - "Rule 1", interval - 5 min, query: "*"
      • Users are responsible for ensuring the correct order of backfill jobs (manual rule runs) if dependencies exist between them. Incorrect scheduling can lead to inconsistent data filling.
    • The kibana.alert.intended_timestamp field has been added to the alert schema. This field appears in documents of alerts that were generated by manual rule runs. It conveys the estimated time range of when the alert was created.
  • Update rule API docs:
  • Add known issues (noted in the comments) to the 8.16 release notes - Noted at 8.16.0 Release notes #5941
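
The fixed-parameters limitation in the doc plan above can be illustrated with a small sketch. The `schedule_manual_run` helper and the dictionary layout are hypothetical; the only point is that the run keeps a copy of the rule as it was at scheduling time.

```python
import copy


def schedule_manual_run(rule: dict, start: str, end: str) -> dict:
    """Hypothetical helper: snapshot the rule when the manual run is scheduled."""
    # Deep copy so that later edits to `rule` do not affect the scheduled run.
    return {"start": start, "end": end, "rule_snapshot": copy.deepcopy(rule)}


rule = {"name": "Rule 1", "interval": "5m", "query": "*"}
run = schedule_manual_run(rule, "2024-07-01T14:00:00Z", "2024-07-01T16:00:00Z")

# The rule is edited right after the manual run was scheduled.
rule.update({"name": "Rule 2", "interval": "10h", "query": "host.name:2"})

# Remaining executions of the manual run still see the original parameters.
print(run["rule_snapshot"])
```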

Doc updates

NOTE: The feature is being released in Tech Preview in 8.15, so will need to use that label/admonition for ESS and Serverless docs.

Execution results: Make the following updates:

  • In the intro para, add that users can now see how a rule was executed. Now, rules can be executed manually or auto-executed on a (preset?) schedule.
  • Update rule-execution-logs.png - new image should show the updated Execution log table and the new Manual runs table.
  • Add two bullets to the list of controls that allow users to filter what's in the Execution log table. The new items are:
    • Run type - New filter that allows users to display the execution type of a run. The two options are Manual and Scheduled.
    • Show source event time range: Toggling this setting on displays the Source event time range column. By default, this setting is toggled off.

Manage detection rules:

  • Add an item to the list of actions that users can take from the Rules page. A good place might be after the "manage rules" list item.
  • Add a new section for manually running rules. The new section should include the following details:
    • There are 3 different places where users can manually run a rule:
      • The rule's details page (go to the actions menu in the top right and select Manual run)
      • On the Rules page, from the actions menu for an individual rule
      • On the Rules page, from the Bulk actions menu (up to 100 rules can be selected for a bulk run)
    • When selecting a time range to manually run a rule, you can only choose a past date and time.
      • If you don't change the default time range selection, the rule will run today and the start time of the run will be 3 hours in the past.
    • Rules must be enabled for you to manually run them.
    • To stop a manual run, go to the Manual runs table and click Stop run in the Actions column.
      [Screenshot: Stop rule run]
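
As a rough sketch of the time range rules listed above (past dates only, default start 3 hours back), the following helpers show one way the checks could look. Both function names are hypothetical, and treating "now" as the default end time is an assumption rather than something confirmed in this issue.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def default_manual_run_range(now: datetime) -> Tuple[datetime, datetime]:
    """Assumed default: the range starts 3 hours in the past and ends now."""
    return now - timedelta(hours=3), now


def validate_manual_run_range(start: datetime, end: datetime, now: datetime) -> None:
    """Reject ranges that are empty, reversed, or extend into the future."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    if end > now:
        raise ValueError("manual runs can only cover past dates and times")


now = datetime(2024, 7, 21, 13, 49, tzinfo=timezone.utc)
start, end = default_manual_run_range(now)
validate_manual_run_range(start, end, now)  # the default range is valid
```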
nastasha-solomon self-assigned this May 29, 2024
nastasha-solomon added the Team: Detection Engine, Priority: High, Effort: Large, Docset: Serverless, Docset: ESS, v8.15.0, and Feature: Rules labels Jun 4, 2024
e40pud added a commit to elastic/kibana that referenced this issue Jun 11, 2024
## Summary

Main ticket elastic/security-team#9327

With these changes, we introduce a way to schedule a rule run manually.
There are two ways to do that in the UI:
1. Via "All actions" button on rules management page
2. Via "All actions" button on rule's details page

**NOTES**:
1. To be able to test these changes, you need to enable feature flag
`manualRuleRunEnabled` first
2. Bulk action will be part of a separate ticket/PR

**RECORDING**:


https://github.com/elastic/kibana/assets/2700761/d49bad53-026e-49c2-aeea-481203260b23

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
    - [x] elastic/security-docs#5264
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [x] [Cypress RM (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6263)
- [ ] [Cypress DE (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6280)
- [x] [Integration Rule Gaps (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6257)

---------

Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Ryland Herrick <[email protected]>
@nkhristinin
Contributor Author

We should mention manual rule run limitations to users in the docs.

e40pud added a commit to elastic/kibana that referenced this issue Jun 27, 2024
…run (#9653) (#186293)

Main ticket elastic/security-team#9653

With these changes, we introduce a new bulk action that allows scheduling
a backfill for multiple rules.

**NOTES**:
- To be able to test these changes, you need to enable feature flag
`manualRuleRunEnabled` first

**RECORDING**:


https://github.com/elastic/kibana/assets/2700761/742083e7-090e-4805-8c3d-abcba04554b1

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
    - [x] elastic/security-docs#5264
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [x] [Cypress RM (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6410)
- [x] [Cypress DE (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6411)
- [x] [Integration Rule Gaps (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6412)
- [x] [Integration Bulk Actions (100 ESS & 100
Serverless)](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6413)

---------

Co-authored-by: Kibana Machine <[email protected]>
@e40pud
Contributor

e40pud commented Jun 28, 2024

API docs impact

We need to update the following sections:

  1. Bulk action > Request body: add run as a new possible action value
  2. Bulk action > Request body: add a new property:

| Name | Type | Description | Required |
|------|------|-------------|----------|
| run | BulkManualRuleRun[] | Object that describes applying a manual rule run action. | No. Yes, if action is run. |

  3. We should add a new type description, similar to the BulkDuplicateAction object and BulkEditAction object:

BulkManualRuleRun

  • start_date field: (String, Required). Defines the start date of the manual rule run.
  • end_date field: (String, Optional). Defines the end date of the manual rule run.

  4. Response payload: add run to the actions list. Also, we should mention that we use attributes.results.updated to return the rule objects that were scheduled for a manual rule run during the run action execution.
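
For the API docs, a request body for the new action might be assembled as in the sketch below. The `build_bulk_manual_run_body` helper is hypothetical. Selecting rules through an `ids` array and passing `run` as a single object (the type above is written as `BulkManualRuleRun[]`) are assumptions that should be checked against the final API schema.

```python
import json
from typing import List, Optional


def build_bulk_manual_run_body(
    rule_ids: List[str], start_date: str, end_date: Optional[str] = None
) -> dict:
    """Hypothetical helper that builds a bulk action body with action = run."""
    run = {"start_date": start_date}  # required
    if end_date is not None:
        run["end_date"] = end_date  # optional
    # Assumption: rules are selected with an `ids` array.
    return {"action": "run", "ids": list(rule_ids), "run": run}


body = build_bulk_manual_run_body(
    ["rule-id-1", "rule-id-2"],  # placeholder rule IDs
    start_date="2024-07-01T14:00:00.000Z",
    end_date="2024-07-01T16:00:00.000Z",
)
print(json.dumps(body, indent=2))
```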

@nkhristinin
Contributor Author

nkhristinin commented Jul 2, 2024

Known issue for threshold rules:
If a manual rule run covers a date range that was already covered by scheduled rule executions, it can produce duplicate alerts.

Use case:

Let's say I have 4 events: 13:00, 14:00, 15:00, 16:00
Threshold rule - with a 5-minute interval, 10 hours of lookback time, and threshold = 2
Normal rule execution - gives me 2 alerts (14:00, 16:00)
If I do a manual rule run from 12:00-16:10,
I will have 2 alerts generated at 14:00 and 16:00.
The user would probably expect 0 alerts from the manual rule run, as the range was already covered by scheduled rule execution.
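
Since there is no workaround, one thing the docs could suggest is checking for existing alerts in the range before starting a manual run. The sketch below uses the times from the use case above; the helper name and the calendar date are made up for illustration, and this is not how the detection engine deduplicates alerts.

```python
from datetime import datetime
from typing import List


def alerts_at_risk_of_duplication(
    existing_alert_times: List[datetime],
    manual_start: datetime,
    manual_end: datetime,
) -> List[datetime]:
    """Return existing alert times that fall inside the manual run range."""
    return [t for t in existing_alert_times if manual_start <= t <= manual_end]


day = (2024, 7, 2)  # arbitrary date, only the times come from the use case
scheduled_alerts = [datetime(*day, 14, 0), datetime(*day, 16, 0)]
at_risk = alerts_at_risk_of_duplication(
    scheduled_alerts,
    manual_start=datetime(*day, 12, 0),
    manual_end=datetime(*day, 16, 10),
)
print(len(at_risk))
```

Both alerts from the scheduled executions fall inside the 12:00-16:10 range, which matches the two duplicates described in the use case.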

@nastasha-solomon nastasha-solomon changed the title [Request] Documentation for Manul Run of Security rules [Request] Documentation for Manual Run of Security rules Jul 2, 2024
@nkhristinin
Contributor Author

Known issue 2:
The suppression count for a custom query rule can be updated incorrectly
https://github.com/elastic/security-team/issues/9870

@nastasha-solomon
Contributor

General notes from today's feature sync:

  • @nastasha-solomon will aim to have docs done in time for the Monday, July 29, 2024 Serverless release.
  • @nkhristinin will check if it's possible to only enable the feature flag for this feature in ESS.
  • @nkhristinin or @e40pud to provide a test env for docs testing and screenshots either this week, or after Nastasha is back from PTO (which is the week of July 15)

@nkhristinin
Contributor Author

nkhristinin commented Jul 29, 2024

Trying to rewrite the known issues:

  1. A manual rule run for a threshold rule can produce duplicate alerts if the date range was already covered by regular rule execution.

No workaround

  2. A manual rule run for a custom query rule with suppression can produce a wrong (bigger) doc count.

No workaround

@nastasha-solomon
Contributor

Holding off on merging the ESS and Serverless PRs until I learn more about the new gap fill functionality and how it affects the current manual run docs. Should know more by Wednesday Sep 25.

@nastasha-solomon
Contributor

This feature is being released in beta in 8.16. I'll need to update the ESS and Serverless docs to make sure they show the beta label, and not the technical preview label.

@nastasha-solomon
Contributor

nastasha-solomon commented Oct 14, 2024

The feature flag that enabled the following core manual run functionality in Serverless was merged last week via elastic/kibana#193833. Two new alert fields are being introduced as part of the manual run feature and are being released on slightly staggered timelines:

  • kibana.alert.intended_timestamp: The PR that adds this field was merged before the feature flag was removed, so the field is included in this week's Serverless release.
  • kibana.alert.rule.execution_type: The PR that adds this field was merged today, so it will be included in next week's Serverless release.

Action items for me:
