Data Pruning for Anomaly Detection #1193

Open

aayush-se wants to merge 5 commits into main from anomaly-detection/data-pruning

Conversation

@aayush-se (Member) commented Sep 24, 2024

  • Create Celery task to delete data that is over 28 days old and update matrix profiles for the remaining points (a rough sketch follows after this list)
  • Update DbDynamicAlert to include data_purge_flag and queued_at columns to reflect Celery task status
  • Write unit tests for new accessor methods and cleanup task
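
As a rough sketch only: the task described in the first bullet could look roughly like the following. DbDynamicAlert, data_purge_flag, queued_at, and the status strings come from this PR; DbDynamicAlertTimeSeries, the task name, and the accessor details are assumptions for illustration, not the PR's actual code.

# Illustrative sketch of the cleanup task; DbDynamicAlertTimeSeries, its column
# names, and the matrix-profile recompute step are assumed, not taken from the PR.
from datetime import datetime, timedelta, timezone

from celery import shared_task
from sqlalchemy import delete, update

from seer.db import DbDynamicAlert, DbDynamicAlertTimeSeries, Session  # assumed imports


@shared_task
def cleanup_timeseries(alert_id: int) -> None:
    """Drop points older than 28 days and refresh the stored matrix profile."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=28)
    with Session() as session:
        # Mark the alert as being processed so a concurrent run can skip it.
        session.execute(
            update(DbDynamicAlert)
            .where(DbDynamicAlert.id == alert_id)
            .values(data_purge_flag="processing")
        )
        # Delete the stale timeseries points.
        session.execute(
            delete(DbDynamicAlertTimeSeries).where(
                DbDynamicAlertTimeSeries.dynamic_alert_id == alert_id,
                DbDynamicAlertTimeSeries.timestamp < cutoff,
            )
        )
        # Recompute the matrix profile over the remaining points (omitted here).
        # Reset the flag so the alert can be queued again on the next beat run.
        session.execute(
            update(DbDynamicAlert)
            .where(DbDynamicAlert.id == alert_id)
            .values(data_purge_flag="not_queued", queued_at=None)
        )
        session.commit()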

@aayush-se aayush-se requested a review from a team September 24, 2024 18:09
@aayush-se aayush-se changed the title Anomaly detection/data pruning Data Pruning for Anomaly Detection Sep 24, 2024
@corps (Member) commented Sep 24, 2024

Before we turn this on in production with ENABLE_CELERY_WORKERs, let's get #1192 in and move this beat task behind a conditional for anomaly detection.
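
For illustration only, gating the beat entry could look something like the sketch below. The flag, the Celery app location, and the task path are assumptions here, not what #1192 actually adds.

# Illustrative only: register the cleanup beat task only when anomaly detection
# is enabled. The flag name, app location, and task path are assumptions.
import os

from celery.schedules import crontab

from seer.celery_app import celery_app  # assumed location of the Celery app

if os.environ.get("ANOMALY_DETECTION_ENABLED") == "1":  # placeholder for the real flag
    celery_app.conf.beat_schedule["cleanup-old-timeseries"] = {
        "task": "seer.anomaly_detection.tasks.cleanup_timeseries",
        "schedule": crontab(minute=0, hour="*"),  # hourly, purely for illustration
    }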

src/migrations/versions/480ca2916d86_migration.py (outdated)
src/migrations/versions/560f663c1cfd_migration.py (outdated)
src/seer/anomaly_detection/accessors.py
src/seer/anomaly_detection/models/external.py (outdated)
src/seer/anomaly_detection/anomaly_detection.py (outdated)
src/seer/anomaly_detection/models/external.py (outdated)
@ram-senth ram-senth force-pushed the anomaly-detection/data-pruning branch 3 times, most recently from dcb45bb to d36cbde Compare September 25, 2024 22:17
@ram-senth (Member) commented:

We also need some metrics published from the Celery task so that we can keep track of:

  • number of cleanups done in a time window
  • time taken for each cleanup, so we can monitor p90 or p99
  • count of times a cleanup leaves the alert with less than 7 days of data
  • number of times the cleanup task encountered an error

These are in addition to the regular queue-related metrics like queue size, time spent waiting, etc. We should check with Jenn if we need to do anything specific for those to work. (A rough sketch of where these could be emitted follows below.)
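
To make the list above concrete, here is a hedged sketch of where each metric could be emitted inside the task. record_metric and run_cleanup are hypothetical stand-ins, not existing helpers in this repo, and the metric names are made up for illustration.

# Hypothetical sketch only: record_metric stands in for whatever metrics backend
# is chosen, and run_cleanup stands in for the real cleanup logic.
import time


def record_metric(name: str, value: float, tags: dict | None = None) -> None:
    """Stand-in for a real metrics client (statsd, Datadog, etc.)."""


def run_cleanup(alert_id: int) -> int:
    """Stand-in for the real cleanup; assumed to return days of data remaining."""
    return 28


def cleanup_with_metrics(alert_id: int) -> None:
    start = time.monotonic()
    try:
        remaining_days = run_cleanup(alert_id)
        record_metric("cleanup.completed", 1, tags={"alert_id": alert_id})
        if remaining_days < 7:
            # Count cleanups that leave the alert with less than 7 days of data.
            record_metric("cleanup.short_history", 1, tags={"alert_id": alert_id})
    except Exception:
        record_metric("cleanup.errors", 1, tags={"alert_id": alert_id})
        raise
    finally:
        # Duration per cleanup, aggregated later into p90/p99.
        record_metric("cleanup.duration_ms", (time.monotonic() - start) * 1000.0)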

@aayush-se (Member, Author) replied:

I did include Sentry tracing for the cleanup task, so I think everything aside from the third point (cleanups that leave less than 7 days of data) should be viewable. I'm unsure about the other queue-related metrics, though, since Sentry no longer supports direct metrics (https://docs.sentry.io/product/explore/metrics/).

src/seer/db.py (outdated):
@@ -58,6 +59,8 @@ class Base(DeclarativeBase):
migrate = Migrate(directory="src/migrations")
Session = sessionmaker(autoflush=False, expire_on_commit=False)

TaskStatus = Literal["not_queued", "processing", "queued"]

Review comment (Member):

We can convert this to a string enum and use it in both the business logic and the data layer.
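
For example, a minimal sketch of that suggestion, with member names taken from the TaskStatus literal above; how it gets wired into the model column is omitted and would depend on the migration.

# Sketch of the suggested string enum shared by business logic and the data layer.
import enum


class TaskStatus(str, enum.Enum):
    NOT_QUEUED = "not_queued"
    PROCESSING = "processing"
    QUEUED = "queued"


# Data layer: the existing column keeps storing the plain string value, e.g.
#   alert.data_purge_flag = TaskStatus.PROCESSING.value
# Business logic: compare against the enum instead of raw string literals, e.g.
#   if alert.data_purge_flag == TaskStatus.QUEUED.value: ...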

src/seer/anomaly_detection/models/external.py (outdated)
@ram-senth (Member) replied:

Wondering if we should add the alert id as a tag so that we can look at metrics at the alert level. Let's ask Zach and Jenn about the Celery metrics in the next standup. The rest looks good.
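
Attaching the alert id to the existing Sentry tracing could be as small as the sketch below; the span op and description are illustrative, not what the PR currently does.

# Illustrative: tag the cleanup task's Sentry events and transactions with the
# alert id so traces can be filtered per alert. Span names are made up here.
import sentry_sdk


def traced_cleanup(alert_id: int) -> None:
    sentry_sdk.set_tag("alert_id", str(alert_id))
    with sentry_sdk.start_span(op="anomaly_detection.cleanup", description=f"alert {alert_id}"):
        ...  # existing cleanup work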
