feat(crons): Record historic check-in volume counts #79448
base: master
Conversation
Force-pushed from 535e3b2 to f5759f0
    have had some data-loss (due to an incident) and would want to tick our
    clock in a mode where misses and time-outs are created as "unknown".
    """
    redis_client = redis.redis_clusters.get(settings.SENTRY_MONITORS_REDIS_CLUSTER)
This is using rc-default-cache in US. I wonder if we should try and migrate to our own cluster so that we're more isolated.
MONITOR_VOLUME_HISTORY = "sentry.monitors.volume_history:{}"

# We record 30 days worth of historical data for each minute of check-ins.
MONITOR_VOLUME_RETENTION = timedelta(days=30)
Since we keep a key for every minute, this will be ~43k keys. Could be a good reason to have our own cluster.
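The ~43k figure follows directly from the retention window in the diff above — one key per minute over 30 days:

```python
from datetime import timedelta

# Retention constant from the PR diff above.
MONITOR_VOLUME_RETENTION = timedelta(days=30)

# One Redis key per minute of retained history.
minutes_retained = int(MONITOR_VOLUME_RETENTION.total_seconds() // 60)
print(minutes_retained)  # 43200, i.e. the ~43k keys mentioned above
```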
        pipeline.incr(key, amount=1)
        pipeline.expire(key, MONITOR_VOLUME_RETENTION)
We could just probabilistically set the expire, say 10% of the time, to nearly halve the number of writes. Probably no need if we do the batching option though.
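A minimal sketch of that suggestion, assuming a redis-py-style pipeline object (the function name and probability constant here are illustrative, not from the PR). Since every key is touched roughly once a minute while the TTL is 30 days, refreshing the TTL on only a fraction of writes still keeps keys alive:

```python
import random

# Illustrative constant; the review comment suggests ~10%.
EXPIRE_PROBABILITY = 0.10

def record_check_in(pipeline, key, retention):
    """Increment the per-minute counter, refreshing the TTL only
    probabilistically so most check-ins cost a single INCR write."""
    pipeline.incr(key, amount=1)
    if random.random() < EXPIRE_PROBABILITY:
        # Roughly 1 in 10 check-ins pays for the EXPIRE write.
        pipeline.expire(key, retention)
```

With two writes per check-in reduced to ~1.1 on average, this "nearly halves" the write volume, at the cost of a tiny chance a very-low-volume key expires between refreshes.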
@@ -937,6 +937,7 @@ def process_checkin_group(items: list[CheckinItem]):
    completely serially.
    """
    for item in items:
        update_check_in_volume(item.ts)
Since we're doing this here, I think it'd likely be better to just pass all the timestamps to the function in bulk. The single process method can just pass a list of one item.

Actually, it'd be even better to just do this in process_batch. Just store all the timestamps after we decode the messages, and then call bulk_update_check_in_volume with all of them.
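A sketch of what that suggested bulk_update_check_in_volume could look like — this is an assumed implementation, not the PR's code; it reuses the key format and retention constants from the diff above and collapses duplicate minutes into a single INCRBY per key:

```python
from collections import Counter
from datetime import datetime, timedelta

# Constants mirror the PR diff; the function body is a sketch.
MONITOR_VOLUME_HISTORY = "sentry.monitors.volume_history:{}"
MONITOR_VOLUME_RETENTION = timedelta(days=30)

def _make_key(ts: datetime) -> str:
    # Truncate to the minute so all check-ins in the same minute share a key.
    minute = ts.replace(second=0, microsecond=0)
    return MONITOR_VOLUME_HISTORY.format(minute.isoformat())

def bulk_update_check_in_volume(redis_client, timestamps: list[datetime]) -> None:
    """One pipelined round-trip for a whole batch of check-in timestamps."""
    pipeline = redis_client.pipeline()
    # Counter collapses duplicate minutes, so N check-ins in the same
    # minute become a single INCRBY of N instead of N INCRs.
    for key, count in Counter(_make_key(ts) for ts in timestamps).items():
        pipeline.incr(key, amount=count)
        pipeline.expire(key, MONITOR_VOLUME_RETENTION)
    pipeline.execute()
```

The single-check-in path can then just call `bulk_update_check_in_volume(client, [item.ts])`, as the comment suggests.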
ah yeah good call
Part of GH-79328