Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PollingAlertCheckWorker to copy with 429s blocking Notify callbacks #2084

Merged
merged 3 commits into from
Jan 22, 2024

Conversation

KludgeKML
Copy link
Contributor

The problem of blocks of 429s sometimes being returned when notify attempts to make callbacks for delivered emails means that we're at risk of getting false alarms for emails not being sent when they actually have been, but the callback has failed. As a fallback, we add in an active polling step.

  • If a single positive callback has been recorded, we act as normal (no alert).
  • If no positive callbacks have been records, we start actively polling Notify with the IDs (reference ids in Notify parlance) of emails that we've had no callback for, querying whether they have been received. We expect in most cases this will result in one or two calls to Notify before a positive delivery is found, although the worst case (no emails delivered at all) would result in us asking Notify about every single email.

https://trello.com/c/KR2THVrN/2323-add-active-polling-to-mitigate-false-alarms-from-429-responses

⚠️ This repo is Continuously Deployed: make sure you follow the guidance ⚠️

@KludgeKML KludgeKML changed the title Plan B for RFC-164 Add PollingAlertCheckWorker to copy with 429s blocking Notify callbacks Jan 18, 2024
Copy link
Contributor

@1pretz1 1pretz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work on this Keith! Overall everything looks great, just left a few comments.

We should also remove AlertCheckWorker, the spec.

Also need to update the testing documentation, I don't think the current method would suffice anymore as the canary account will receive the email and Notify should respond with the email delivery. I think we could probably achieve the same test by just deleting all the emails vs marking the Notify status as nil.

app/services/check_notify_email_service.rb Show resolved Hide resolved
app/services/check_notify_email_service.rb Outdated Show resolved Hide resolved
app/workers/polling_alert_check_worker.rb Outdated Show resolved Hide resolved
spec/workers/polling_alert_check_worker_spec.rb Outdated Show resolved Hide resolved
spec/workers/polling_alert_check_worker_spec.rb Outdated Show resolved Hide resolved
spec/workers/polling_alert_check_worker_spec.rb Outdated Show resolved Hide resolved
spec/workers/polling_alert_check_worker_spec.rb Outdated Show resolved Hide resolved
app/services/check_notify_email_service.rb Outdated Show resolved Hide resolved
@KludgeKML KludgeKML force-pushed the plan-b branch 5 times, most recently from 2b90548 to 2e19485 Compare January 22, 2024 12:20
Copy link
Contributor

@1pretz1 1pretz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, thanks for working through the comments! Left one minor suggestion but non blocking so approving 🍏


it "doesn't poll Notify once it's found a delivered email" do
perform
expect(a_request(:get, "https://api.notifications.service.gov.uk/v2/notifications?reference=#{@no_poll_email.id}&status=delivered&template_type=email")).not_to have_been_made
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably clean up the code slightly here:

        email3 = create(:email, content_id:, notify_status: nil, status: "sent")
        request = stub_notify_email_delivery_status_delivered(email3.id)
        perform
        expect(request).not_to have_been_made

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- AlertCheckWorker likely to cause false alarms due to the callbacks being 429'd, so add a backup where if we have no delivered emails, we actively poll Notify for details of the first (up to)  100 emails until we find a delivered one.
@KludgeKML KludgeKML merged commit 22d9a4c into main Jan 22, 2024
10 checks passed
@KludgeKML KludgeKML deleted the plan-b branch January 22, 2024 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants