Add Sentry monitoring for cron jobs #2275

Merged · 3 commits · Jan 15, 2025
16 changes: 16 additions & 0 deletions DEPLOY.md
@@ -119,3 +119,19 @@ environment first to ensure that there are no issues with it. By convention, we
place the mapping files to be imported in
/var/lib/dokku/data/storage/mappings/bnfdmd which is mapped within the
container to /storage/mappings/bnfdmd.


### Cron jobs

We use a mixture of dokku-managed (see [Backups](#Backups)) and
[self-managed](https://dokku.com/docs/processes/scheduled-cron-tasks/#self-managed-cron) cron jobs.

The latter are configured via a `cronfile` in `deploy/bin/`, which should be
copied to `/etc/cron.d/` by running `copy_cronfile.sh` with `sudo`.

All cron jobs (both dokku- and self-managed) call functions from
`sentry_cron_functions.sh` to enable Sentry cron job monitoring. To support
this, the contents of `deploy/bin` should be copied to
`/var/lib/dokku/data/storage/opencodelists/deploy/bin` (outside the container)
or `/storage/deploy/bin` (within the container).

This Sentry monitoring requires the `SENTRY_DSN` environment variable to be set (see above).
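
For example, setting this up on the dokku host might look like the following sketch (the DSN and webhook URLs are placeholders; the copy of `deploy/bin` into dokku storage normally happens automatically via the `predeploy` script in `app.json` below):

```sh
# Make the Sentry DSN available to the app and its cron jobs (placeholder value).
dokku config:set opencodelists SENTRY_DSN="https://<key>@<org>.ingest.sentry.io/<project>"

# Install the self-managed cron jobs; copy_cronfile.sh substitutes the webhook
# URLs (placeholders here) into the cronfile as it copies it to /etc/cron.d/.
sudo SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..." \
     SLACK_TEAM_WEBHOOK_URL="https://hooks.slack.com/services/..." \
     /var/lib/dokku/data/storage/opencodelists/deploy/bin/copy_cronfile.sh
```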
5 changes: 5 additions & 0 deletions app.json
@@ -24,5 +24,10 @@
        "type": "startup"
      }
    ]
  },
  "scripts": {
    "dokku": {
      "predeploy": "mkdir -p /storage/deploy/bin && cp /app/deploy/bin/* /storage/deploy/bin/"
    }
  }
}
66 changes: 38 additions & 28 deletions deploy/bin/backup.sh
@@ -2,36 +2,46 @@

 set -euxo pipefail

-# DATABASE_DIR is configured via dokku (see DEPLOY.md)
-BACKUP_DIR="$DATABASE_DIR/backup/db"
+source sentry_cron_functions.sh
+SENTRY_MONITOR_NAME="$0"
+CRONTAB=$(extract_crontab "$SENTRY_MONITOR_NAME" "/app/app.json")
+SENTRY_CRON_URL=$(sentry_cron_url "$SENTRY_DSN" "$SENTRY_MONITOR_NAME")
+sentry_cron_start "$SENTRY_CRON_URL" "$CRONTAB"
+{
+    # DATABASE_DIR is configured via dokku (see DEPLOY.md)
+    BACKUP_DIR="$DATABASE_DIR/backup/db"

-# Make the backup dir if it doesn't exist.
-mkdir "$BACKUP_DIR" -p
+    # Make the backup dir if it doesn't exist.
+    mkdir "$BACKUP_DIR" -p

-# Take a datestamped backup.
-BACKUP_FILENAME="$(date +%F)-db.sqlite3"
-BACKUP_FILEPATH="$BACKUP_DIR/$BACKUP_FILENAME"
-sqlite3 "$DATABASE_DIR/db.sqlite3" ".backup $BACKUP_FILEPATH"
+    # Take a datestamped backup.
+    BACKUP_FILENAME="$(date +%F)-db.sqlite3"
+    BACKUP_FILEPATH="$BACKUP_DIR/$BACKUP_FILENAME"
+    sqlite3 "$DATABASE_DIR/db.sqlite3" ".backup $BACKUP_FILEPATH"

-# Compress the latest backup.
-# Zstandard is a fast, modern, lossless data compression algorithm. It gives
-# marginally better compression ratios than gzip on the backup and much faster
-# compression and particularly decompression. We want the backup process to be
-# quick as it's a CPU-intensive activity that could affect site performance.
-# --rm flag removes the source file after compression.
-zstd "$BACKUP_FILEPATH" --rm
+    # Compress the latest backup.
+    # Zstandard is a fast, modern, lossless data compression algorithm. It gives
+    # marginally better compression ratios than gzip on the backup and much faster
+    # compression and particularly decompression. We want the backup process to be
+    # quick as it's a CPU-intensive activity that could affect site performance.
+    # --rm flag removes the source file after compression.
+    zstd "$BACKUP_FILEPATH" --rm

-# Symlink to the new latest backup to make it easy to discover.
-# Make the target a relative path -- an absolute one won't mean the same thing
-# in the host file system if executed inside a container as we expect.
-ln -sf "$BACKUP_FILENAME.zst" "$BACKUP_DIR/latest-db.sqlite3.zst"
+    # Symlink to the new latest backup to make it easy to discover.
+    # Make the target a relative path -- an absolute one won't mean the same thing
+    # in the host file system if executed inside a container as we expect.
+    ln -sf "$BACKUP_FILENAME.zst" "$BACKUP_DIR/latest-db.sqlite3.zst"

-# Keep only the last 30 days of backups.
-# For now, apply this to both the original backup dir with backups based on the
-# Django dumpdata management command and the new dir with backups based on
-# sqlite .backup. Once there are none of the former remaining, the first line can be
-# removed, along with most of this comment.
-find "$DATABASE_DIR" -name "core-data-*.json.gz" -type f -mtime +30 -exec rm {} \;
-# We initially compressed with gzip, this can be removed when none left.
-find "$BACKUP_DIR" -name "*-db.sqlite3.gz" -type f -mtime +30 -exec rm {} \;
-find "$BACKUP_DIR" -name "*-db.sqlite3.zst" -type f -mtime +30 -exec rm {} \;
+    # Keep only the last 30 days of backups.
+    # For now, apply this to both the original backup dir with backups based on the
+    # Django dumpdata management command and the new dir with backups based on
+    # sqlite .backup. Once there are none of the former remaining, the first line can be
+    # removed, along with most of this comment.
+    find "$DATABASE_DIR" -name "core-data-*.json.gz" -type f -mtime +30 -exec rm {} \;
+    # We initially compressed with gzip, this can be removed when none left.
+    find "$BACKUP_DIR" -name "*-db.sqlite3.gz" -type f -mtime +30 -exec rm {} \;
+    find "$BACKUP_DIR" -name "*-db.sqlite3.zst" -type f -mtime +30 -exec rm {} \;
+    sentry_cron_ok "$SENTRY_CRON_URL"
+} || {
+    sentry_cron_error "$SENTRY_CRON_URL"
+}
15 changes: 15 additions & 0 deletions deploy/bin/copy_cronfile.sh
@@ -0,0 +1,15 @@
#!/bin/bash

# Copy the cronfile from the deploy bin directory to /etc/cron.d/,
# populating the Slack webhook env vars
# (these must be set before running this script).

set -euo pipefail

BIN_DIR="/var/lib/dokku/data/storage/opencodelists/deploy/bin"

CRONFILE="$BIN_DIR/cronfile"
DEST="/etc/cron.d/dokku-opencodelists-cronfile"
sed "s/SLACK_WEBHOOK_URL\=dummy-url/SLACK_WEBHOOK_URL\=$SLACK_WEBHOOK_URL/g" "$CRONFILE" | \
sed "s/SLACK_TEAM_WEBHOOK_URL\=dummy-url/SLACK_TEAM_WEBHOOK_URL\=$SLACK_TEAM_WEBHOOK_URL/g" > \
"$DEST"
12 changes: 12 additions & 0 deletions deploy/bin/cronfile
@@ -0,0 +1,12 @@
# /etc/cron.d/dokku-opencodelists-cronfile
# cron jobs to import latest releases
# update SLACK_WEBHOOK_URL and SLACK_TEAM_WEBHOOK_URL with the relevant channel URLs

PATH=/usr/local/bin:/usr/bin:/bin
SHELL=/bin/bash
SLACK_WEBHOOK_URL=dummy-url
SLACK_TEAM_WEBHOOK_URL=dummy-url

5 23 * * 1 dokku /var/lib/dokku/data/storage/opencodelists/deploy/bin/run_cron.sh import_latest_release.sh dmd
5 23 * * 2 dokku /var/lib/dokku/data/storage/opencodelists/deploy/bin/run_cron.sh import_latest_release.sh snomedct
5 23 * * 3 dokku /var/lib/dokku/data/storage/opencodelists/deploy/bin/run_cron.sh import_latest_release.sh mappings.bnfdmd
10 changes: 0 additions & 10 deletions deploy/bin/import_latest_bnfdmd_cronfile

This file was deleted.

10 changes: 0 additions & 10 deletions deploy/bin/import_latest_dmd_cronfile

This file was deleted.

15 changes: 11 additions & 4 deletions deploy/bin/import_latest_release.sh
@@ -1,7 +1,7 @@
#!/bin/bash

set -euo pipefail

+source sentry_cron_functions.sh
# NOTE: this script is run by cron (as the dokku user) weekly
# For dm+d, it is run every Monday night, to coincide with weekly
# dm+d releases
@@ -15,9 +15,6 @@ set -euo pipefail
# Updates to coding systems require restarting the dokku app, so this job is
# not dokku-managed

-# This script should be copied to /var/lib/dokku/data/storage/opencodelists/import_latest_release.sh
-# on dokku3 and run using the cronfile at opencodelists/deploy/bin/import_latest_dmd_cron
-
# SLACK_WEBHOOK_URL and SLACK_TEAM_WEBHOOK_URL are environment variables set in the cronfile on dokku3
# General notification messages (import start, complete etc) are posted to the
# SLACK_WEBHOOK_URL channel (#tech-noise). Failures are posted to the
@@ -26,6 +23,11 @@ set -euo pipefail

CODING_SYSTEM=$1

+SENTRY_MONITOR_NAME="${0}_${1}"
+SENTRY_DSN=$(dokku config:get opencodelists SENTRY_DSN)
+CRONTAB=$(extract_crontab "$CODING_SYSTEM" "cronfile")
+SENTRY_CRON_URL=$(sentry_cron_url "$SENTRY_DSN" "$SENTRY_MONITOR_NAME")

REPO_ROOT="/app"
DOWNLOAD_DIR="/storage/data/${CODING_SYSTEM}"
# make the log dir if necessary
@@ -56,6 +58,7 @@ function run_dokku_import_command () {
function post_starting_message() {
    starting_message_text="Starting OpenCodelists import of latest ${CODING_SYSTEM}"
    post_to_slack "${starting_message_text}" "${SLACK_WEBHOOK_URL}"
+    sentry_cron_start "$SENTRY_CRON_URL" "$CRONTAB"
}


@@ -67,6 +70,8 @@ function post_success_message_and_cleanup() {
    post_to_slack "${success_message_text}" "${SLACK_WEBHOOK_URL}"
    # remove log file; only persist log files for errors
    rm "${LOG_FILE}"
+    # log success with sentry
+    sentry_cron_ok "$SENTRY_CRON_URL"
}


@@ -85,6 +90,8 @@ import_coding_system_data ${CODING_SYSTEM} ${DOWNLOAD_DIR} \
--valid-from <YYYY-MM-DD> \
--force && dokku ps:restart opencodelists\`\`\`"
    post_to_slack "${failure_message_text}" "${SLACK_TEAM_WEBHOOK_URL}"
+    # report failure to sentry
+    sentry_cron_error "$SENTRY_CRON_URL"
}


10 changes: 0 additions & 10 deletions deploy/bin/import_latest_snomedct_cronfile

This file was deleted.

48 changes: 48 additions & 0 deletions deploy/bin/sentry_cron_functions.sh
@@ -0,0 +1,48 @@
#!/bin/bash

CRONTAB_PATTERN="([\*\d]+ )+"

function extract_crontab() {
    JOB_IDENTIFIER="$1"
    CRONTAB_SOURCE="$2"
    # the crontab schedule is on the line after the command in app.json
    # and on the same line in the cronfile
    if [[ "$CRONTAB_SOURCE" == *json ]];
    then
        LINES_AFTER_MATCH=2
    else
        LINES_AFTER_MATCH=1
    fi
    CRONTAB=$(grep -A "$LINES_AFTER_MATCH" "$JOB_IDENTIFIER" "$CRONTAB_SOURCE" | grep -oP "$CRONTAB_PATTERN")
    echo "$CRONTAB"
}
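
# Example (a sketch, using the cronfile added in this PR):
#   extract_crontab "mappings.bnfdmd" cronfile   # -> "5 23 * * 3 "
# For a *json source, the schedule is instead read from the lines following
# the matching "command" entry (see the tests below).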

# The Sentry DSN, as used in other bits of Sentry monitoring in this project,
# is not the same as the API endpoint for Cron Monitoring.
# This function modifies it to match what's described at
# https://docs.sentry.io/product/crons/getting-started/http/
function sentry_cron_url() {
    SENTRY_DSN="$1"
    JOB_NAME="$2"
    # modify the DSN to point it at the cron API
    SENTRY_CRON_URL=$(sed -E "s/([0-9]+$)/api\/\1/g" <<< "$SENTRY_DSN")
    echo "$SENTRY_CRON_URL/cron/$JOB_NAME"
}
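
# Example (dummy values, matching the tests in
# opencodelists/tests/test_sentry_cron_functions.py below):
#   sentry_cron_url "https://[email protected]/7891023" "test_monitor"
#   # -> https://[email protected]/api/7891023/cron/test_monitor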

function sentry_cron_start() {
    SENTRY_CRON_URL="$1"
    CRONTAB="$2"
    curl -X POST "$SENTRY_CRON_URL" \
        --header 'Content-Type: application/json' \
        --data-raw "{\"monitor_config\": {\"schedule\": {\"type\": \"crontab\", \"value\": \"$CRONTAB\"}}, \"status\": \"in_progress\"}"
}

function sentry_cron_ok() {
    SENTRY_CRON_URL="$1"
    curl "$SENTRY_CRON_URL?status=ok"
}

function sentry_cron_error() {
    SENTRY_CRON_URL="$1"
    curl "$SENTRY_CRON_URL?status=error"
}
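
# Typical lifecycle from a cron-run script (a sketch; backup.sh above is a
# real caller, and "my_job" and the schedule here are placeholders):
#   SENTRY_CRON_URL=$(sentry_cron_url "$SENTRY_DSN" "my_job")
#   sentry_cron_start "$SENTRY_CRON_URL" "5 23 * * 1"
#   ...the job's actual work...
#   sentry_cron_ok "$SENTRY_CRON_URL"   # or sentry_cron_error "$SENTRY_CRON_URL" on failure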
62 changes: 62 additions & 0 deletions opencodelists/tests/test_sentry_cron_functions.py
@@ -0,0 +1,62 @@
import json
import subprocess


# Dummy organisation and project identifiers in URL
SENTRY_DSN = "https://[email protected]/7891023"
SENTRY_CRON_URL = "https://[email protected]/api/7891023/cron/test_monitor"
CRONTAB = "5 23 * * 1 "


def test_extract_crontab_json(tmp_path):
    app_json = tmp_path / "app.json"
    with app_json.open("w") as f:
        cron_dict = {
            "cron": [
                {
                    "command": "test_command",
                    "schedule": CRONTAB,
                }
            ]
        }
        json.dump(cron_dict, f, indent=1)

    proc = subprocess.run(
        f"source deploy/bin/sentry_cron_functions.sh; extract_crontab test_command {app_json}",
        shell=True,
        capture_output=True,
        executable="/bin/bash",
    )

    assert proc.stdout.decode().strip("\n") == CRONTAB


def test_extract_crontab_cronfile(tmp_path):
    cronfile = tmp_path / "cronfile"
    with cronfile.open("w") as f:
        lines = [
            "#test comment",
            "ENV_VAR=VAR_ENV",
            f"{CRONTAB} test_user test_command test_arg",
        ]
        f.writelines([line + "\n" for line in lines])

    proc = subprocess.run(
        f"source deploy/bin/sentry_cron_functions.sh; extract_crontab test_command {cronfile}",
        shell=True,
        capture_output=True,
        executable="/bin/bash",
    )

    assert proc.stdout.decode().strip("\n") == CRONTAB


def test_sentry_cron_url():
    proc = subprocess.run(
        f"source deploy/bin/sentry_cron_functions.sh; sentry_cron_url {SENTRY_DSN} test_monitor",
        shell=True,
        capture_output=True,
        executable="/bin/bash",
    )

    assert proc.stdout.decode().strip("\n") == SENTRY_CRON_URL