How to Rerun Failed Pipeline Load
Follow this runbook to reload data via the BFD pipeline after a failed run.
Note: If there are pending deployments or db migrations, make sure those finish before running these steps.
-
SSH into the AWS ETL EC2 instance for the given environment (bfd-<test/prod/prod-sbx>-etl) with
ssh -i <local ssh key> <your ssh username>@<EC2 IP Address>
. -
Confirm the pipeline has failed to load data. You can investigate the failure by looking at the log located at
../../../bluebutton-data-pipeline/bluebutton-data-pipeline.log
on the EC2 instance and scrolling to the bottom. You can investigate the files that were loaded by checking in AWS: -
In AWS S3, the RIF folder containing the data for reloading (named with a timestamp, i.e. <yyyy>-<MM>-<dd>T<HH>:<mm>:<ss>Z) will still be in 'Incoming', with the S3 file structure as:
<S3 Bucket Name>-<aws-account-id>
│
└───Incoming/
│   │
│   └───2022-09-23T13:44:55Z/
│   │       *_manifest.xml
│   │       *.rif
│   │       ...
│   │
│   └───...
│
└───Done/
│   │
│   └───...
The AWS S3 bucket name in the file structure above can be found within the ETL EC2 instance by running
grep S3_BUCKET_NAME /bluebutton-data-pipeline/bfd-pipeline-service.sh | cut -f2 -d=
.
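The bucket lookup and the check of 'Incoming' described above can be combined into a small helper. This is a minimal sketch, not part of BFD: the function names are hypothetical, and it assumes the AWS CLI is installed with read access to the bucket and that the grep/cut command above yields the full bucket name (surrounding quotes, if any, are stripped).

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not part of BFD.

# Print the bucket name using the same grep/cut as the runbook step.
# $1: path to the service script (defaults to the path given in the runbook).
# Assumption: only the first matching line matters; quotes are stripped.
bucket_name() {
  local script="${1:-/bluebutton-data-pipeline/bfd-pipeline-service.sh}"
  grep S3_BUCKET_NAME "$script" | cut -f2 -d= | head -n 1 | tr -d "\"'"
}

# List the RIF folders still waiting under Incoming/.
# $1: bucket name. Assumes the AWS CLI is installed and credentialed.
list_incoming() {
  aws s3 ls "s3://$1/Incoming/"
}
```

For example, `list_incoming "$(bucket_name)"` run on the ETL instance would print the timestamped folders that have not yet been processed, provided the instance has the AWS CLI available.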
-
-
Check if the pipeline is running with
sudo systemctl status bfd-pipeline
, and if so, stop it with sudo systemctl stop bfd-pipeline
. -
Investigate the issue. Contact CCW if there is an upstream problem with the data provided, or fix the pipeline if it is an application/configuration issue.
-
Once the issue is resolved, if the pipeline instance is still active, restart the pipeline from within the EC2 instance with
sudo systemctl start bfd-pipeline
. -
Confirm restarting the pipeline and loading data is successful:
-
The output of running
sudo systemctl status bfd-pipeline
should say "active (running) since …". -
As data is loading, check the logs by running
tail /bluebutton-data-pipeline/bluebutton-data-pipeline.log -f
. -
When data is loaded properly, the RIF folder containing the data for reloading will have automatically moved from 'Incoming' to 'Done' in AWS S3, with the S3 file structure as:
<S3 Bucket Name>-<aws-account-id>
│
└───Incoming/
│   │
│   └───...
│
└───Done/
│   │
│   └───2022-09-23T13:44:55Z/
│   │       *_manifest.xml
│   │       *.rif
│   │       ...
│   │
│   └───...
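The confirmation checks above can be scripted. The following is a hedged sketch rather than BFD tooling: the helper names are hypothetical, and it assumes the AWS CLI is installed and able to list the bucket, and that `aws s3 ls` exits non-zero when a prefix has no contents.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of BFD.

# Succeeds when the bfd-pipeline service is currently active.
pipeline_is_active() {
  systemctl is-active --quiet bfd-pipeline
}

# Succeeds when $1 matches <yyyy>-<MM>-<dd>T<HH>:<mm>:<ss>Z.
is_rif_timestamp() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$'
}

# Print "Incoming" or "Done" depending on where the RIF folder sits.
# $1: bucket name, $2: timestamp folder name.
rif_folder_location() {
  local bucket="$1"
  local folder="$2"
  local prefix
  if ! is_rif_timestamp "$folder"; then
    echo "invalid folder name: $folder" >&2
    return 2
  fi
  for prefix in Incoming Done; do
    # Assumption: a non-zero exit means nothing exists under the prefix.
    if aws s3 ls "s3://$bucket/$prefix/$folder/" >/dev/null 2>&1; then
      echo "$prefix"
      return 0
    fi
  done
  echo "folder not found in Incoming or Done: $folder" >&2
  return 1
}
```

A successful reload would then show `pipeline_is_active` succeeding and `rif_folder_location <bucket> 2022-09-23T13:44:55Z` printing Done; a result of Incoming means the folder has not been processed yet.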