This document will walk you through the setup instructions to get a functioning development environment.
- Table of contents
- Locally installed software prerequisites
- Building the Baseline
- Committing to this repository
- Running ReportStream
- Function development with docker-compose
- How to use the CLI
- Credentials and secrets vault
- Testing
- Resetting your environment
- Additional tooling
- Miscellaneous subjects
You will need at least the following software installed locally in order to build and/or debug this baseline:
- git including git-bash if you're on Windows
- Docker or Docker Desktop
- OpenJDK (currently targeting 11 through 15)
- Azure Functions Core Tools (currently targeting 3)
The following are optional tools that can aid you during development or debugging:
- Azure Storage Explorer
- Azure CLI
- Gradle
- One or more PostgreSQL Clients
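Before the first build, it can save time to confirm that the prerequisites are on your `PATH`. The sketch below is a hedged check: the binary names are assumptions based on typical installs (`func` is the usual Azure Functions Core Tools binary), so adjust them to your setup.

```shell
# Report which of the expected tools are present on PATH.
# Binary names are assumptions (e.g. "func" for Azure Functions Core Tools).
for tool in git docker java func gradle psql; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
  fi
done
```

Any tool reported as `MISSING` should be installed (or located) before running the build.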
- Clone the prime-reportstream repository to your workstation using git.
- If you are using Docker Desktop, verify that it is running prior to building or running ReportStream locally.
- Initialize your environment and run an initial build by running the following commands in a Linux shell. Note that you can also run the `cleanslate.sh` script to reset your environment (run `./cleanslate.sh --help` for more information). The `cleanslate.sh` script not only cleans, but also performs a first build and setup of all artifacts needed for the build:

  ```sh
  cd ./prime-router
  ./cleanslate.sh
  ```

Note: If you are working on an Apple Silicon Mac, stop here at this step and continue on with the instructions in Using an Apple Silicon Mac.
- If you are using Docker Desktop, verify that it is running prior to building or running ReportStream locally.
- Building and running ReportStream requires a locally accessible PostgreSQL database instance that is initially set up and run by the `cleanslate.sh` script. This database instance runs as a Docker container defined by the `docker-compose.build.yml` file. You will need to start this database instance after a workstation reboot by using the following command:

  ```sh
  cd ./prime-router
  docker-compose --file "docker-compose.build.yml" up --detach
  ```
You can invoke `gradlew` from the `./prime-router` directory to build the baseline as follows:

```sh
./gradlew clean package
```

The most useful gradle tasks are:

- `clean`: deletes the build artifacts
- `compile`: compiles the code
- `test`: runs the unit tests
- `testIntegration`: runs the integration tests
- `package`: packages the build artifacts for deployment
- `quickpackage`: re-packages the build artifacts for deployment without running the tests
- `testSmoke`: runs all the smoke tests; this requires that you are running ReportStream
- `testEnd2End`: runs the end-to-end test; this requires that you are running ReportStream
- `primeCLI`: runs the prime CLI. Specify arguments with `"--args=<args>"`
- Commits must be signed or they will not be mergeable into `master` or `production` without Repository Administrator intervention. You can find detailed instructions on how to set this up in the Signing Commits document.
- Make your changes in topic/feature branches and file a new Pull Request to merge your changes into your desired target branch.
We make use of git hooks in this repository and rely on them for certain levels of protection against CI/CD failures and other incidents. Install/activate these hooks by invoking either `prime-router/cleanslate.sh` or by directly invoking `.environment/githooks.sh install`. This is a repository-level setting; you must activate the git hooks in every clone on every device you have.
The first hook we'll invoke is to ensure Docker is running. If it's not we'll short-circuit the remainder of the hooks and let you know why.
Gitleaks is one of the checks that run as part of the `pre-commit` hook. It must pass for the commit to proceed; a failure will prevent the commit from being made and will leave your staged files in staged status. Gitleaks scans files that are marked as "staged" (i.e. with `git add`) for known patterns of secrets or keys.

The output of this tool consists of 2 files, both in the root of your repository, which can be inspected for more information about the check:

- `gitleaks.report.json`: the details about any leaks it finds, serialized as JSON. If no leaks are found, this file contains the literal "`null`"; if leaks are found, then this file will contain an array of found leak candidates.
- `gitleaks.log`: the simplified logging output of the gitleaks tool
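As a quick sanity check from the shell, the "`null` means clean, anything else means candidates" convention described above can be tested directly. The snippet below uses a throwaway file to demonstrate the convention; in practice you would point it at the real `gitleaks.report.json` in your repository root.

```shell
# Demonstrate the gitleaks report convention: the literal "null" means no
# leaks were found; anything else is a JSON array of leak candidates.
# (A throwaway file under /tmp is used here for illustration.)
report=/tmp/gitleaks.report.json
printf 'null' > "$report"
if [ "$(cat "$report")" = "null" ]; then
  echo "no leaks found"
else
  echo "leak candidates present -- inspect $report"
fi
```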
When gitleaks reports leaks/violations, the right course of action is typically to remove the leak and replace it with a value that is collected at run-time. There are limited cases where the leak is a false positive, in which case a strict and narrow exemption may be added to the `.environment/gitleaks/gitleaks-config.toml` configuration file. If an exemption is added, it must be signed off on by a member of the DevOps team.
This tool can also be manually invoked through `.environment/gitleaks/run-gitleaks.sh`, which may be useful to validate the absence of leaks without risking a commit. Invoke the tool with `--help` to find out more about its different run modes.
See Allow-listing Gitleaks False Positives for more details on how to prevent False Positives!
If you've changed any terraform files in your commit, we'll run `terraform fmt -check` against the directory of files. If any file's format is invalid, the pre-commit hook will fail. You may be able to fix the issues with:

```sh
terraform fmt -recursive
```
You must run the schema document generator after a schema file is updated. The updated documents are stored in `docs/schema-documentation` and must be included with your schema changes. The CI/CD pipeline checks whether the schema documentation needs updating, and the build will fail if the schema documentation updates are not included.

```sh
./gradlew generateDocs
```
You can bring up the entire ReportStream environment by running the `devenv-infrastructure.sh` script after building the baseline (see "First Build"):

```sh
cd ./prime-router
./gradlew clean package
./devenv-infrastructure.sh
```
If you see any SSL errors during this step, follow the directions in Getting Around SSL Errors.
You must re-package the build and restart the `prime_dev` container to see any modifications you have made to the files:

```sh
cd ./prime-router
./gradlew package
docker-compose restart prime_dev
```
The docker containers produce logging output. When dealing with failures or bugs, it can be very useful to inspect this output. You can inspect the output of each container using the following commands (`$` indicates your prompt):

```sh
# List PRIME containers:
$ docker ps --format '{{.Names}}' | grep ^prime-router
prime-router_web_receiver_1
prime-router_prime_dev_1
prime-router_sftp_1
prime-router_azurite_1
prime-router_vault_1
prime-router_postgresql_1

# Show the log of (e.g.) prime-router_postgresql_1 until now
$ docker logs prime-router_postgresql_1

# Show the log output of (e.g.) prime-router_prime_dev_1 and stay on it
$ docker logs prime-router_prime_dev_1 --follow
```
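The `grep ^prime-router` pipe above keeps only container names that start with that prefix. The same filtering logic can be seen on canned input (the container names below are illustrative, not necessarily what your environment reports):

```shell
# Demonstrate the prefix filter used in the docker ps pipe on canned input.
names='prime-router_sftp_1
other_container_1
prime-router_vault_1'
printf '%s\n' "$names" | grep '^prime-router'
# keeps prime-router_sftp_1 and prime-router_vault_1 only
```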
To change the level of logging in our Kotlin code, edit the `src/main/resources/log4j2.xml` file. For example, to get very verbose logging across all classes:
<Logger name="gov.cdc.prime.router" level="trace"/>
To increase the level of Azure Function logging (Microsoft's logging), edit the `logging` section of the `host.json` file and add a `logLevel` section, like this:

```json
"logging": {
    "logLevel": {
        "default": "Trace"
    },
    "applicationInsights": {
        "samplingSettings": {
            "isEnabled": true
        }
    }
}
```
The Docker container running ReportStream exposes local port `5005` for remote Java debugging. Connect your debugger to `localhost:5005` while the Docker container is running and set the necessary breakpoints.
ReportStream comes packaged with an executable that can help with finding misconfigurations and other problems with the application. Use the following command to launch the tool locally while the ReportStream container is running:

```sh
cd prime-router
# Specify these explicitly as exports or as command-scope variables
export POSTGRES_PASSWORD='changeIT!'
export POSTGRES_URL=jdbc:postgresql://localhost:5432/prime_data_hub
export POSTGRES_USER=prime
./prime test
```
Running this test command (pointed at the right database) should "repair" a running ReportStream process and should persist through subsequent runs.
If your agency's network intercepts SSL requests, you might have to disable SSL verifications to get around invalid certificate errors.
Most uses of the PRIME router will be in the Microsoft Azure cloud. The router runs as a container in Azure. The `Dockerfile` describes what goes into this container.
Developers can also run the router locally with the same Azure runtime and libraries to help develop and debug in an environment that mimics the Azure environment as closely as we can on your local machine. In this case, a developer can use a local Azure storage emulator, called Azurite.
We use `docker-compose` to orchestrate running the Azure function(s) code and Azurite. See the "Running ReportStream" section for more information on building and bringing your environment up.
If you see any SSL errors during this step, follow the directions in Getting Around SSL Errors.
The PRIME command line interface allows you to interact with certain parts of ReportStream functionality without using the API or running all of ReportStream. A common use case for the CLI is testing while developing mappers for the new FHIR pipeline.

The primary way to access the CLI is through the gradle command (although a deprecated bash script exists as well). If you are an IntelliJ user, you can set up the gradle command to run through your IDE in debug mode to step through your code line by line.
```sh
cd ./prime-router

# Prints out all the available commands
./gradlew primeCLI
#   data                      process data
#   list                      list known schemas, senders, and receivers
#   livd-table-download       downloads the latest LOINC test data, extracts the
#                             Lookup Table, and loads it into the database as a
#                             new version
#   generate-docs             generate documentation for schemas
#   create-credential         create credential JSON or persist to store
#   compare                   compares two CSV files so you can view the
#                             differences within them
#   test                      Run tests of the Router functions
#   login                     Login to the HHS-PRIME authorization service
#   logout                    Logout of the HHS-PRIME authorization service
#   organization              Fetch and update settings for an organization
#   sender                    Fetch and update settings for a sender
#   receiver                  Fetch and update settings for a receiver
#   multiple-settings         Fetch and update multiple settings
#   lookuptables              Manage lookup tables
#   convert-file
#   sender-files              For a specified report, trace each item's ancestry
#                             and retrieve the source files submitted by senders
#   fhirdata                  Process data into/from FHIR
#   fhirpath                  Input FHIR paths to be resolved using the input
#                             FHIR bundle
#   convert-valuesets-to-csv  This is a development tool that converts
#                             sender-automation.valuesets to two CSV files

# Converts HL7 to FHIR (IN DEV MODE)
./gradlew primeCLI --args='fhirdata --input-file "src/testIntegration/resources/datatests/HL7_to_FHIR/sample_co_1_20220518-0001.hl7"'

# Converts the FHIR file to HL7 using the provided schema (IN DEV MODE)
./gradlew primeCLI --args='fhirdata --input-file "src/testIntegration/resources/datatests/HL7_to_FHIR/sample_co_1_20220518-0001.fhir" -s metadata/hl7_mapping/ORU_R01/ORU_R01-base.yml'
```
Our `docker-compose.yml` includes a Hashicorp Vault instance alongside our other containers to enable local secrets storage. Under normal circumstances, developers will not have to interact directly with the Vault configuration.
This vault is used locally to provide the SFTP credentials used by the Send function to upload files to the locally running atmoz/sftp SFTP server.
NOTE: the cleanslate.sh script will set this up for you (see also "First build" and "Resetting your environment").
Run the following commands to initialize the vault:

```sh
mkdir -p .vault/env
cat /dev/null > .vault/env/.env.local
```
When starting up our containers with `docker-compose up` on first-run, the container will create a new Vault database and, once initialized (which may take a couple of seconds), store the following files in `.vault/env`:

- `key`: unseal key for decrypting the database
- `.env.local`: the root token in envfile format for using the Vault API/command line

The database is stored in a docker-compose container `vault` which is persisted across up and down events. All files are excluded in `.gitignore` and should never be persisted to source control.
NOTE: the cleanslate.sh script will re-initialize your vault for you (see also "Resetting your environment").
If you would like to start with a fresh Vault database, you can clear the Vault database with one of the following command sets:

Using `cleanslate.sh`:

```sh
cd ./prime-router
./cleanslate.sh --keep-images --keep-build-artifacts
```

Manually:

```sh
cd prime-router
# -v removes ALL volumes associated with the environment
docker-compose down -v
rm -rf .vault/env/{key,.env.local}
cat /dev/null > .vault/env/.env.local
```
Our `docker-compose.yml` will automatically load the environment variables needed for the Vault. If you need to use the Vault outside Docker, you can find the environment variables you need in `.vault/env/.env.local`.

When your Vault is up and running (indicated by `.vault/env/.env.local` being populated with two environment variables: `VAULT_TOKEN` and `CREDENTIAL_STORAGE_METHOD`), you can interact with it in a couple of ways:
- Graphical/Web UI: Navigate to http://localhost:8200 and provide the value of `VAULT_TOKEN` to log into the vault.
- curl/HTTP API to get JSON back, for example:

  ```sh
  export $(xargs <.vault/env/.env.local)
  SECRET_NAME=DEFAULT-SFTP
  URI=http://localhost:8200/v1/secret/${SECRET_NAME?}
  curl --header "X-Vault-Token: ${VAULT_TOKEN?}" "${URI?}"
  ```
The values from the `.vault/env/.env.local` file can also be automatically loaded into most IDEs. Alternatively, you can inject them into your terminal via:

```sh
export $(xargs <./.vault/env/.env.local)
```
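The `export $(xargs < file)` idiom expands the `KEY=value` lines of the envfile into arguments for `export`. A minimal, self-contained reproduction with made-up values (these are not real tokens, and the real file is generated by the Vault container):

```shell
# Reproduce the envfile-export idiom with throwaway demo values.
envfile=/tmp/demo.env.local
printf 'VAULT_TOKEN=dev-only-token\nCREDENTIAL_STORAGE_METHOD=HASHICORP_VAULT\n' > "$envfile"
export $(xargs < "$envfile")
echo "VAULT_TOKEN=$VAULT_TOKEN"
echo "CREDENTIAL_STORAGE_METHOD=$CREDENTIAL_STORAGE_METHOD"
```

Note that this unquoted expansion only works because the values contain no spaces, which holds for the token and method values in this envfile.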
The build will run the unit tests for you when you invoke `./gradlew package`. However, you may sometimes want to invoke them explicitly. Use this command to run the unit tests manually:

```sh
cd ./prime-router
./gradlew test
# Or to force the tests to run
./gradlew test -Pforcetest
```
The quick test is meant to test the data conversion and generation code. Use the following command to run all quick tests, which you should do as part of a Pull Request:

```sh
./quick-test.sh all
```
End-to-end tests check if the deployed system is configured correctly. The tests use an organization called IGNORE for running the tests. In order to successfully run the end-to-end tests, you will need to:

- Have built successfully
- Export the vault's credentials:

  ```sh
  cd ./prime-router
  export $(xargs < .vault/env/.env.local)
  ```

- Create the SFTP credentials and upload organizations' settings:

  ```sh
  cd ./prime-router
  ./prime create-credential --type=UserPass \
      --persist=DEFAULT-SFTP \
      --user foo \
      --pass pass
  ```

- Ensure that your docker containers are running (see also "Running ReportStream"):

  ```sh
  cd ./prime-router
  # Specify restart if they are already running and you want
  # them to pick up new binaries
  # i.e. ./devenv-infrastructure.sh restart
  ./devenv-infrastructure.sh
  ```

- Run the tests:

  ```sh
  ./gradlew testEnd2End
  ```

  or

  ```sh
  ./prime test --run end2end
  ```

Or to run the entire smoke test suite locally:

```sh
./prime test
```
Upon completion, the process should report success.
To run the end2end test on Staging you'll need a `<postgres-user>` and `<postgres-password>`, VPN tunnel access, and a `<reports-endpoint-function-key>`. With your VPN running, do the following:

```sh
export POSTGRES_PASSWORD=<postgres-password>
export POSTGRES_USER=<postgres-user>@pdhstaging-pgsql
export POSTGRES_URL=jdbc:postgresql://pdhstaging-pgsql.postgres.database.azure.com:5432/prime_data_hub
./prime test --run end2end --env staging --key <reports-endpoint-function-key>
```

To run the entire smoke test suite on Staging use this:

```sh
./prime test --env staging --key <reports-endpoint-function-key>
```
You can run the `./cleanslate.sh` script to recover from an unreliable or messed-up environment. Run the script with `--help` to learn about its different levels of 'forcefulness' and 'graciousness' in its cleaning repertoire:

```sh
cd ./prime-router

# default mode:
./cleanslate.sh

# most forceful mode
./cleanslate.sh --prune-volumes

# Show the different modes for 'graciousness'
./cleanslate.sh --help
```

When invoked with `--prune-volumes`, this script will also reset your PostgreSQL database. This can be useful to get back to a known and/or empty state.
- Stop your ReportStream container if it is running:

  ```sh
  docker-compose down
  ```

- Run the following command to delete all ReportStream-related tables from the database and recreate them. This is very useful to reset your database to a clean state. Note that the database will be re-populated the next time you run ReportStream using `docker-compose up`:

  ```sh
  ./gradlew resetDB
  ```

- Run ReportStream and run the following commands to load the tables and organization settings into the database:

  ```sh
  ./gradlew reloadTables
  ./gradlew reloadSettings
  ```
Use any other tools that are accessible to you to develop the code. Be productive. Modify this document if you have a practice that will be useful.
Some useful tools for Kotlin/Java development include:
- Azure Storage Explorer
- JetBrains IntelliJ
- KTLint: the Kotlin linter that we use to format our Kotlin code
  - Install the IntelliJ KtLint plugin or configure it to follow standard Kotlin conventions as follows on a Mac:

    ```sh
    cd ./prime-router && brew install ktlint && ktlint applyToIDEAProject
    ```

- Microsoft VSCode with the available Kotlin extension
- Java Profiling in ReportStream
- Tips for faster development
In cases where you want to change which credentials are used by any of our tooling to connect to the PostgreSQL database, you can do so by specifying some environment variables or specifying project properties to the build. The order of precedence of evaluation is: Project Property beats Environment Variable beats Default. You can specify the following variables:

- `DB_USER`: PostgreSQL database username (defaults to `prime`)
- `DB_PASSWORD`: PostgreSQL database password (defaults to `changeIT!`)
- `DB_URL`: PostgreSQL database URL (defaults to `jdbc:postgresql://localhost:5432/prime_data_hub`)
Example:

```sh
# exported environment variable DB_URL
export DB_URL=jdbc:postgresql://postgresql:5432/prime_data_hub

# Command-level environment variable (DB_PASSWORD)
# combined with project property (DB_USER)
DB_PASSWORD=mypassword ./gradlew testEnd2End -PDB_USER=prime
```
Alternatively, you can specify values for project properties via environment variables, per the Gradle project property convention `ORG_GRADLE_PROJECT_<property>`:

```sh
export ORG_GRADLE_PROJECT_DB_USER=prime
export ORG_GRADLE_PROJECT_DB_PASSWORD=mypass
./gradlew testEnd2End -PDB_URL=...
```
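The stated precedence (project property beats environment variable beats default) can be sketched with plain shell parameter expansion. This mirrors the rule only; it is not how Gradle resolves properties internally:

```shell
# Sketch of the precedence rule: project property > environment variable > default.
DEFAULT_DB_USER=prime
ENV_DB_USER=envuser        # pretend this came from the environment
PROP_DB_USER=propuser      # pretend this came from -PDB_USER=propuser
effective=${PROP_DB_USER:-${ENV_DB_USER:-$DEFAULT_DB_USER}}
echo "effective DB_USER: $effective"   # propuser wins
```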
By default, the functions will pull their configuration for organizations from the `organizations.yml` file. This can be overridden locally or in test by declaring an environment variable `PRIME_ENVIRONMENT` to specify the suffix of the yml file to use. This enables setting up local SFTP routing without impacting the 'production' `organizations.yml` configuration file.

```sh
# use organizations-mylocal.yml instead
export PRIME_ENVIRONMENT=mylocal

# use organizations-foo.yml instead
export PRIME_ENVIRONMENT=foo
```
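The suffix convention above amounts to inserting `-<suffix>` before `.yml`. A small sketch of the mapping (how ReportStream resolves the name internally may differ; this only illustrates the naming rule from the examples above):

```shell
# Map PRIME_ENVIRONMENT to the organizations file name per the convention above.
PRIME_ENVIRONMENT=mylocal
org_file="organizations${PRIME_ENVIRONMENT:+-$PRIME_ENVIRONMENT}.yml"
echo "$org_file"   # organizations-mylocal.yml
```

When `PRIME_ENVIRONMENT` is unset or empty, the `:+` expansion contributes nothing and the name falls back to the plain `organizations.yml`.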
When building the ReportStream container, you can set `PRIME_DATA_HUB_INSECURE_SSL` to `true` to enable insecure SSL:

```sh
PRIME_DATA_HUB_INSECURE_SSL=true docker-compose build
```
- SFTP Upload Permission denied: If you get a Permission Denied exception in the logs, it is most likely that the atmoz/sftp Docker container has incorrect permissions for the folder used by the local SFTP server.

  ```
  FAILED Sftp upload of inputReportId xxxx to SFTPTransportType(...) (orgService = ignore.HL7), Exception: Permission denied
  ```

  Run the following command to change the permissions for the folder:

  ```sh
  docker exec -it prime-router_sftp_1 chmod 777 /home/foo/upload
  ```