A small container to get an OMOP CDM Postgres database running quickly.
Drop your data into `data/` and run the container.

You can configure the Docker container using the following environment variables:

- `DB_HOST`: The hostname of the PostgreSQL database. Default is `db`.
- `DB_PORT`: The port number of the PostgreSQL database. Default is `5432`.
- `DB_USER`: The username for the PostgreSQL database. Default is `postgres`.
- `DB_PASSWORD`: The password for the PostgreSQL database. Default is `password`.
- `DB_NAME`: The name of the PostgreSQL database. Default is `omop`.
- `SCHEMA_NAME`: The name of the schema to be created/used in the database. Default is `omop`.
- `DATA_DIR`: The directory containing the data CSV files. Default is `data`.
- `SYNTHETIC`: Load synthetic data (boolean). Default is `false`.
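
As a minimal sketch of overriding these settings, the variables can be collected in an env file and passed to the container with Docker's `--env-file` flag. The file name `omop-lite.env` is an arbitrary choice for illustration, and the values shown are simply the documented defaults; change whichever ones differ in your setup.

```shell
# Write the documented defaults to an env file.
# The file name omop-lite.env is an assumption for this example, not something the image requires.
cat > omop-lite.env <<'EOF'
DB_HOST=db
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=password
DB_NAME=omop
SCHEMA_NAME=omop
DATA_DIR=data
SYNTHETIC=false
EOF

# Print the file so the values that would reach the container are visible.
cat omop-lite.env
```

With the file in place, `docker run --env-file omop-lite.env -v ./data:/data ghcr.io/health-informatics-uon/omop-lite` should start the container with those values. Individual variables can also be set with `-e`, for example `-e SYNTHETIC=true`.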

Run the container directly with Docker:

    docker run -v ./data:/data ghcr.io/health-informatics-uon/omop-lite

Or use Docker Compose:

    # docker-compose.yml
    services:
      omop-lite:
        image: ghcr.io/health-informatics-uon/omop-lite
        volumes:
          - ./data:/data
        depends_on:
          - db
      db:
        image: postgres:latest
        environment:
          - POSTGRES_DB=omop
          - POSTGRES_PASSWORD=password
        ports:
          - "5432:5432"

If you need synthetic data, some is provided in the `synthetic` directory. It provides a small amount of data to load quickly.
To load the synthetic data, run the container with the `SYNTHETIC` environment variable set to `true`.
This data only provides the following tables:

- CONCEPT
- CONDITION_OCCURRENCE
- MEASUREMENT
- OBSERVATION
- PERSON

You can provide your own data for loading into the tables by placing your files in the `data/` directory. This should contain `.csv` files matching the data tables (`DRUG_STRENGTH.csv`, `CONCEPT.csv`, etc.).
To match the vocabulary files from Athena, this data should be tab-separated, but with a `.csv` file extension.
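
Because the files are tab-separated despite the `.csv` extension, a quick check before loading can catch a comma-separated file early. The sketch below is one way to do that with `awk`; the helper name `check_tab_separated` and the file `EXAMPLE.csv` are hypothetical and only there for illustration, and the two-column sample is not a real OMOP table.

```shell
# Hypothetical helper: succeeds only if the file has at least two
# tab-separated columns and every row has as many fields as the header.
check_tab_separated() {
  awk -F'\t' '
    NR == 1 { n = NF }
    NF != n { bad = 1 }
    END     { exit (bad || n < 2) }
  ' "$1"
}

# Build a tiny illustrative file in a scratch directory (not real OMOP data).
scratch=$(mktemp -d)
printf 'person_id\tgender_concept_id\n1\t8507\n2\t8532\n' > "$scratch/EXAMPLE.csv"

if check_tab_separated "$scratch/EXAMPLE.csv"; then
  echo "EXAMPLE.csv looks tab-separated"
fi
```

To check your own files, call the helper on each one, for example `check_tab_separated data/CONCEPT.csv`. It only counts fields, so it does not validate column names or data types.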

The `setup.sh` script included in the Docker image will:

- Create the schema if it does not already exist.
- Execute the SQL files to set up the database schema, constraints, and indexes.
- Load data from the `.csv` files located in the `DATA_DIR`.
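
The steps above can be sketched as a dry run that prints the statements a loader of this kind might send to `psql`. This is not the image's actual `setup.sh`: the function name `emit_load_sql`, the use of the file name as the table name, and the `\copy` options are all assumptions made for illustration.

```shell
# Hypothetical dry run: prints SQL and psql commands instead of executing them.
# This is NOT the setup.sh shipped in the image.
SCHEMA_NAME="${SCHEMA_NAME:-omop}"
DATA_DIR="${DATA_DIR:-data}"

emit_load_sql() {
  # Step 1: create the schema only if it is missing.
  printf 'CREATE SCHEMA IF NOT EXISTS %s;\n' "$SCHEMA_NAME"

  # Step 2 (tables, constraints, indexes) is omitted here;
  # the image ships its own SQL files for that.

  # Step 3: one \copy per file, assuming the file name matches the table name
  # and the content is tab-separated with a header row.
  for file in "$DATA_DIR"/*.csv; do
    [ -e "$file" ] || continue   # skip when the glob matched nothing
    table=$(basename "$file" .csv)
    printf '%s\n' "\\copy ${SCHEMA_NAME}.${table} FROM '${file}' WITH (FORMAT csv, DELIMITER E'\\t', HEADER true)"
  done
}

emit_load_sql
```

Printing rather than executing keeps the sketch safe to run anywhere. In a real loader, the order in which tables are loaded may matter once foreign-key constraints are in place.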