GitHub - BLSQ/openhexa-pipelines-era5

This repository contains OpenHEXA ETL pipelines that ingest climate data from the ERA5-Land dataset in the Climate Data Store (CDS). They rely heavily on the openhexa.toolbox.era5 package (see the openhexa-toolbox repository for more information).

Three DAGs are available:

era5_extract: download/sync raw ERA5 hourly data from the CDS for a given area of interest
era5_aggregate: aggregate raw hourly data in space and time according to an input geographic file (e.g. administrative boundaries)
era5_import_dhis2: import ERA5 aggregated climate statistics into DHIS2 datasets

Documentation for each pipeline is available in its subdirectory.
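To make the idea of temporal aggregation concrete, here is a minimal sketch that groups hourly values by calendar day and computes daily statistics. It uses only the Python standard library and does not reflect the actual API of openhexa.toolbox.era5; the function name `aggregate_daily` and the sample values are hypothetical, chosen for illustration only.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_daily(hourly):
    """Group (timestamp, value) pairs by calendar day and summarize each day."""
    by_day = defaultdict(list)
    for timestamp, value in hourly:
        by_day[timestamp.date()].append(value)
    return {
        day: {"mean": mean(values), "min": min(values), "max": max(values)}
        for day, values in sorted(by_day.items())
    }


# Made-up hourly temperatures (degrees Celsius), for illustration only
sample = [
    (datetime(2024, 1, 1, 0), 20.0),
    (datetime(2024, 1, 1, 12), 30.0),
    (datetime(2024, 1, 2, 0), 22.0),
]
daily = aggregate_daily(sample)
```

The real pipeline also aggregates in space, using the boundaries from the input geographic file; that step is omitted here.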

Deployment

To deploy the pipelines to an OpenHEXA workspace, edit the .github/workflows/<pipeline_name>.yml file accordingly. The OpenHEXA workspace token must be stored as a GitHub Actions secret in the repository settings.

In the following example, the era5_extract pipeline is automatically deployed three times (once per climate variable) to each of three workspaces, for nine matrix entries in total:

jobs:
  deploy:
    strategy:
      matrix:
        pipeline: [
          {"workspace": "civ-data-integration-3cfb03", "pipeline_id": "era5_extract_temperature", "token": OH_TOKEN_CIV},
          {"workspace": "civ-data-integration-3cfb03", "pipeline_id": "era5_extract_precipitation", "token": OH_TOKEN_CIV},
          {"workspace": "civ-data-integration-3cfb03", "pipeline_id": "era5_extract_humidity", "token": OH_TOKEN_CIV},
          {"workspace": "bfa-malaria-data-reposi-b1b366", "pipeline_id": "era5_extract_temperature", "token": OH_TOKEN_BFA},
          {"workspace": "bfa-malaria-data-reposi-b1b366", "pipeline_id": "era5_extract_precipitation", "token": OH_TOKEN_BFA},
          {"workspace": "bfa-malaria-data-reposi-b1b366", "pipeline_id": "era5_extract_humidity", "token": OH_TOKEN_BFA},
          {"workspace": "niger-nmdr", "pipeline_id": "era5_extract_temperature", "token": OH_TOKEN_NER},
          {"workspace": "niger-nmdr", "pipeline_id": "era5_extract_precipitation", "token": OH_TOKEN_NER},
          {"workspace": "niger-nmdr", "pipeline_id": "era5_extract_humidity", "token": OH_TOKEN_NER}
        ]

New pipeline versions will be automatically deployed to the workspaces listed in the job matrix after each push to the main branch.
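The job matrix above is the cross product of workspaces and climate variables. When the list grows, it may be easier to generate the entries than to maintain them by hand. The following is a sketch of that idea, not part of the repository: the helper name `build_matrix` is hypothetical, and the workspace slugs and secret names are copied from the example.

```python
import json
from itertools import product

# Workspace slugs mapped to GitHub Actions secret names (from the example matrix)
WORKSPACES = {
    "civ-data-integration-3cfb03": "OH_TOKEN_CIV",
    "bfa-malaria-data-reposi-b1b366": "OH_TOKEN_BFA",
    "niger-nmdr": "OH_TOKEN_NER",
}
VARIABLES = ["temperature", "precipitation", "humidity"]


def build_matrix(workspaces, variables):
    """Return one matrix entry per (workspace, climate variable) pair."""
    return [
        {"workspace": ws, "pipeline_id": f"era5_extract_{var}", "token": token}
        for (ws, token), var in product(workspaces.items(), variables)
    ]


matrix = build_matrix(WORKSPACES, VARIABLES)
for entry in matrix:
    # One JSON object per line, ready to paste into the workflow file
    print(json.dumps(entry))
```

Each printed line is a JSON object, which is also a valid YAML flow mapping. Note that `json.dumps` quotes the secret name, whereas the example above leaves it unquoted; YAML reads both forms as the same string.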

Flow

graph TB
  A[(Climate Data Store)] --> B
  B["**ERA5 Extract**"] --> D[/"Gridded hourly data (GRIB2)"/]
  D --> E["**ERA5 Aggregate**"]
  E --> F[/"Aggregated data files (Parquet)"/]
  F --> G["**ERA5 Import DHIS2**"]
  G --> H[("DHIS2")]
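The diagram above implies a strict execution order: each pipeline consumes the output of the previous one. As an illustration only (this is not how OpenHEXA schedules pipelines), the same edges can be written as a dependency mapping and ordered with the standard library's `graphlib` module, available from Python 3.9:

```python
from graphlib import TopologicalSorter

# Each key depends on the nodes in its set (edges taken from the flow diagram)
dependencies = {
    "ERA5 Extract": {"Climate Data Store"},
    "Gridded hourly data (GRIB2)": {"ERA5 Extract"},
    "ERA5 Aggregate": {"Gridded hourly data (GRIB2)"},
    "Aggregated data files (Parquet)": {"ERA5 Aggregate"},
    "ERA5 Import DHIS2": {"Aggregated data files (Parquet)"},
    "DHIS2": {"ERA5 Import DHIS2"},
}

# static_order() yields every node after all of its dependencies
order = list(TopologicalSorter(dependencies).static_order())
for step in order:
    print(step)
```

Because the flow is a single chain, only one valid order exists: from the Climate Data Store through to DHIS2.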
