Merge branch 'develop' of github.com:ecmwf/anemoi-datasets into develop
b8raoult committed Oct 15, 2024
2 parents 72b266c + 1d96021 commit eb63372
Showing 24 changed files with 468 additions and 101 deletions.
6 changes: 3 additions & 3 deletions .github/CODEOWNERS
@@ -1,6 +1,6 @@
# CODEOWNERS file

# Protect workflow files
/.github/ @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb
/.pre-commit-config.yaml @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb
/pyproject.toml @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb
/.github/ @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb @anaprietonem @HCookie @JPXKQX @mchantry
/.pre-commit-config.yaml @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb @anaprietonem @HCookie @JPXKQX @mchantry
/pyproject.toml @theissenhelen @jesperdramsch @gmertes @b8raoult @floriankrb @anaprietonem @HCookie @JPXKQX @mchantry
20 changes: 6 additions & 14 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -9,12 +9,15 @@ assignees: ''
**Describe the bug**
A clear and concise description of what the bug is.

** Version number **
I am using the following versions/branch/sha1 of the anemoi packages
(alternatively the output of `pip freeze`)

**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
2. Run this '....'
3. See error

**URL to sample input data**
Provide a URL to a sample input data, or attach a file to that report if it is small enough.
@@ -25,16 +28,5 @@ A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]

**Smartphone (please complete the following information):**
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]

**Additional context**
Add any other context about the problem here.
6 changes: 0 additions & 6 deletions .github/ci-config.yml
@@ -1,9 +1,3 @@
dependencies: |
ecmwf/ecbuild
MathisRosenhauer/libaec@master
ecmwf/eccodes
ecmwf/eckit
ecmwf/odc
dependency_branch: develop
parallelism_factor: 8
self_build: false # Only for python packages
9 changes: 0 additions & 9 deletions .github/ci-hpc-config.yml
@@ -2,17 +2,8 @@ build:
python: '3.10'
modules:
- ninja
dependencies:
- ecmwf/ecbuild@develop
- ecmwf/eccodes@develop
- ecmwf/eckit@develop
- ecmwf/odc@develop
python_dependencies:
- ecmwf/anemoi-utils@develop
- ecmwf/earthkit-data@develop
- ecmwf/earthkit-meteo@develop
- ecmwf/earthkit-geo@develop
parallel: 64

pytest_cmd: |
python -m pytest -vv -m 'not notebook and not no_cache_init' --cov=. --cov-report=xml
20 changes: 19 additions & 1 deletion CHANGELOG.md
@@ -10,13 +10,31 @@ Keep it human-readable, your future self will thank you!

## [Unreleased](https://github.com/ecmwf/anemoi-datasets/compare/0.5.7...HEAD)


### Added

- Add anemoi-transform link to documentation
- Various bug fixes
- Control compatibility check in xy/zip
- Add `merge` feature

### Changed

- Remove upstream dependencies from downstream-ci workflow (temporary) (#83)

## [0.5.7](https://github.com/ecmwf/anemoi-datasets/compare/0.5.6...0.5.7) - 2024-10-09

### Changed

- Add support to fill missing dates

## [Allow for unknown CF coordinates](https://github.com/ecmwf/anemoi-datasets/compare/0.5.5...0.5.6) - 2024-10-04

- Update documentation
### Changed

- Add `variables_metadata` entry in the dataset metadata
- Update documentation

### Changed

- Add `variables_metadata` entry in the dataset metadata
4 changes: 4 additions & 0 deletions docs/conf.py
@@ -107,6 +107,10 @@
"https://anemoi-registry.readthedocs.io/en/latest/",
("../../anemoi-registry/docs/_build/html/objects.inv", None),
),
"anemoi-transform": (
"https://anemoi-transform.readthedocs.io/en/latest/",
("../../anemoi-transform/docs/_build/html/objects.inv", None),
),
}


1 change: 1 addition & 0 deletions docs/index.rst
@@ -121,6 +121,7 @@ datasets <building-introduction>`.
*****************

- :ref:`anemoi-utils <anemoi-utils:index-page>`
- :ref:`anemoi-transform <anemoi-transform:index-page>`
- :ref:`anemoi-datasets <anemoi-datasets:index-page>`
- :ref:`anemoi-models <anemoi-models:index-page>`
- :ref:`anemoi-graphs <anemoi-graphs:index-page>`
1 change: 1 addition & 0 deletions docs/using/code/fill_missing_dates1_.py
@@ -0,0 +1 @@
ds = open_dataset(dataset, fill_missing_dates="interpolate")
1 change: 1 addition & 0 deletions docs/using/code/fill_missing_dates2_.py
@@ -0,0 +1 @@
ds = open_dataset(dataset, fill_missing_dates="closest")
1 change: 0 additions & 1 deletion docs/using/code/missing_dates_.py

This file was deleted.

1 change: 1 addition & 0 deletions docs/using/code/set_missing_dates_.py
@@ -0,0 +1 @@
ds = open_dataset(dataset, set_missing_dates=["2010-01-01T12:00:00", "2010-02-01T12:00:00"])
23 changes: 21 additions & 2 deletions docs/using/missing.rst
@@ -4,6 +4,25 @@
Managing missing dates
########################

**************************************************
Filling the missing dates with artificial values
**************************************************

When you have missing dates in a dataset, you can fill them with
artificial values. You can either fill them with values that are the
result of a linear interpolation between the two closest dates:

.. literalinclude:: code/fill_missing_dates1_.py

Or you can copy the value of the closest date:

.. literalinclude:: code/fill_missing_dates2_.py

If the missing date is exactly in the middle of two dates, the library
will choose the value of the later date. You can change this behavior
by setting the ``closest`` parameter to ``'down'`` or ``'up'``
explicitly.

************************************************
Skipping missing when iterating over a dataset
************************************************
@@ -72,7 +91,7 @@ the datasets to make the dates contiguous.
Debugging
***********

You can set missing dates using the ``missing_dates`` option. This
You can set missing dates using the ``set_missing_dates`` option. This
option is for debugging purposes only.

.. literalinclude:: code/missing_dates_.py
.. literalinclude:: code/set_missing_dates_.py
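
The two filling strategies documented in the missing.rst change above can be illustrated without the library. The following is a minimal sketch, not the anemoi-datasets implementation: the helper names `fill_interpolate` and `fill_closest` and the `tie` argument are hypothetical, and the values are plain floats rather than fields.

```python
from datetime import datetime


def fill_interpolate(t, t0, v0, t1, v1):
    """Linearly interpolate a value at time t between (t0, v0) and (t1, v1)."""
    alpha = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return (1 - alpha) * v0 + alpha * v1


def fill_closest(t, t0, v0, t1, v1, tie="up"):
    """Copy the value of the nearest date.

    On an exact tie, "up" takes the later date and "down" the earlier one
    (hypothetical argument mirroring the documented 'up'/'down' choice).
    """
    before, after = t - t0, t1 - t
    if before < after:
        return v0
    if after < before:
        return v1
    return v1 if tie == "up" else v0


t0 = datetime(2010, 1, 1, 0)
t1 = datetime(2010, 1, 1, 12)
missing = datetime(2010, 1, 1, 6)  # exactly halfway between t0 and t1

interpolated = fill_interpolate(missing, t0, 10.0, t1, 20.0)
closest_default = fill_closest(missing, t0, 10.0, t1, 20.0)
closest_down = fill_closest(missing, t0, 10.0, t1, 20.0, tie="down")
```

In the library itself, the strategy is selected through the `fill_missing_dates` option of `open_dataset`, as shown in the two new snippets under docs/using/code.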
33 changes: 5 additions & 28 deletions pyproject.toml
@@ -50,7 +50,7 @@ dynamic = [
"version",
]
dependencies = [
"anemoi-utils[provenance]>=0.3.15",
"anemoi-utils[provenance]>=0.3.18",
"cfunits",
"numpy",
"pyyaml",
@@ -60,43 +60,20 @@ dependencies = [
]

optional-dependencies.all = [
"boto3",
"earthkit-data[mars]>=0.9",
"earthkit-geo>=0.2",
"earthkit-meteo",
"ecmwflibs>=0.6.3",
"entrypoints",
"gcsfs",
"kerchunk",
"pyproj",
"requests",
"anemoi-datasets[create,remote,xarray]",
]

optional-dependencies.create = [
"earthkit-data[mars]>=0.9",
"earthkit-data[mars]>=0.10.7",
"earthkit-geo>=0.2",
"earthkit-meteo",
"ecmwflibs>=0.6.3",
"eccodes>=2.38.1",
"entrypoints",
"pyproj",
]

optional-dependencies.dev = [
"boto3",
"earthkit-data[mars]>=0.9",
"earthkit-geo>=0.2",
"earthkit-meteo",
"ecmwflibs>=0.6.3",
"entrypoints",
"gcsfs",
"kerchunk",
"nbsphinx",
"pandoc",
"pyproj",
"pytest",
"requests",
"sphinx",
"sphinx-rtd-theme",
"anemoi-datasets[all,docs,tests]",
]

optional-dependencies.docs = [
13 changes: 8 additions & 5 deletions src/anemoi/datasets/create/functions/filters/rename.py
@@ -56,11 +56,14 @@ def __init__(self, field, format):
self.format = format
self.bits = re.findall(r"{(\w+)}", format)

def metadata(self, key, **kwargs):
value = self.field.metadata(key, **kwargs)
if "{" + key + "}" in self.format:
bits = {b: self.field.metadata(b, **kwargs) for b in self.bits}
return self.format.format(**bits)
def metadata(self, *args, **kwargs):
value = self.field.metadata(*args, **kwargs)
if args:
assert len(args) == 1
key = args[0]
if "{" + key + "}" in self.format:
bits = {b: self.field.metadata(b, **kwargs) for b in self.bits}
return self.format.format(**bits)
return value

def __getattr__(self, name):
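
The rename.py change above replaces the required positional `key` with `*args`, so the wrapper no longer fails when `metadata()` is called without a key. A minimal sketch of that behaviour follows. `FakeField` and `RenamedField` are hypothetical names used only for illustration (the real class name is not visible in this hunk), and `FakeField` only mimics dictionary-style lookups.

```python
import re


class FakeField:
    """Hypothetical stand-in for a field; only mimics metadata() lookups."""

    def __init__(self, md):
        self._md = md

    def metadata(self, *args, **kwargs):
        if not args:
            return dict(self._md)  # no key given: return the whole mapping
        return self._md[args[0]]


class RenamedField:
    """Wraps a field and rewrites any key that appears in the format string."""

    def __init__(self, field, format):
        self.field = field
        self.format = format
        self.bits = re.findall(r"{(\w+)}", format)

    def metadata(self, *args, **kwargs):
        value = self.field.metadata(*args, **kwargs)
        if args:
            assert len(args) == 1
            key = args[0]
            if "{" + key + "}" in self.format:
                # Build the new value from every key named in the format.
                bits = {b: self.field.metadata(b, **kwargs) for b in self.bits}
                return self.format.format(**bits)
        return value


field = RenamedField(
    FakeField({"param": "t", "levelist": 850, "date": 20200101}),
    "{param}_{levelist}",
)
renamed = field.metadata("param")
untouched = field.metadata("date")
everything = field.metadata()
```

With the previous signature, the last call would have raised a `TypeError` because `key` was mandatory.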
Original file line number Diff line number Diff line change
@@ -370,12 +370,15 @@ def accumulations(context, dates, **request):

user_accumulation_period = request.pop("accumulation_period", 6)

# If `data_accumulation_period` is not set, this means that the accumulations are from the start
# of the forecast.

KWARGS = {
("od", "oper"): dict(patch=_scda),
("od", "elda"): dict(base_times=(6, 18)),
("ea", "oper"): dict(data_accumulation_period=1, base_times=(6, 18)),
("ea", "enda"): dict(data_accumulation_period=3, base_times=(6, 18)),
("rr", "oper"): dict(data_accumulation_period=3, base_times=(0, 3, 6, 9, 12, 15, 18, 21)),
("rr", "oper"): dict(base_times=(0, 3, 6, 9, 12, 15, 18, 21)),
("l5", "oper"): dict(data_accumulation_period=1, base_times=(0,)),
}

16 changes: 1 addition & 15 deletions src/anemoi/datasets/create/input/__init__.py
@@ -6,23 +6,9 @@
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction.
#
import datetime
import itertools

import logging
import math
import time
from collections import defaultdict
from copy import deepcopy
from functools import cached_property
from functools import wraps

import numpy as np
from anemoi.utils.dates import as_datetime as as_datetime
from anemoi.utils.dates import frequency_to_timedelta as frequency_to_timedelta

from anemoi.datasets.dates import DatesProvider as DatesProvider
from anemoi.datasets.fields import FieldArray as FieldArray
from anemoi.datasets.fields import NewValidDateTimeField as NewValidDateTimeField

from .trace import trace_select

38 changes: 35 additions & 3 deletions src/anemoi/datasets/create/input/result.py
@@ -33,9 +33,38 @@
def _fields_metatata(variables, cube):
assert isinstance(variables, tuple), variables

def _merge(md1, md2):
assert set(md1.keys()) == set(md2.keys()), (set(md1.keys()), set(md2.keys()))
result = {}
for k in md1.keys():
v1 = md1[k]
v2 = md2[k]

if v1 == v2:
result[k] = v1
continue

if isinstance(v1, list):
assert v2 not in v1, (v1, v2)
result[k] = sorted(v1 + [v2])
continue

if isinstance(v2, list):
assert v1 not in v2, (v1, v2)
result[k] = sorted(v2 + [v1])
continue

result[k] = sorted([v1, v2])

return result

result = {}
for i, c in enumerate(cube.iterate_cubelets()):
assert c._coords_names[1] == variables[i], (c._coords_names[1], variables[i])
i = -1
for c in cube.iterate_cubelets():

if i == -1 or c._coords_names[1] != variables[i]:
i += 1

f = cube[c.coords]
md = f.metadata(namespace="mars")
if not md:
@@ -49,7 +78,10 @@ def _fields_metatata(variables, cube):
md["param"] = str(f.metadata("paramId", default="unknown"))
# assert md['param'] != 'unknown', (md, f.metadata('param'))

result[variables[i]] = md
if variables[i] in result:
result[variables[i]] = _merge(md, result[variables[i]])
else:
result[variables[i]] = md

assert i + 1 == len(variables), (i + 1, len(variables))
return result
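
The new `_merge` helper in result.py combines the metadata of several cubelets that share a variable name, collecting differing values into a sorted list. The sketch below restates that logic as a standalone function so its effect can be tried in isolation. The name `merge_metadata` and the sample dictionaries are hypothetical; the argument order in the final call mirrors the diff, where the new metadata comes first and the accumulated result second.

```python
def merge_metadata(md1, md2):
    """Merge two metadata dicts with identical keys, key by key."""
    assert set(md1) == set(md2), (set(md1), set(md2))
    result = {}
    for k in md1:
        v1, v2 = md1[k], md2[k]
        if v1 == v2:
            result[k] = v1  # identical values are kept as-is
        elif isinstance(v1, list):
            assert v2 not in v1, (v1, v2)
            result[k] = sorted(v1 + [v2])
        elif isinstance(v2, list):
            assert v1 not in v2, (v1, v2)
            result[k] = sorted(v2 + [v1])
        else:
            result[k] = sorted([v1, v2])  # two differing scalars become a list
    return result


a = {"param": "t", "levelist": 500}
b = {"param": "t", "levelist": 850}
c = {"param": "t", "levelist": 1000}

# Fold the dicts in one at a time: new metadata first, accumulated result second.
merged = merge_metadata(c, merge_metadata(b, a))
```

As in the diff, merging a scalar that is already present in an accumulated list trips the assertion rather than being silently deduplicated.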
1 change: 1 addition & 0 deletions src/anemoi/datasets/create/input/step.py
@@ -59,6 +59,7 @@ def select(self, group_of_dates):
)

def __repr__(self):
# raise NotImplementedError(f"Not implemented in {self.__class__.__name__}")
return super().__repr__(self.previous_step, _inline_=str(self.kwargs))


1 change: 1 addition & 0 deletions src/anemoi/datasets/data/concat.py
@@ -148,6 +148,7 @@ def concat_factory(args, kwargs):

datasets = kwargs.pop("concat")
fill_missing_gaps = kwargs.pop("fill_missing_gaps", False)

assert isinstance(datasets, (list, tuple))
assert len(args) == 0
