NWM 1km LDAS kerchunked dataset #226

Open
wants to merge 7 commits into
base: master

rsignell-usgs
Contributor

First try at NWM 1km LDAS recipe

@rsignell-usgs
Contributor Author

rsignell-usgs commented Dec 1, 2022

@cisaacstern my recipe failed -- had a typo. I fixed it (pushed a change to my fork) but how do I kick off the checks again?

@cisaacstern
Member

@rsignell-usgs, thanks for this contribution!

Apologies for the confusion. The UI has changed, and is now presented as part of the PR checks:

[Screenshot of the PR checks panel, taken 2022-12-02]

A green check next to synchronize means you're good to go with running a test, which is now done by calling /run {recipe_id} in a comment, where {recipe_id} is the name of your recipe given in meta.yaml. The /run command is currently only activated for members of the pangeo-forge GitHub org. I'll call a test on this recipe now.

@cisaacstern
Member

/run NWM-2.1-grid1km-LDAS

@pangeo-forge
Contributor

pangeo-forge bot commented Dec 2, 2022

🎉 The test run of NWM-2.1-grid1km-LDAS at 7aafd19 succeeded!

import xarray as xr

store = "https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr"
ds = xr.open_dataset(store, engine='zarr', chunks={})
ds

@rsignell-usgs
Contributor Author

rsignell-usgs commented Dec 2, 2022

Hmm, the JSON contains only 2 time steps instead of the expected 2920 time steps.
https://ncsa.osn.xsede.org/Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json
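
For anyone wanting to repeat this check without opening the dataset, here is a minimal sketch of reading the time-axis length straight from a kerchunk reference set. It assumes the time coordinate is named `time` and that the file follows the usual layout (Zarr v2 metadata stored under keys such as `time/.zarray`, optionally nested under `refs`). The helper name `count_time_steps` and the tiny `example` dict are illustrative only, not part of the recipe.

```python
import json


def count_time_steps(reference: dict, time_var: str = "time") -> int:
    """Return the length of `time_var` recorded in a kerchunk reference set."""
    # Version 1 reference files nest the key/value map under "refs";
    # older files are a flat mapping, so fall back to the dict itself.
    refs = reference.get("refs", reference)
    zarray = refs[f"{time_var}/.zarray"]
    # Zarr metadata is normally stored as a JSON string, but may already be a dict.
    meta = json.loads(zarray) if isinstance(zarray, str) else zarray
    return int(meta["shape"][0])


# Small synthetic reference set standing in for the real reference.json
example = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "time/.zarray": json.dumps({"shape": [2], "chunks": [1], "dtype": "<i8"}),
    },
}
print(count_time_steps(example))  # 2
```

With the real file, the dict would come from `json.load` on the downloaded reference.json.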

@cisaacstern
Member

> Hmm, the JSON contains only 2 time steps instead of the expected 2920 time steps.

For the test on an unmerged PR, that is expected. We only run two time steps from /run, so that we can get a gist of the dataset without running the whole thing. The full temporal extent is run on PR merge.

Aside from that, does the data look as expected? If so, I'll merge!
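
To illustrate the difference between the test run and the full run, here is a rough sketch in plain Python. It assumes 3-hourly output, which is consistent with the 2920 figure quoted above for 2017 (365 days × 8 steps per day). The helpers `three_hourly_steps` and `prune` are hypothetical stand-ins for illustration, not the pangeo-forge implementation.

```python
from datetime import datetime, timedelta


def three_hourly_steps(start: datetime, end: datetime) -> list:
    """List time stamps from start (inclusive) to end (exclusive), every 3 hours."""
    steps = []
    t = start
    while t < end:
        steps.append(t)
        t += timedelta(hours=3)
    return steps


def prune(items: list, nkeep: int = 2) -> list:
    """Keep only the first `nkeep` items, as a test run on an unmerged PR would."""
    return items[:nkeep]


full = three_hourly_steps(datetime(2017, 1, 1), datetime(2018, 1, 1))
pruned = prune(full)
print(len(full), len(pruned))  # 2920 2
```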

@rsignell-usgs
Contributor Author

rsignell-usgs commented Dec 2, 2022

Yes, the data looks as expected except for the two time steps! We really want the entire simulation, not just 2017, so I just changed my file pattern to reflect that.

@rsignell-usgs
Contributor Author

/run NWM-2.1-grid1km-LDAS

@cisaacstern
Member

pre-commit.ci autofix

@cisaacstern
Member

/run NWM-2.1-grid1km-LDAS

@rsignell-usgs
Contributor Author

@cisaacstern, I didn't see any updated run here:

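import fsspec

# Anonymous S3 access to the OSN endpoint; skip the instance cache so a stale filesystem object is not reused
fs = fsspec.filesystem('s3', anon=True, skip_instance_cache=True,
        client_kwargs={'endpoint_url': 'https://ncsa.osn.xsede.org'})
# Inspect the object's metadata, including when it was last modified
fs.info('Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json')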

returns:

{'Key': 'Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json',
 'LastModified': datetime.datetime(2022, 12, 2, 16, 22, 22, 505000, tzinfo=tzutc()),
 'ETag': '"f1934363a439ef7f4d8d8c05a6d6ebcc"',
 'Size': 195714,
 'StorageClass': 'STANDARD',
 'type': 'file',
 'size': 195714,
 'name': 'Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json'}
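
The staleness check implied here can be made explicit by comparing `LastModified` with the time the re-run was requested. Below is a minimal sketch under that assumption. The helper `updated_since` and the cutoff date are hypothetical, and the dict is trimmed to the one field that matters.

```python
from datetime import datetime, timezone


def updated_since(info: dict, reference_time: datetime) -> bool:
    """True if the object described by `info` was modified after `reference_time`."""
    return info["LastModified"] > reference_time


# Trimmed copy of the fs.info() result shown above
info = {
    "LastModified": datetime(2022, 12, 2, 16, 22, 22, 505000, tzinfo=timezone.utc),
}

# Hypothetical cutoff: any moment after the first test run but before the re-run
rerun_requested = datetime(2022, 12, 8, tzinfo=timezone.utc)
print(updated_since(info, rerun_requested))  # False
```

Both datetimes must be timezone-aware for the comparison to work. The value returned by `fs.info` already is.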

@rsignell-usgs
Contributor Author

rsignell-usgs commented Dec 8, 2022

Also, just a note that the reference.yaml intake catalog being generated is not quite right.
Instead of:

sources:
  data:
    args:
      chunks: {}
      consolidated: false
      storage_options:
        fo: Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json
        remote_options:
          anon: true
        remote_protocol: s3
        skip_instance_cache: true
        target_options: {}
        target_protocol: s3
      urlpath: reference://
    description: ''
    driver: intake_xarray.xzarr.ZarrSource

It should be:

sources:
  data:
    args:
      chunks: {}
      urlpath: "reference://"
      consolidated: false
      storage_options:
        target_options:
          anon: true
          client_kwargs: {'endpoint_url': 'https://ncsa.osn.xsede.org'}
        fo: 's3://Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/NWM-2.1-grid1km-LDAS.zarr/reference.json'
        remote_options:
          anon: true
        remote_protocol: "s3"
    driver: intake_xarray.xzarr.ZarrSource
    description: ''
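
For reference, the corrected catalog entry maps onto plain fsspec/xarray arguments roughly as follows. This is a sketch, not the code pangeo-forge-recipes uses to write the catalog; the helper name `reference_storage_options` is made up for illustration. The key point it mirrors is that `target_options` (how to fetch reference.json itself) needs the OSN endpoint, while `remote_options` (how to fetch the referenced data chunks) only needs anonymous access.

```python
def reference_storage_options(reference_key: str, endpoint_url: str) -> dict:
    """Build storage options mirroring the corrected catalog entry above."""
    return {
        # Location of the reference file, with an explicit protocol
        "fo": f"s3://{reference_key}",
        # Options for reading the reference file from the OSN endpoint
        "target_options": {
            "anon": True,
            "client_kwargs": {"endpoint_url": endpoint_url},
        },
        # Options for reading the data chunks the references point to
        "remote_protocol": "s3",
        "remote_options": {"anon": True},
    }


opts = reference_storage_options(
    "Pangeo/pangeo-forge/test/pangeo-forge/staged-recipes/recipe-run-1393/"
    "NWM-2.1-grid1km-LDAS.zarr/reference.json",
    "https://ncsa.osn.xsede.org",
)

# With network access and xarray, fsspec, s3fs and zarr installed,
# the options could be used like this (not executed here):
#   import xarray as xr
#   ds = xr.open_dataset(
#       "reference://", engine="zarr", chunks={},
#       backend_kwargs={"consolidated": False, "storage_options": opts},
#   )
```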

@cisaacstern
Member

> @cisaacstern, I didn't see any updated run here:

Yes, thanks for the ping. Looks like something is failing silently before this job gets submitted. I'll take a look at the logs and get back to you on this.

> Also just a note that reference.yaml intake catalog being generated is not quite right.

@rsignell-usgs could you make an issue or PR for this on https://github.com/pangeo-forge/pangeo-forge-recipes? Looks like this is the relevant code.

@sharkinsspatial
Contributor

/run NWM-2.1-grid1km-LDAS

@rsignell-usgs
Contributor Author

pre-commit.ci autofix

@rsignell-usgs
Contributor Author

/run NWM-2.1-grid1km-LDAS
