Tutorials + Use Cases #44

Open
97 tasks
jejjohnson opened this issue May 15, 2023 · 0 comments
Labels: documentation, enhancement

jejjohnson commented May 15, 2023

Here is a mega-list of use cases and tutorials we can do to populate the JupyterBook. We can break these tutorials down into 4 levels of granularity:

  1. Landing Page
  2. High-Level Usage
  3. Low-Level Usage
  4. Contributing

Level 0 - Landing Page

This is where users will enter the OceanBench package. We will:

  1. Describe what OceanBench is
  2. Demonstrate how to install it
  3. Describe how to get access to the data
  4. Showcase the tasks and leaderboards
  • What is OceanBench? (Diagram, few paragraphs from paper)
  • How to Install
  • How to download the data, little bit of dvc fluency
  • Getting Started
    • Task -> ML
    • Torch Integration of XRPatcher
    • Map -> Leaderboard
  • Showcase 4 Tasks & 4 LeaderBoards
  • Upgrading your OceanBench Fluency

Level I - High-Level Usage

In this section of tutorials, we look at some "out-of-the-box" solutions that OceanBench provides. This can be useful for people interested in piping preprocessed data for

Tutorials

  • Result (Map) -> Leaderboard
  • Task 2 XRPatcher

Bonus Tutorials

  • Reconstruct Sea Surface Currents from Altimetry Tracks
  • Forecasting
  • Learning a Latent Space
  • Super Resolution, Downscaling, Upsampling
  • Denoising

Use Cases

These are self-contained, reproducible use cases that use the out-of-the-box solutions to demonstrate how we can do SSH reconstructions.

  • Optimal Interpolation Example w/ AlongTrack Data
  • 4DVarNet Starter Example w/ Gridded Data
  • Neural Field Example w/ AlongTrack Data

Level II - Low-Level Usage

  1. Creating Your Own - Hydra Fluency
  • Simple Custom Pipeline - preprocessing, postprocessing, plotting
  2. Using preconfigured elements of OceanBench
  • fetching, reusing
  • Load the formatted data
    • AlongTrack
    • Gridded Data
  • Load my result with specific postprocessing - reference grid, resampling, evaluation period, domain, coordinate transform
  • Create an Evaluation dataset
  • Evaluation Maps
  • Plotting Maps
  • Computing Metrics - PSD Score, nRMSE
  • Apply Plotting Configuration - Real Space
  • Apply Plotting Configuration - Spectral Space

Modifying Existing

  • Create a new Task
    • New Data - changing inputs and/or outputs
    • Changing the domain - Med
    • Changing the evaluation Period - summer
  • Create a new metric - e.g. RMSE, (field, field -> scalar)
  • adding an additional processing step
    • Preprocessing - e.g. different target grid for XRPatcher
    • Plotting - e.g. filtering
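
As an illustration of the "(field, field -> scalar)" metric signature mentioned above, here is a minimal sketch in plain NumPy. The function names `rmse` and `nrmse_score` are hypothetical and are not taken from the OceanBench API, and the normalization `1 - RMSE / RMS(ref)` is one common convention that we assume here.

```python
import numpy as np


def rmse(pred, ref) -> float:
    """Root-mean-square error between two same-shaped fields, ignoring NaNs."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {ref.shape}")
    return float(np.sqrt(np.nanmean((pred - ref) ** 2)))


def nrmse_score(pred, ref) -> float:
    """Assumed normalization: 1 - RMSE / RMS(ref), so 1.0 is a perfect match."""
    ref = np.asarray(ref, dtype=float)
    rms_ref = float(np.sqrt(np.nanmean(ref ** 2)))
    return 1.0 - rmse(pred, ref) / rms_ref


# Toy fields: the NaN stands in for a masked (land) pixel.
ref = np.array([[1.0, 2.0], [3.0, np.nan]])
pred = ref + 0.5
print(rmse(pred, ref), nrmse_score(pred, ref))
```

A tutorial could then show how such a function is registered in a Hydra config so that it runs on the evaluation dataset.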

Use Cases

  • Spe

Level III - Contribution

How do you contribute? How do you add reusable blocks that everyone can use? (package conventions)

  • Add a new metric, e.g. wavelet decomposition integration
  • Add a new graphic, e.g. fancy strain histogram plot
  • Add drifter data (new data format)
  • New XRPatcher for different, e.g. SWOT Calibration
  • Global Custom Metrics, e.g. scale-wise, region-specific/dependent (OSE Global DC)


Pre-Processing

Everything related to accessing available datasets

  • Quick dvc tutorial: accessing all of the available datasets (summary table of available ones), downloading, quick plot.

XRPatcher Integration

Some tutorials for how to use XRPatcher and how to integrate it into some other frameworks.

Generic Formulations

  • 1D time series (time) + weighted reconstruction
  • 2D patches (lat, lon) + weighted reconstruction
  • 3D Field (lat, lon, depth/height)
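
To make the "1D time series + weighted reconstruction" item concrete, here is a rough NumPy-only sketch of the idea: cut overlapping windows, then average them back with a taper. It does not use the XRPatcher API. `extract_patches` and `reconstruct` are hypothetical helper names, and the Hann taper is just one possible weight.

```python
import numpy as np


def extract_patches(series, patch_size, stride):
    """Return (start_indices, patches) for overlapping windows over a 1D array."""
    starts = np.arange(0, len(series) - patch_size + 1, stride)
    patches = np.stack([series[s:s + patch_size] for s in starts])
    return starts, patches


def reconstruct(patches, starts, length, weight):
    """Weighted average of overlapping patches back onto the original axis."""
    num = np.zeros(length)
    den = np.zeros(length)
    for start, patch in zip(starts, patches):
        num[start:start + len(patch)] += weight * patch
        den[start:start + len(patch)] += weight
    # Samples never covered by a patch stay NaN instead of dividing by zero.
    return np.divide(num, den, out=np.full(length, np.nan), where=den > 0)


series = np.sin(np.linspace(0.0, 6.0, 20))
starts, patches = extract_patches(series, patch_size=8, stride=4)
# Interior of a Hann window: strictly positive, down-weights patch edges.
weight = np.hanning(8 + 2)[1:-1]
recon = reconstruct(patches, starts, len(series), weight)
print(np.allclose(recon, series))
```

The 2D (lat, lon) case follows the same pattern, with a 2D weight and two start indices per patch.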

PyTorch Integration

  • Simple Usage - Gridded
  • Simple Usage - AlongTrack + Concat Multiple Datasets

Hydra Usage

Full Walkthrough

Go through each part step by step

  • Pipe
  • Cross Referencing
  • Overriding
  • Scripts

0-100 with Hydra

This is how to extend OceanBench with new stuff.

  • Changing something in the preprocessing (e.g. coordinate-based)
  • Adding a new metric
  • Adding a new plot
  • Adding a new method
  • Adding a new task

Preprocessing/GeoProcessing Pipeline

  • Open 1 file + Do 1 pipeline (xr.open_dataset)
  • Open Multiple files + Do 1 pipeline (xr.open_mfdataset, preprocess=)
  • Multiple files + Multiple Pipelines (xr.open_mfdataset, preprocess=)
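
A possible shape for these three cases, sketched under assumptions: a pipeline is a chain of single-argument functions, and it is passed to `xr.open_mfdataset` through its `preprocess=` argument. The helpers `compose` and `open_with_pipeline`, the example steps, the variable name `sossheig`, and the region bounds are all hypothetical.

```python
from functools import reduce


def compose(*steps):
    """Chain single-argument steps left to right into one callable."""
    def pipeline(ds):
        return reduce(lambda acc, step: step(acc), steps, ds)
    return pipeline


def open_with_pipeline(paths, *steps):
    """Open one or many files and apply the same pipeline to each file.

    xarray is imported here so that `compose` stays usable without it.
    """
    import xarray as xr

    # `preprocess` is called on each file's dataset before they are combined.
    return xr.open_mfdataset(paths, preprocess=compose(*steps), combine="by_coords")


# Hypothetical steps, for illustration only.
def rename_ssh(ds):
    return ds.rename({"sossheig": "ssh"})  # assumed source variable name


def select_region(ds):
    return ds.sel(lon=slice(-65, -55), lat=slice(33, 43))  # assumed bounds


# The composition itself works on any object, not only datasets.
print(compose(lambda x: x + 1, lambda x: x * 2)(3))
```

For the "multiple files + multiple pipelines" case, one option is to call `open_with_pipeline` once per group of files with different steps and merge the results afterwards.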

Evaluation Pipeline

  • Metrics - reference file
  • Metrics - reference + comparison file

Visualization

  • Plots - Single Variable (SSH map)
  • Plots - Multiple Variables (PSD)

LeaderBoard

  • Data Challenge 2020a - Gulfstream - OSSE - NEMO
  • Data Challenge 2021a - Gulfstream - OSE - Altimetry

Evaluation Metrics

Some standard statistics that one typically uses when evaluating fields for SSH. We typically have two types: 1) gridded and 2) along-track. We also need some standard preprocessing steps for some of the metrics:

  • Regridded
  • LatLon degrees -> meters
  • Time -> days (or whatever unit)
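
The two unit conversions above could look roughly like the following. This is a sketch and not the OceanBench implementation. It assumes a spherical Earth with a local equirectangular projection around a reference point, which is only reasonable for regional domains. The function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def latlon_to_meters(lat, lon, lat0, lon0):
    """Local tangent-plane (equirectangular) offsets in meters from (lat0, lon0)."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    x = EARTH_RADIUS_M * np.deg2rad(lon - lon0) * np.cos(np.deg2rad(lat0))
    y = EARTH_RADIUS_M * np.deg2rad(lat - lat0)
    return x, y


def time_to_days(time, t0):
    """Elapsed time since t0 as floating-point days."""
    delta = np.asarray(time, dtype="datetime64[ns]") - np.datetime64(t0, "ns")
    return delta / np.timedelta64(1, "D")


x, y = latlon_to_meters(lat=[33.0, 34.0], lon=[-65.0, -65.0], lat0=33.0, lon0=-65.0)
days = time_to_days(["2012-10-01T00:00", "2012-10-02T12:00"], t0="2012-10-01T00:00")
print(y, days)
```

Working in meters and days keeps spectral metrics in physical units such as cycles per kilometer or cycles per day.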

Gridded

  • Skill Scores
  • Power Spectrum + Score (Isotropic)
  • Power Spectrum + Score (SpaceTime)

AlongTrack

  • Skill Scores
  • Power Spectrum + Score
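
For the along-track "Power Spectrum + Score" item, a minimal 1D sketch is shown here. It assumes evenly sampled tracks and defines the score as `1 - PSD(error) / PSD(reference)` per wavenumber, which is our assumption about the intended score. A plain periodogram stands in for whatever spectral estimator the package actually uses, and the function names are hypothetical.

```python
import numpy as np


def psd_1d(signal, dx):
    """Periodogram (up to a constant factor) of an evenly sampled 1D signal."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    spectrum = np.fft.rfft(signal - signal.mean())
    freq = np.fft.rfftfreq(n, d=dx)
    psd = (np.abs(spectrum) ** 2) * dx / n
    return freq[1:], psd[1:]  # drop the zero-frequency bin


def psd_score(pred, ref, dx):
    """Score = 1 - PSD(pred - ref) / PSD(ref), per wavenumber."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    freq, psd_err = psd_1d(pred - ref, dx)
    _, psd_ref = psd_1d(ref, dx)
    return freq, 1.0 - psd_err / psd_ref


# Synthetic "track": a random walk, which has energy at every wavenumber.
rng = np.random.default_rng(0)
ref = np.cumsum(rng.normal(size=256))
dx = 7.0  # assumed along-track spacing in km, for illustration only
freq, score_perfect = psd_score(ref, ref, dx)
_, score_empty = psd_score(np.zeros_like(ref), ref, dx)
print(score_perfect.min(), score_empty.max())
```

A score near 1 means the error carries little energy at that scale. The wavelength at which the score drops below a threshold such as 0.5 is often reported as the resolved scale.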

Visualizations

Showcase some staple visualisations for evaluations

  • Using Hydra for Custom Visualisations
    • Temporal Statistics (e.g. nRMSE)
    • Maps, e.g. SSA, SLA, Kinetic Energy
    • Power Spectrum (Isotropic)
    • Power Spectrum Score (Isotropic)
  • Temporal Statistics (nRMSE, Energy, Enstrophy)
  • Derived Variables
    • SSH, SLA, u, v, kinetic energy, relative vorticity, absolute vorticity, enstrophy, strain, Okubo-Weiss
  • Power Spectrum + Score (Isotropic)
  • Power Spectrum + Score (SpaceTime)
  • Animated Maps (gifs + videos - movie)
  • Animated Maps (hvplot, geoviews)
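
Several of the derived variables listed above can be computed from the velocity components alone. The sketch below assumes `u` and `v` are 2D arrays shaped (y, x) on a regular grid with spacing in meters, and it uses centered finite differences via `np.gradient`. The function name and the exact conventions (for example enstrophy as half the squared vorticity) are assumptions, not the package's definitions.

```python
import numpy as np


def derived_variables(u, v, dx, dy):
    """Diagnostics from velocity fields shaped (y, x) on a regular grid in meters."""
    # np.gradient returns one array per axis: axis 0 is y, axis 1 is x.
    du_dy, du_dx = np.gradient(u, dy, dx)
    dv_dy, dv_dx = np.gradient(v, dy, dx)
    vorticity = dv_dx - du_dy
    normal_strain = du_dx - dv_dy
    shear_strain = dv_dx + du_dy
    strain_sq = normal_strain**2 + shear_strain**2
    return {
        "kinetic_energy": 0.5 * (u**2 + v**2),
        "relative_vorticity": vorticity,
        "enstrophy": 0.5 * vorticity**2,
        "strain": np.sqrt(strain_sq),
        "okubo_weiss": strain_sq - vorticity**2,
    }


# Solid-body rotation: vorticity is 2 * omega everywhere and strain vanishes.
y, x = np.meshgrid(np.arange(5) * 1000.0, np.arange(6) * 1000.0, indexing="ij")
omega = 1e-5
out = derived_variables(-omega * y, omega * x, dx=1000.0, dy=1000.0)
print(out["relative_vorticity"].mean())
```

Absolute vorticity would additionally need the Coriolis parameter at each latitude, which is left out here.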

Data Challenges

A quick rundown and specific details related to the data challenges. It should include:

  • Problem settings and experimental setup
  • How to access the data via the registry
  • A simple preprocessing routine (alongtrack, grid)
  • An evaluation script for a demo dataset (DUACS, 4DVarNet)
  • 2020a - Gulfstream - OSSE - NEMO
  • 2021a - Gulfstream - OSE
  • 2022a - Gulfstream - OSSE - QG
  • 2023a - Mediterranean - OSE
  • 2023b - Global - OSE

Machine Learning Demos

Some start-to-finish examples of how OceanBench can be used. Example applications include:

  • Sea Surface Height Interpolation
  • Forecasting
  • Dimension Reduction / Reconstruction
  • Surrogate Modeling

SSH Interpolation

Some simple demonstrations from the data challenge

@jejjohnson jejjohnson changed the title Use Cases Tutorials + Use Cases May 24, 2023
@jejjohnson jejjohnson added documentation Improvements or additions to documentation enhancement New feature or request labels May 24, 2023
@jejjohnson jejjohnson added this to the Viz milestone May 24, 2023