Commit 56dc138: update posts
LBerth committed Oct 21, 2024
Showing 2 changed files with 81 additions and 24 deletions.

_posts/2024-10-21-py4cast-titan-neural-network-comparison.md
author:
- Léa Berthomier
---

# Experimental reports

We present here a report on our first campaign of experiments to build a weather forecasting model with Deep Learning.

## Choice of neural network architecture
Our first goal was to train and compare several neural network architectures and assess their differences, all things being equal.
## Setup

### Dataset: TITAN

- **Source**: only **AROME Analyses** (no data from the Arpege coupling model)
- **Resolution**: 2.5km
- **Historical Data**: 2021-2023 (training: 2021-2022, testing: 2023)
- **Time Step**: 1 Hour

**Note**: Precipitation is the only parameter that is not an analysis. It is an AROME forecast made every hour, predicting the cumulative precipitation in mm for the next hour. In the future, we aim to use higher quality expertized radar data.


### Training Methodology

- **Training strategy**: Trained to make 1-hour forecasts, in scaled steps: y_pred = x + f(x) * step_diff_std + step_diff_mean
- **Inference Time**: Less than one minute on CPU for +12h forecasts
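In code, the scaled-step training strategy above amounts to predicting a normalized one-hour increment and de-normalizing it. A minimal sketch, assuming `f` stands for the trained network and the step-difference statistics (`step_diff_mean`, `step_diff_std`) are computed per parameter on the training set:

```python
import numpy as np

def predict_next_hour(x, f, step_diff_mean, step_diff_std):
    """One-hour forecast: the network f predicts the normalized
    state increment, which is rescaled and added to the input x."""
    return x + f(x) * step_diff_std + step_diff_mean

# Toy check with a stand-in "network" that predicts a zero increment:
x = np.ones((4, 4))
f = lambda state: np.zeros_like(state)
y_pred = predict_next_hour(x, f, step_diff_mean=0.5, step_diff_std=2.0)
print(y_pred[0, 0])  # → 1.5
```

Longer lead times such as +12h forecasts can then be produced autoregressively, feeding `y_pred` back in as the next input.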


## Results

Both in metrics and in forecast case studies, the modified and optimized UNetR++ model proved to be the best of the tested models.

The HiLAM model, on the other hand, was costly to train, with lower scores and lower-resolution forecasts.


**Table: training time, comments (config + link to file), smoothed forecasts, final loss on the test set, RAM used, batch size**

**Add citation to the main README**

| Model | Configuration | Training Time | Batch Size | Used RAM | Final loss on test set | Comments |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| UNetR++ | AROME | EURW1S100 (1.3 km) | 10m | U, V |


### Scores

TODO: add a graph with RMSE per time step for a few surface and atmospheric parameters


### Animations

Here we present some forecast animations of these two models, compared to AROME and AROME analyses (our "ground truth" here) on a case study for 5 surface parameters.

On 2023/06/18 at 12h UTC, a warm and unstable southwesterly flow favored the development of strong thunderstorms over parts of France. The north of the country was swept by wind gusts between 90 and 110 km/h. Some supercells formed over the center and then the southwest of the country, at times generating heavy hail.

**INSERT GIFS**


## Analysis & Perspectives

- **Initial experiments** show that it is possible to train a neural network model to provide forecasts at the scale of France.
_posts/2024-10-21-titan.md
date: 2024-10-21
categories: [Weather Forecasting, Machine Learning, Python, Dataset]
author:
- Léa Berthomier
- Frank Guibert
---


# TITAN: Training Inputs & Targets from Arome for Neural networks

TITAN is a dataset built to train AI weather forecasting models over France.


## Data

- **Sources**: analyses and forecasts from the AROME and ARPEGE NWP models
- **Time Step**: 1 hour
- **Depth**: 5 years
- **Format**: NPY
- **Resolution**: 2.5km
- **Historical Data**: 2021-2023 (training: 2021-2022, testing: 2023)
- **4 Variables at 4 Vertical Levels**:
- 850, 700, 500, 250 hPa
- T, U, V, Z
- **4 Forcing Fields**: Model inputs
- Cosine and sine of the time of day
- Cosine and sine of the day of the year
- **Training Samples**: ~16,000 pairs (t0, t+1)
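The four forcing fields above depend only on the timestamp, so they can be computed for any lead time. A minimal sketch of the cyclic encoding (the function name and the 365.25-day year length are our assumptions, not from the dataset):

```python
import math
from datetime import datetime

def forcing_fields(t: datetime):
    """Cosine/sine encodings of the time of day and the day of the
    year, given to the model as extra, always-known inputs."""
    hour_angle = 2 * math.pi * (t.hour + t.minute / 60) / 24
    doy_angle = 2 * math.pi * (t.timetuple().tm_yday - 1) / 365.25
    return [math.cos(hour_angle), math.sin(hour_angle),
            math.cos(doy_angle), math.sin(doy_angle)]

print(forcing_fields(datetime(2021, 1, 1, 0, 0)))  # → [1.0, 0.0, 1.0, 0.0]
```

The paired cosine/sine encoding keeps adjacent times close in feature space across the wrap-around (23h to 0h, 31 December to 1 January).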

**Note**: Precipitation is the only parameter that is not an analysis. It is an AROME forecast made every hour, predicting the cumulative precipitation in mm for the next hour. In the future, we aim to use higher quality expertized radar data.


## Download

For now, 3 days of data are stored on [HuggingFace](https://huggingface.co/datasets/meteofrance/titan).

To download:

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install

git clone https://huggingface.co/datasets/meteofrance/titan
```


## Details on available parameters

Data is grouped into one folder per hour. Each folder contains XX .npy files, one per weather parameter and level.
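A single hourly sample can then be loaded with NumPy. A minimal sketch (the helper name and the example path are ours, not part of the dataset):

```python
from pathlib import Path
import numpy as np

def load_hour(folder):
    """Load one hourly sample: each .npy file in the folder holds
    the grid for one weather parameter and level."""
    return {p.stem: np.load(p) for p in sorted(Path(folder).glob("*.npy"))}

# e.g. fields = load_hour("titan/2023-01-01_00h00")
# fields["aro_t2m_2m"] would then hold the 2 metre temperature grid.
```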


| File name | Name | Unit | Source NWP Model | Levels |
| :---: | :---: | :---: | :---: | :---: |
| aro_t2m_2m | 2 metre temperature | K | Arome | 2m |
| aro_r2_2m | 2 metre relative humidity | % | Arome | 2m |
| aro_tp_0m | Total precipitation | kg m**-2 | Arome | 0m |
| aro_u10_10m | 10 metre U wind component | m s**-1 | Arome | 10m |
| aro_v10_10m | 10 metre V wind component | m s**-1 | Arome | 10m |
| aro_t_XhPa | Temperature | K | Arome | 250, 500, 700, 850 hPa |
| aro_u_XhPa | U component of wind | m s**-1 | Arome | 250, 500, 700, 850 hPa |
| aro_v_XhPa | V component of wind | m s**-1 | Arome | 250, 500, 700, 850 hPa |
| aro_z_XhPa | Geopotential | m**2 s**-2 | Arome | 250, 500, 700, 850 hPa |
| aro_r_XhPa | Relative humidity | % | Arome | 250, 500, 700, 850 hPa |
| arp_t_XhPa | Temperature | K | Arpege | 250, 500, 700, 850 hPa |
| arp_u_XhPa | U component of wind | m s**-1 | Arpege | 250, 500, 700, 850 hPa |
| arp_v_XhPa | V component of wind | m s**-1 | Arpege | 250, 500, 700, 850 hPa |
| arp_z_XhPa | Geopotential | m**2 s**-2 | Arpege | 250, 500, 700, 850 hPa |
| arp_r_XhPa | Relative humidity | % | Arpege | 250, 500, 700, 850 hPa |


## Storage

* Size of the **compressed** 5-year archive: ~ XX GB

* Size of the **uncompressed** 5-year archive: ~ XX GB

## TODO

* add AROME humidity on all levels
* put the whole dataset on HuggingFace in NPY format
* make a plot with all parameters
