Commit 56dc138: update posts
LBerth committed Oct 21, 2024
Showing 2 changed files with 81 additions and 24 deletions.

_posts/2024-10-21-py4cast-titan-neural-network-comparison.md
author:
- Léa Berthomier
---

# Experimental reports

We present here a report on our first campaign of experiments to build a weather forecasting model with Deep Learning.

## Choice of neural network architecture
Our first goal was to train and compare several neural network architectures and assess their differences, all things being equal.
## Setup

### Dataset: TITAN

- **Source**: only **AROME Analyses** (no data from the Arpege coupling model)
- **Resolution**: 2.5km
- **Historical Data**: 2021-2023 (training: 2021-2022, testing: 2023)
- **Time Step**: 1 Hour

**Note**: Precipitation is the only parameter that is not an analysis. It is an AROME forecast made every hour, predicting the cumulative precipitation in mm for the next hour. In the future, we aim to use higher quality expertized radar data.


### Training Methodology

- **Training strategy**: Trained to make 1-hour forecasts, in scaled steps: y_pred = x + f(x) * step_diff_std + step_diff_mean
- **Inference Time**: Less than one minute on CPU for +12h forecasts
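In code, the scaled-step training strategy above amounts to predicting a normalized one-hour increment and de-normalizing it. A minimal sketch, assuming `f` stands for the trained network and the step-difference statistics (`step_diff_mean`, `step_diff_std`) are computed per parameter on the training set:

```python
import numpy as np

def predict_next_hour(x, f, step_diff_mean, step_diff_std):
    """One-hour forecast: the network f predicts the normalized
    state increment, which is rescaled and added to the input x."""
    return x + f(x) * step_diff_std + step_diff_mean

# Toy check with a stand-in "network" that predicts a zero increment:
x = np.ones((4, 4))
f = lambda state: np.zeros_like(state)
y_pred = predict_next_hour(x, f, step_diff_mean=0.5, step_diff_std=2.0)
print(y_pred[0, 0])  # → 1.5
```

Longer lead times such as +12h forecasts can then be produced autoregressively, feeding `y_pred` back in as the next input.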


## Results

Both in metrics and in forecast case studies, the modified and optimized UNetR++ model proved to be the best of the tested models.

The HiLAM model, on the other hand, was costly to train, with lower scores and lower-resolution forecasts.


**Table: training time, comments (config + link to file), smoothed forecasts, final loss on the test set, RAM used, batch size**

**Add citation to the main README**

| Model | Configuration | Training Time | Batch Size | Used RAM | Final loss on test set | Comments |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| UNetR++ | AROME | EURW1S100 (1.3 km) | 10m | U, V |


### Scores

TODO: add a graph with RMSE per time step for a few surface and atmospheric parameters


### Animations

Here we present some forecast animations of these two models, compared to AROME and AROME analyses (our "ground truth" here) on a case study for 5 surface parameters.

On 2023/06/18 at 12h UTC, a warm and unstable southwesterly flow favored the development of strong thunderstorms over parts of France. The north of the country was swept by wind gusts between 90 and 110 km/h. Some supercells formed over the center and then the southwest of the country, at times generating heavy hail.

**INSERT GIFS**


## Analysis & Perspectives

- **Initial experiments** show that it is possible to train a neural network model to provide forecasts at the scale of France.
_posts/2024-10-21-titan.md
date: 2024-10-21
categories: [Weather Forecasting, Machine Learning, Python, Dataset]
author:
- Léa Berthomier
- Frank Guibert
---


# TITAN: Training Inputs & Targets from Arome for Neural networks

TITAN is a dataset built to train AI weather forecasting models over France.


## Data

- **Sources**: analyses and forecasts from the AROME and ARPEGE NWP models
- **Time Step**: 1 hour
- **Depth**: 5 years
- **Format**: NPY
- **Resolution**: 2.5km
- **Historical Data**: 2021-2023 (training: 2021-2022, testing: 2023)
- **4 Variables at 4 Vertical Levels**:
- 850, 700, 500, 250 hPa
- T, U, V, Z
- **4 Forcing Fields**: Model inputs
- Cosine and sine of the time of day
- Cosine and sine of the day of the year
- **Training Samples**: ~16,000 pairs (t0, t+1)
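The four forcing fields above depend only on the timestamp, so they can be computed for any lead time. A minimal sketch of the cyclic encoding (the function name and the 365.25-day year length are our assumptions, not from the dataset):

```python
import math
from datetime import datetime

def forcing_fields(t: datetime):
    """Cosine/sine encodings of the time of day and the day of the
    year, given to the model as extra, always-known inputs."""
    hour_angle = 2 * math.pi * (t.hour + t.minute / 60) / 24
    doy_angle = 2 * math.pi * (t.timetuple().tm_yday - 1) / 365.25
    return [math.cos(hour_angle), math.sin(hour_angle),
            math.cos(doy_angle), math.sin(doy_angle)]

print(forcing_fields(datetime(2021, 1, 1, 0, 0)))  # → [1.0, 0.0, 1.0, 0.0]
```

The paired cosine/sine encoding keeps adjacent times close in feature space across the wrap-around (23h to 0h, 31 December to 1 January).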

**Note**: Precipitation is the only parameter that is not an analysis. It is an AROME forecast made every hour, predicting the cumulative precipitation in mm for the next hour. In the future, we aim to use higher quality expertized radar data.


## Download

For now, 3 days of data are stored on [HuggingFace](https://huggingface.co/datasets/meteofrance/titan).

To download:

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install

git clone https://huggingface.co/datasets/meteofrance/titan
```


## Details on available parameters

Data is grouped into one folder per hour. Each folder contains XX .npy files, one per weather parameter and level.
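A single hourly sample can then be loaded with NumPy. A minimal sketch (the helper name and the example path are ours, not part of the dataset):

```python
from pathlib import Path
import numpy as np

def load_hour(folder):
    """Load one hourly sample: each .npy file in the folder holds
    the grid for one weather parameter and level."""
    return {p.stem: np.load(p) for p in sorted(Path(folder).glob("*.npy"))}

# e.g. fields = load_hour("titan/2023-01-01_00h00")
# fields["aro_t2m_2m"] would then hold the 2 metre temperature grid.
```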


| File name | Name | Unit | Source NWP Model | Levels |
| :---: | :---: | :---: | :---: | :---: |
| aro_t2m_2m | 2 metre temperature | K | Arome | 2m |
| aro_r2_2m | 2 metre relative humidity | % | Arome | 2m |
| aro_tp_0m | Total precipitation | kg m**-2 | Arome | 0m |
| aro_u10_10m | 10 metre U wind component | m s**-1 | Arome | 10m |
| aro_v10_10m | 10 metre V wind component | m s**-1 | Arome | 10m |
| aro_t_XhPa | Temperature | K | Arome | 250, 500, 700, 850 hPa |
| aro_u_XhPa | U component of wind | m s**-1 | Arome | 250, 500, 700, 850 hPa |
| aro_v_XhPa | V component of wind | m s**-1 | Arome | 250, 500, 700, 850 hPa |
| aro_z_XhPa | Geopotential | m**2 s**-2 | Arome | 250, 500, 700, 850 hPa |
| aro_r_XhPa | Relative humidity | % | Arome | 250, 500, 700, 850 hPa |
| arp_t_XhPa | Temperature | K | Arpege | 250, 500, 700, 850 hPa |
| arp_u_XhPa | U component of wind | m s**-1 | Arpege | 250, 500, 700, 850 hPa |
| arp_v_XhPa | V component of wind | m s**-1 | Arpege | 250, 500, 700, 850 hPa |
| arp_z_XhPa | Geopotential | m**2 s**-2 | Arpege | 250, 500, 700, 850 hPa |
| arp_r_XhPa | Relative humidity | % | Arpege | 250, 500, 700, 850 hPa |


## Storage

* Size of the **compressed** 5-year archive: ~ XX GB

* Size of the **uncompressed** 5-year archive: ~ XX GB

## TODO

* add AROME humidity on all levels
* put the whole dataset on HuggingFace in NPY format
* make a plot with all parameters
