Update ESACCI Cloud CMORizer (daily and monthly data) #3756

Draft: wants to merge 28 commits into base: main
e2ebace
start new branch for esacci-cloud
LisaBock Jul 16, 2024
a0db7e3
change to desc
LisaBock Jul 17, 2024
53e1753
download am and pm
LisaBock Jul 22, 2024
67db280
merge am and pm
LisaBock Jul 22, 2024
1292c70
snapshot 23-7
LisaBock Jul 23, 2024
e956261
fix filename in config file
LisaBock Jul 24, 2024
560d04e
update recipe
LisaBock Jul 24, 2024
cc17140
update esacci_cloud for monthly data
diegokam Sep 24, 2024
8303b21
added monthly data download capability and restructured daily code do…
diegokam Dec 12, 2024
e32ea85
clean code for codacy
diegokam Dec 12, 2024
66d7f8e
clean code for codacy
diegokam Dec 12, 2024
5ae506c
Merge branch 'main' into update_esacci_cloud_monthly
diegokam Dec 12, 2024
5e9995c
fix formatter, add monthly attribute for variables in config file
diegokam Dec 13, 2024
1802b78
fix formatter codacy
diegokam Dec 13, 2024
a6facee
fix formatter, edit config file
diegokam Dec 16, 2024
7c8eed7
fix downloader for monthly and daily data
LisaBock Dec 17, 2024
614c1ef
fix cmorizer for monthly data
LisaBock Dec 18, 2024
a6628d5
set default years
LisaBock Dec 18, 2024
acde6de
fix downloyder
LisaBock Dec 19, 2024
3e0d6b4
create am-pm product
LisaBock Dec 19, 2024
fdaaf83
rm file
LisaBock Dec 19, 2024
b31d779
fill missing dates
LisaBock Dec 19, 2024
de9dc04
Merge remote-tracking branch 'public/main' into update_esacci_cloud_m…
LisaBock Dec 19, 2024
d45d02d
start clean formatter
LisaBock Dec 20, 2024
7c94f4a
automated codacy fixes
LisaBock Dec 23, 2024
70901cc
adding more variables for monthly data
diegokam Jan 10, 2025
917a734
update recipe_check_obs.yml
LisaBock Jan 15, 2025
5564497
Merge remote-tracking branch 'public/main' into update_esacci_cloud_m…
LisaBock Jan 15, 2025
122 changes: 122 additions & 0 deletions esmvaltool/cmorizers/data/cmor_config/ESACCI-CLOUD.yml
@@ -0,0 +1,122 @@
# CMORIZE ESA CCI CLOUD daily/monthly data
---

# Common global attributes for Cmorizer output
attributes:
dataset_id: ESACCI-CLOUD
version: 'v3.0-AVHRR'
tier: 2
project_id: OBS6
source: 'ESA CCI'
modeling_realm: sat
reference: 'esacci_cloud'
comment: ''
start_year_monthly: 1982
end_year_monthly: 2016
start_year_daily: 2003
end_year_daily: 2007

# Variables to cmorize
variables:
# daily data
#clt_day:
# short_name: clt
# mip: day
# raw: [cmask_desc, cmask_asc]
# raw_units: '1'
# file: '-ESACCI-L3U_CLOUD-CLD_MASKTYPE-AVHRR_*-fv3.0.nc'
# clwvi:
# mip: CFday
# raw: [cwp_desc, cwp_asc]
# raw_units: g/m2
# file: '-ESACCI-L3U_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
# ctp:
# mip: day
# raw: [ctp_desc, ctp_asc]
# raw_units: hPa
# file: '-ESACCI-L3U_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
# reff:
# mip: day
# raw: [cer_desc, cer_asc]
# raw_units: um
# file: '-ESACCI-L3U_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'

# monthly data
clt_mon:
short_name: clt
mip: Amon
raw: cfc
raw_units: '1'
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
cltStderr_mon:
short_name: cltStderr
mip: Amon
raw: cfc_unc
raw_units: '1'
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
lwp_mon:
short_name: lwp
mip: Amon
raw: lwp_allsky
raw_units: g/m2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
clivi_mon:
short_name: clivi
mip: Amon
raw: iwp_allsky
raw_units: g/m2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
clwvi_mon:
short_name: clwvi
mip: Amon
raw: iwp_allsky
raw_units: g/m2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rlut_mon:
short_name: rlut
mip: Amon
raw: toa_lwup
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rlutcs_mon:
short_name: rlutcs
mip: Amon
raw: toa_lwup_clr
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rsut_mon:
short_name: rsut
mip: Amon
raw: toa_swup
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rsutcs_mon:
short_name: rsutcs
mip: Amon
raw: toa_swup_clr
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rsdt_mon:
short_name: rsdt
mip: Amon
raw: toa_swdn
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rlus_mon:
short_name: rlus
mip: Amon
raw: boa_lwup
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rsus_mon:
short_name: rsus
mip: Amon
raw: boa_swup
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
rsuscs_mon:
short_name: rsuscs
mip: Amon
raw: boa_swup_clr
raw_units: W m-2
file: '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'
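Each `file` entry in the config above is a file name suffix in which `*` stands in for the platform name. As a minimal sketch, assuming file names start with a `YYYYMM` prefix (as in the `filler_data` entries of the old downloader), such a pattern could be matched as shown below. The helper `matching_files` and the sample file list are hypothetical, and the CMORizer in this PR may resolve files differently.

```python
from fnmatch import fnmatch

# Pattern copied from the monthly entries in ESACCI-CLOUD.yml.
FILE_PATTERN = '-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_*-fv3.0.nc'


def matching_files(filenames, year, month, pattern=FILE_PATTERN):
    """Return the file names for one month that match a config pattern.

    Hypothetical helper: assumes file names start with a YYYYMM prefix.
    """
    full_pattern = f'{year}{month:02d}{pattern}'
    return sorted(name for name in filenames if fnmatch(name, full_pattern))


# Illustrative file names, not a real directory listing.
available = [
    '199409-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_NOAA-12-fv3.0.nc',
    '199410-ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_NOAA-12-fv3.0.nc',
    '199409-ESACCI-L3U_CLOUD-CLD_MASKTYPE-AVHRR_NOAA-12-fv3.0.nc',
]
print(matching_files(available, 1994, 9))
```

Because the wildcard covers only the platform part, the L3U daily file for the same month is not picked up by the L3C monthly pattern.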
128 changes: 101 additions & 27 deletions esmvaltool/cmorizers/data/downloaders/datasets/esacci_cloud.py
@@ -1,10 +1,12 @@
"""Script to download ESACCI-CLOUD."""

"""Script to download daily and monthly ESACCI-CLOUD data."""
import logging
from datetime import datetime

from dateutil import relativedelta

from esmvaltool.cmorizers.data.downloaders.ftp import CCIDownloader
from esmvaltool.cmorizers.data.downloaders.wget import WGetDownloader

logger = logging.getLogger(__name__)


def download_dataset(config, dataset, dataset_info, start_date, end_date,
@@ -26,40 +28,112 @@ def download_dataset(config, dataset, dataset_info, start_date, end_date,
overwrite : bool
Overwrite already downloaded files
"""
defined_time = True
if start_date is None:
start_date = datetime(1982, 1, 1)
defined_time = False
if end_date is None:
end_date = datetime(2016, 1, 1)
end_date = datetime(2016, 12, 31)
loop_date = start_date

downloader = CCIDownloader(
downloader = WGetDownloader(
config=config,
dataset=dataset,
dataset_info=dataset_info,
overwrite=overwrite,
)
downloader.connect()
end_of_file = 'ESACCI-L3C_CLOUD-CLD_PRODUCTS-AVHRR_NOAA-12-fv3.0.nc'
filler_data = {
1994: [
f'AVHRR_NOAA_12/1994/199409-{end_of_file}',
f'AVHRR_NOAA_12/1994/199410-{end_of_file}',
f'AVHRR_NOAA_12/1994/199411-{end_of_file}',
f'AVHRR_NOAA_12/1994/199412-{end_of_file}',
],
1995: [
f'AVHRR_NOAA_12/1995/199501-{end_of_file}',
],
}

# Base paths for L3U (daily data) and L3C (monthly data)
base_path_l3u = ('https://public.satproj.klima.dwd.de/data/ESA_Cloud_CCI/'
'CLD_PRODUCTS/v3.0/L3U/')
base_path_l3c = ('https://public.satproj.klima.dwd.de/data/ESA_Cloud_CCI/'
'CLD_PRODUCTS/v3.0/L3C/')

wget_options = [
'-r',
'-e robots=off', # Ignore robots.txt
'--no-parent', # Don't ascend to the parent directory
'--reject="index.html"', # Skip index.html directory listings
]

while loop_date <= end_date:
year = loop_date.year
downloader.set_cwd('version3/L3C/AVHRR-PM/v3.0')
for folder in downloader.list_folders():
for year_folder in downloader.list_folders(folder):
if int(year_folder) == year:
downloader.download_year(f'{folder}/{year_folder}')
downloader.set_cwd('version3/L3C/AVHRR-AM/v3.0')
for extra_file in filler_data.get(year, []):
downloader.download_file(extra_file)
loop_date += relativedelta.relativedelta(years=1)
month = loop_date.month
date = f'{year}{month:02}'

if int(date) in range(198201, 198502):
sat_am = ''
sat_pm = 'AVHRR-PM/AVHRR_NOAA-7/'
elif int(date) in range(198502, 198811):
sat_am = ''
sat_pm = 'AVHRR-PM/AVHRR_NOAA-9/'
elif int(date) in range(198811, 199109):
sat_am = ''
sat_pm = 'AVHRR-PM/AVHRR_NOAA-11/'
elif int(date) in range(199109, 199409):
sat_am = 'AVHRR-AM/AVHRR_NOAA-12/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-11/'
elif int(date) in range(199409, 199502):
sat_am = 'AVHRR-AM/AVHRR_NOAA-12/'
sat_pm = ''
elif int(date) in range(199502, 199901):
sat_am = 'AVHRR-AM/AVHRR_NOAA-12/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-14/'
elif int(date) in range(199901, 200104):
sat_am = 'AVHRR-AM/AVHRR_NOAA-15/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-14/'
elif int(date) in range(200104, 200211):
sat_am = 'AVHRR-AM/AVHRR_NOAA-15/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-16/'
elif int(date) in range(200211, 200509):
sat_am = 'AVHRR-AM/AVHRR_NOAA-17/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-16/'
elif int(date) in range(200509, 200707):
sat_am = 'AVHRR-AM/AVHRR_NOAA-17/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-18/'
elif int(date) in range(200707, 200906):
sat_am = 'AVHRR-AM/AVHRR_METOPA/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-18/'
elif int(date) in range(200906, 201701):
sat_am = 'AVHRR-AM/AVHRR_METOPA/'
sat_pm = 'AVHRR-PM/AVHRR_NOAA-19/'
else:
# No platform covers this month: clear both so no download is attempted
sat_am = ''
sat_pm = ''
logger.error("Data for this date %s is not available", date)

# Download monthly data from L3C
for sat in (sat_am, sat_pm):
if sat != '':
# monthly data
logger.info("Downloading monthly data (L3C) for sat = %s", sat)
folder_l3c = base_path_l3c + sat + f'{year}/'
wget_options_l3c = wget_options.copy()
wget_options_l3c.append(f'--accept={date}*.nc')
logger.info("Download folder for monthly data (L3C): %s",
folder_l3c)
try:
downloader.download_file(folder_l3c, wget_options_l3c)
except Exception as e:
logger.error("Failed to download monthly data from %s: %s",
folder_l3c, str(e))

# daily data
if defined_time or year in range(2003, 2008):
logger.info("Downloading daily data (L3U) for sat = %s",
sat)
folder_l3u = base_path_l3u + sat + f'{year}/{month:02}'
wget_options_l3u = wget_options.copy()
wget_options_l3u.append(
f'--accept={date}*CLD_MASKTYPE*.nc,'
f'{date}*CLD_PRODUCTS*.nc')
logger.info("Download folder for daily data (L3U): %s",
folder_l3u)
try:
downloader.download_file(folder_l3u, wget_options_l3u)
except Exception as e:
logger.error(
"Failed to download daily data from %s: %s",
folder_l3u, str(e))

# Increment the loop_date by one month
loop_date += relativedelta.relativedelta(months=1)
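The `if`/`elif` chain in this diff maps each month to an AM and a PM platform directory. As a sketch of an alternative, the same periods can be kept in a table and looked up with a single loop. `SATELLITE_PERIODS` and `select_satellites` are hypothetical names that do not appear in the PR; the boundaries are copied from the `range(...)` calls, with the end value exclusive. For months outside the covered period this version returns two empty strings.

```python
# Each row: (first YYYYMM, end YYYYMM (exclusive), AM directory, PM directory).
# Boundaries and directory names are copied from the if/elif chain in the diff.
AM = 'AVHRR-AM/AVHRR_'
PM = 'AVHRR-PM/AVHRR_'
SATELLITE_PERIODS = [
    (198201, 198502, '', PM + 'NOAA-7/'),
    (198502, 198811, '', PM + 'NOAA-9/'),
    (198811, 199109, '', PM + 'NOAA-11/'),
    (199109, 199409, AM + 'NOAA-12/', PM + 'NOAA-11/'),
    (199409, 199502, AM + 'NOAA-12/', ''),
    (199502, 199901, AM + 'NOAA-12/', PM + 'NOAA-14/'),
    (199901, 200104, AM + 'NOAA-15/', PM + 'NOAA-14/'),
    (200104, 200211, AM + 'NOAA-15/', PM + 'NOAA-16/'),
    (200211, 200509, AM + 'NOAA-17/', PM + 'NOAA-16/'),
    (200509, 200707, AM + 'NOAA-17/', PM + 'NOAA-18/'),
    (200707, 200906, AM + 'METOPA/', PM + 'NOAA-18/'),
    (200906, 201701, AM + 'METOPA/', PM + 'NOAA-19/'),
]


def select_satellites(year, month):
    """Return (sat_am, sat_pm) for a month, or ('', '') if none applies."""
    key = year * 100 + month  # e.g. 1994, 9 -> 199409
    for first, end, sat_am, sat_pm in SATELLITE_PERIODS:
        if first <= key < end:
            return sat_am, sat_pm
    return '', ''


print(select_satellites(1994, 9))
```

Keeping the periods as data makes it easier to check the boundaries against the product documentation, and the explicit fallback avoids reusing the platforms of a previous loop iteration.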