Move output directory creation to after pre-flight activities #182

Open
AlexAxthelm opened this issue Mar 1, 2024 · 2 comments

Comments

@AlexAxthelm
Collaborator

AlexAxthelm commented Mar 1, 2024

Currently, the output directory is created near the beginning of the process. Importantly, this happens before the pre-flight checks and data scraping, so if any of those fail, the process creates an empty directory and then errors out, leaving it empty.

Suggest moving output directory creation to later in the process, just prior to when we start writing files.

if (dir.exists(config[["data_prep_outputs_path"]])) {
  logger::log_warn("POTENTIAL DATA LOSS: Output directory already exists, and files may be overwritten ({config[[\"data_prep_outputs_path\"]]}).")
  warning("Output directory exists. Files may be overwritten.")
} else {
  logger::log_trace("Creating output directory: \"{config[[\"data_prep_outputs_path\"]]}\"")
  dir.create(config[["data_prep_outputs_path"]], recursive = TRUE)
}

logger::log_info("Fetching pre-flight data.")
if (config[["update_currencies"]]) {
  logger::log_info("Fetching currency data.")
  input_filepaths <- c(
    input_filepaths,
    currencies_preflight_data_path = currencies_preflight_data_path
  )
  currencies <- pacta.data.scraping::get_currency_exchange_rates(
    quarter = config[["imf_quarter_timestamp"]]
  )
  saveRDS(currencies, currencies_preflight_data_path)
} else {
  logger::log_info("Using pre-existing currency data.")
  # This requires the preflight path to be defined in the config
  currencies <- readRDS(currencies_preflight_data_path)
}
logger::log_info("Scraping index regions.")
input_filepaths <- c(
  input_filepaths,
  index_regions_preflight_data_path = index_regions_preflight_data_path
)
index_regions <- pacta.data.scraping::get_index_regions()
saveRDS(index_regions, index_regions_preflight_data_path)
logger::log_info("Fetching pre-flight data done.")
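
The reordering suggested above could look roughly like the sketch below. It is only an illustration, not the project's actual code: `ensure_output_dir()` and `run_data_prep()` are hypothetical names, the `preflight` argument stands in for the currency and index-region steps, and `message()` is used in place of `logger` so the snippet runs on base R alone.

```r
# Hypothetical helper: create the output directory only when it is needed.
ensure_output_dir <- function(path) {
  if (dir.exists(path)) {
    warning("Output directory exists. Files may be overwritten.")
  } else {
    message("Creating output directory: ", path)
    dir.create(path, recursive = TRUE)
  }
  invisible(path)
}

# Hypothetical driver: `preflight` stands in for the pre-flight checks and scraping.
run_data_prep <- function(output_dir, preflight) {
  # Pre-flight work runs first; if it errors, no directory has been created yet.
  prepared <- preflight()

  # The directory is created just before the first file is written.
  ensure_output_dir(output_dir)
  saveRDS(prepared, file.path(output_dir, "prepared.rds"))
  invisible(output_dir)
}
```

With this ordering, a failing pre-flight step leaves the filesystem untouched, and the "directory already exists" warning is only raised once there is actually something to write.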

AB#10858

@cjyetman
Member

cjyetman commented Mar 1, 2024

I could add an on.exit() hook to remove the directory if it doesn't contain any files (or on any condition), which should run even if there's an error.
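
A minimal sketch of that idea, assuming the cleanup should only remove a directory that this run created and that is still empty. `with_output_dir()` and its `work` argument are hypothetical names used for illustration, not part of the existing code.

```r
# Hypothetical wrapper: create the directory up front, but register a cleanup
# that removes it again if nothing was written into it.
with_output_dir <- function(path, work) {
  created_here <- !dir.exists(path)
  if (created_here) {
    dir.create(path, recursive = TRUE)
  }

  on.exit(
    {
      # Runs on normal exit and on error alike.
      is_empty <- length(list.files(path, all.files = TRUE, no.. = TRUE)) == 0L
      if (created_here && dir.exists(path) && is_empty) {
        unlink(path, recursive = TRUE)
      }
    },
    add = TRUE
  )

  # `work` stands in for the pre-flight steps and file writing.
  work(path)
}
```

Checking `created_here` keeps the hook from deleting a pre-existing directory that merely happens to be empty. Note that `on.exit()` is scoped to the function it is called in, so a hook like this needs to sit in a function that wraps the whole run.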

@AlexAxthelm
Collaborator Author

hmm. that might be a simple solution.

@cjyetman cjyetman added the next label Apr 17, 2024
AlexAxthelm added a commit that referenced this issue Apr 18, 2024
Move directory creation to just prior to writing the first file to
output directory, so that it is not created unnecessarily.

Closes: #182
@jdhoffa jdhoffa added the ADO label Apr 30, 2024