review units in AI data #197

Closed
cjyetman opened this issue Mar 25, 2024 · 3 comments

cjyetman commented Mar 25, 2024

#> Error in base::tryCatch(base::withCallingHandlers({: 4 assertions failed:
#>  * Variable 'data$technology': must contain only valid technology
#>  * names, but has additional element "ICE Hydrogen_HDV".
#>  * Variable 'data$ald_production_unit': must contain only valid
#>  * production units, but has additional elements "# vehicles", "dwt
#>  * km", "t cement", "t coal", and "t steel".
#>  * Variable 'data$ald_emissions_factor_unit': must contain only valid
#>  * emissions factor units, but has additional elements "tCO2/dwt km",
#>  * "tCO2/km", "tCO2/pkm", "tCO2/tkm", "tCO2e/GJ", "tCO2e/MWh", "tCO2e/t
#>  * cement", "tCO2e/t coal", and "tCO2e/t steel".
#>  * Variable 'data$technology': must contain only valid technology names
#>  * for HDV, but has additional elements %s.

This part is pretty important. Could be coming from two places, either:

  • AI decided to update the strings it uses for emission factor units and the like (e.g. tCO2/ km -> tCO2/km)
  • AI is providing certain new data points in the output that are not even expected (e.g. tCO2/dwt km)
  • (or both?)
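
One way to tell the two cases apart is to compare the unit strings with whitespace removed. The following is a minimal sketch in base R: `expected_units` is a hypothetical stand-in, not the real list held in pacta.data.validation, and `observed_units` is a subset of the strings reported in the error above.

```r
# Hypothetical list of units the validator currently expects
# (an assumption for illustration, not the real list).
expected_units <- c("tCO2/ km", "tCO2/ pkm", "tCO2/ tkm")

# A few of the units reported as unexpected in the error message.
observed_units <- c("tCO2/km", "tCO2/pkm", "tCO2/tkm", "tCO2/dwt km")

# Remove all whitespace so that pure spacing changes compare as equal.
strip_ws <- function(x) gsub("[[:space:]]+", "", x)

is_variant <- strip_ws(observed_units) %in% strip_ws(expected_units)

# Same unit, different spelling: only the string needs updating.
formatting_variants <- observed_units[is_variant]

# No counterpart in the expected list: a genuinely new data point.
new_units <- observed_units[!is_variant]
```

Under these assumed inputs, the first three units would land in `formatting_variants` and "tCO2/dwt km" in `new_units`.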

I guess we have a decision to make regarding:

  • Updating pacta.scenario.preparation and pacta.data.validation to expect what is here
  • Updating masterdata_debt to match units expected by pacta.scenario.preparation and pacta.data.validation

Of the two, I guess I would probably prefer the former, which will require a PR to:
https://github.com/RMI-PACTA/pacta.scenario.preparation
But happy to do either, thoughts @cjyetman @AlexAxthelm ?

Hopefully, we just need to adjust the strings themselves, and don't need to change the actual data/assumptions at all.
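
If it does come down to strings only, the change could be as small as a lookup table applied to the unit column. A rough sketch, assuming a hypothetical mapping (`unit_map`) and a toy data frame; the column names follow the error message, and the values are made up:

```r
# Hypothetical mapping from the new unit strings to the ones currently
# expected (an assumption; the real mapping would need to be agreed on).
unit_map <- c("tCO2/km" = "tCO2/ km")

# Toy data with made-up values, for illustration only.
data <- data.frame(
  ald_emissions_factor = c(0.1, 0.5),
  ald_emissions_factor_unit = c("tCO2/km", "tCO2e/MWh"),
  stringsAsFactors = FALSE
)

# Replace only the strings found in the map; leave everything else as is.
recode_units <- function(x, map) {
  hit <- x %in% names(map)
  x[hit] <- unname(map[x[hit]])
  x
}

data$ald_emissions_factor_unit <-
  recode_units(data$ald_emissions_factor_unit, unit_map)
```

Only the unit labels change here; the numeric `ald_emissions_factor` column is not touched, which is the point of a string-only fix.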

Originally posted by @jdhoffa in #185 (comment)

root_dir <- "~/data/workflow-data-preparation-outputs/2023Q4_20240303T082642Z"

pacta.data.validation::validate_masterdata_debt_datastore(
  readRDS(file.path(root_dir, "masterdata_debt_datastore.rds"))
)
#> Error in base::tryCatch(base::withCallingHandlers({: 4 assertions failed:
#>  * Variable 'data$technology': must contain only valid technology
#>  * names, but has additional element "ICE Hydrogen_HDV".
#>  * Variable 'data$ald_production_unit': must contain only valid
#>  * production units, but has additional elements "# vehicles", "dwt
#>  * km", "t cement", "t coal", and "t steel".
#>  * Variable 'data$ald_emissions_factor_unit': must contain only valid
#>  * emissions factor units, but has additional elements "tCO2/dwt km",
#>  * "tCO2/km", "tCO2/pkm", "tCO2/tkm", "tCO2e/GJ", "tCO2e/MWh", "tCO2e/t
#>  * cement", "tCO2e/t coal", and "tCO2e/t steel".
#>  * Variable 'data$technology': must contain only valid technology names
#>  * for HDV, but has additional elements %s.

pacta.data.validation::validate_masterdata_ownership_datastore(
  readRDS(file.path(root_dir, "masterdata_ownership_datastore.rds"))
)
#> Error in base::tryCatch(base::withCallingHandlers({: 4 assertions failed:
#>  * Variable 'data$technology': must contain only valid technology
#>  * names, but has additional element "ICE Hydrogen_HDV".
#>  * Variable 'data$ald_production_unit': must contain only valid
#>  * production units, but has additional elements "# vehicles", "dwt
#>  * km", "t cement", "t coal", and "t steel".
#>  * Variable 'data$ald_emissions_factor_unit': must contain only valid
#>  * emissions factor units, but has additional elements "tCO2/dwt km",
#>  * "tCO2/km", "tCO2/pkm", "tCO2/tkm", "tCO2e/GJ", "tCO2e/MWh", "tCO2e/t
#>  * cement", "tCO2e/t coal", and "tCO2e/t steel".
#>  * Variable 'data$technology': must contain only valid technology names
#>  * for HDV, but has additional elements %s.

AB#10380

jdhoffa commented Apr 12, 2024

Before closing this, we should wait until:

  • we do a fresh re-run of the scenarios using workflow.scenario.preparation
  • and a fresh run of workflow.data.preparation

to ensure that the masterdata_* output now has acceptable units.

But seems like we're probably close!

cjyetman added a commit that referenced this issue Apr 18, 2024
closes #18

Most validation errors originally found (below) have been resolved.
Validation of `financial_data` and `abcd_flags_equity` has been removed
for now and will be added in the future.

investigation issues:
- #196
- #197
- #198
- #198

relevant fixes in pacta.data.validation:
- RMI-PACTA/pacta.data.validation#65
- RMI-PACTA/pacta.data.validation#66
- RMI-PACTA/pacta.data.validation#67
- RMI-PACTA/pacta.data.validation#68

relevant fix in pacta.data.preparation:
- RMI-PACTA/pacta.data.preparation#18

validation of `financial_data` and `abcd_flags_equity` has been removed
from this PR, and future intended implementation is tracked here
- #222
- dependent on RMI-PACTA/pacta.data.validation#69