Add Validation Charts #210

Merged
merged 16 commits into from
Oct 25, 2024

Conversation

trevorb1
Member

@trevorb1 trevorb1 commented Oct 14, 2024

Description

In this PR I have added some basic validation charts to compare model results against historical years. For any year between 2015 and 2022 (inclusive), power sector generation, emission, and capacity validation charts will be generated. These charts show modelled results against reported results at a country level. Below are the sources I used:

Capacity Validation:

Generation Validation:

Emission Validation:

A couple notes:

  • The ember dataset is the same one that is used for demand projections
  • I needed ISO names for IRENA processing, but couldn't find anywhere in the repo where we have this. I grabbed the data from here, but if I just missed existing data that has this, I will switch!
  • Each dataset is compared separately, as aligning data is a little difficult. For example, EIA will aggregate fossil fuel capacity together, while IRENA will break out gas, oil, coal capacity separately. There are a couple other instances of this, so it was just easier to compare everything separately.
  • On the comparisons, I manually align OG names to the dataset names. Then I filter out any datapoints that are not present in the OG dataset. For example, a few datasets will include pumped hydro storage, which is not in OG - so the validation charts exclude showing pumped hydropower. It may be better to reverse this logic so we can still capture capacity/generation we have yet to include.
  • Ember also gives carbon emission intensity, which would be nice to complement the current total yearly emission values. This requires post-processing a new CarbonIntensity metric though, which is indexed (probably) over REGION, EMISSION, YEAR. This can be done in a separate issue ticket, though.
  • ClimateWatch does not have emission values for quite a few countries in 2022, so this validation normally stops at 2021.
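The align-then-filter step described in the notes above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `OG_TO_DATASET` mapping, the `align_and_filter` helper, the row layout, and the numbers are all hypothetical. Returning the dropped rows as well is one way to keep sight of capacity/generation that OG does not yet include.

```python
# Hypothetical map: OG technology code -> name used by one external dataset.
OG_TO_DATASET = {"HYD": "Hydro", "SPV": "Solar", "WON": "Wind"}


def align_and_filter(reported, name_map):
    """Rename reported technologies to OG codes and split off unmapped rows.

    Returns (kept, dropped): rows with an OG equivalent (renamed), and rows
    whose technology has no OG equivalent (left unchanged).
    """
    dataset_to_og = {name: code for code, name in name_map.items()}
    kept, dropped = [], []
    for row in reported:
        code = dataset_to_og.get(row["technology"])
        if code is None:
            dropped.append(row)  # e.g. pumped hydro storage, not in OG
        else:
            kept.append({**row, "technology": code})
    return kept, dropped


# Illustrative rows only; the values are made up.
reported = [
    {"country": "CAN", "year": 2020, "technology": "Hydro", "value": 81.0},
    {"country": "CAN", "year": 2020, "technology": "Pumped Hydro", "value": 0.2},
]
kept, dropped = align_and_filter(reported, OG_TO_DATASET)
```

Here `kept` feeds the validation chart, while `dropped` could be logged or plotted separately if the logic is ever reversed.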

Issue Ticket Number

na

Documentation

na

@maartenbrinkerink
Collaborator

Hi @trevorb1, I just did a quick first look and my initial comments are below. I'm currently running the PR with the Americas to see if it runs for different countries as well. I'll do a more thorough review either late this week or early next week.

  • Hardcoded years. Option to check latest available years and adjust automatically? Or are all input files manually downloaded?

  • ISO; they do partially exist, in different formats, in several spatial mapping files (e.g. weo_region_mapping, GTD_region_mapping, GEM_region_mapping). Does it make sense to create one overarching spatial mapping file?

  • Partly related to the above, having tech mapping files in every individual script is understandable, but it does increase the number of hidden config options. We also have a bunch of tech mapping files as CSVs (e.g. weo_powerplant_costs, naming_convention_tech) and lately have been making constants.py files that also include mappings (e.g. for the powerplant and transmission scripts). Just wondering if we can streamline some of this? Not sure what the best approach would be, perhaps worth discussing on Monday.

@trevorb1
Member Author

Hardcoded years. Option to check latest available years and adjust automatically? Or are all input files manually downloaded?

Good call! Updated in this commit. Some sources will need to be manually re-downloaded (as I couldn't find a persistent link), but we won't have to update the code to match.

Does it make sense to create one overarching spatial mapping file?

I think this comes back to the discussion in PR #203 where we create some sort of configuration file that contains data like:

Canada:
  iso: CAN
  region: NA # North America, not 'not available' 
  nodes: 
    AR:
      nice_name: atlantic_region
      centre_point: xxx, yyy
      gem_name: sample
      gtd_name: sample
    BC: 
      nice_name: british_columbia
      centre_point: xxx, yyy
      gem_name: sample
      gtd_name: sample
    ...

We also have a bunch of tech mapping files as csv's (e.g. weo_powerplant_costs, naming_convention_tech) and lately have been making the constants.py files that also includes mapping files (e.g. for the powerplant and transmission scripts). Just wondering if we can streamline some of this?

I agree! Similar to above, we could create some sort of configuration (that shouldn't really be changed too often) such as:

technology:
  bio:
    fuel_in: bio
    fuel_out: elec
    nice_name: biomass
    renewable: True
    color: darkgreen
    lifetime: 30
    name_maps: 
      plexos: bio
      gem: biomass
      weo: bio
...
fuel:
  bio:
    nice_name: biomass
    renewable: True
    color: darkgreen
    name_maps: 
      epa: biomass
...

Not saying these are the best structures or anything, but passing around these configs (which can easily be parsed with dataclasses) can replace numerous other files. Moreover, we could use similar structures for user defined capacity to simplify that process.
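As a rough illustration of the dataclass idea, the sketch below parses the technology section of such a config. It assumes the YAML has already been loaded into a plain dict (for example with a YAML library); the `Technology` class and `parse_technologies` helper are hypothetical names, not existing code in the repo.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Technology:
    """One entry of the (proposed) technology config."""
    fuel_in: str
    fuel_out: str
    nice_name: str
    renewable: bool
    color: str
    lifetime: int
    name_maps: dict = field(default_factory=dict)


def parse_technologies(config):
    """Build {code: Technology} from the 'technology' section of the config."""
    return {
        code: Technology(**attrs)
        for code, attrs in config["technology"].items()
    }


# Dict equivalent of the 'bio' entry from the YAML sketch above.
config = {
    "technology": {
        "bio": {
            "fuel_in": "bio",
            "fuel_out": "elec",
            "nice_name": "biomass",
            "renewable": True,
            "color": "darkgreen",
            "lifetime": 30,
            "name_maps": {"plexos": "bio", "gem": "biomass", "weo": "bio"},
        }
    }
}

techs = parse_technologies(config)

# A per-dataset lookup can then be derived instead of hardcoded in each script.
gem_to_code = {
    tech.name_maps["gem"]: code
    for code, tech in techs.items()
    if "gem" in tech.name_maps
}
```

A misspelled or missing key in the config raises a `TypeError` when the dataclass is constructed, which gives some basic validation for free.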

@trevorb1
Member Author

Ember also gives carbon emission intensity, which would be nice to complement the current total yearly emission values. This requires post-processing a new CarbonIntensity metric though, which is indexed (probably) over REGION, EMISSION, YEAR. This can be done in a separate issue ticket, though.

I ended up adding this in this PR. There is now also a validation plot called emission_intensity.

@maartenbrinkerink
Collaborator

Hi @trevorb1. The validation looks good to me (and is functional for other countries). Small, non-essential suggestions are below. Feel free to merge the PR!

  1. Suggest to filter out the year graphs for any combination of validation/year where the external data is not available. E.g. the climatewatch/emissions validation and often the IRENA/generation validation (possibly others).
  2. Small fix to prevent the deprecation warning below.
    [image: deprecation warning]

Really need to get better capacity data for SPV.... Yikes! (I guess that's why we do the validation....)

@trevorb1
Member Author

@maartenbrinkerink thanks, and both good points! I'll address them in the coming days!

One question I just thought of about the calculated AnnualEmissionIntensity results: I convert intensity to gCO2/kWh instead of leaving it in the base units of MT/PJ. My motivation is that nobody (at least that I have seen) ever looks at emissions in MT/PJ, but this may be confusing since the reported results then do not follow base units.

Do you think I should move the conversion calculation (see here) to the validation script, and leave reported AnnualEmissionIntensity in MT/PJ? Or is it fine as is?
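For reference, the unit conversion in question can be sketched as below. This is not the code linked above; it assumes MT means megatonnes (million metric tonnes), and the function name is hypothetical.

```python
GRAMS_PER_MT = 1e12        # assumption: 1 MT = 1e6 tonnes = 1e12 g
KWH_PER_PJ = 1e15 / 3.6e6  # 1 PJ = 1e15 J and 1 kWh = 3.6e6 J


def mt_per_pj_to_g_per_kwh(intensity_mt_per_pj):
    """Convert an emission intensity from MT/PJ to g/kWh."""
    return intensity_mt_per_pj * GRAMS_PER_MT / KWH_PER_PJ


# Under these assumptions the factor is 3600, so 0.1 MT/PJ is about 360 g/kWh.
example = mt_per_pj_to_g_per_kwh(0.1)
```

Because the conversion is a single constant factor, it can live either in post-processing or in the validation script without changing the charts.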

@maartenbrinkerink
Collaborator

I find gCO2/kWh a lot more intuitive, so I suggest leaving it as it is.

@trevorb1
Member Author

Removing the years we don't have data for ended up being a little awkward, as some sources only report data for some countries in certain years (i.e., EIA has 2023 USA data, but not 2023 India data). How they tag missing data also isn't super clear to me. It's mostly updated though, with the exception of EIA capacity and IRENA generation. I'm going to merge, as continuing to work on this right now has diminishing returns!
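One possible shape for the per-country year filter is sketched below. It is only an illustration under stated assumptions: the set of missing-data markers is a guess (how each source actually tags missing values is unclear, as noted above), and the helper names, data layout, and numbers are hypothetical.

```python
import math

# Assumed markers for missing values; the real sources may use others.
MISSING_MARKERS = {"", "--", "NA", "n/a"}


def is_missing(value):
    """Return True if a reported value should be treated as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() in MISSING_MARKERS
    return isinstance(value, float) and math.isnan(value)


def plottable_years(reported, country):
    """Years with at least one usable reported value for the given country.

    `reported` maps (country, year) -> list of raw reported values.
    """
    return sorted(
        year
        for (c, year), values in reported.items()
        if c == country and any(not is_missing(v) for v in values)
    )


# Illustrative data only; the numbers are made up.
reported = {
    ("USA", 2023): [4200.0],
    ("IND", 2023): ["--"],
    ("IND", 2022): [1800.0, float("nan")],
}
usa_years = plottable_years(reported, "USA")
ind_years = plottable_years(reported, "IND")
```

Charts would then be generated only for the years returned for each country, rather than for a fixed global range.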

@trevorb1 trevorb1 merged commit 4f29ad6 into master Oct 25, 2024
3 of 6 checks passed