Develop project vault iceberg #89
base: develop
Conversation
I can run the tests myself with my dotenv permissions, but the data vault requires special handling (as mentioned above). Help please ;-)
Many companies report S1-only targets. We should handle that, and translate to S1S2 according to the methodology. Signed-off-by: Michael Tiemann <[email protected]>
Added mentions for pint and pint-pandas, as well as the latest pandas. Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…ctor Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Added openscm-units Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Provide target projections based on S1 and S2 scopes, not only S1S2. Also fix a base year data exclusion bug (we don't have to abandon the projection if base year == last_year). Also, simplify input data, as we have not yet implemented everything described in #32. Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
Fix more data errors...now somewhat demoable (but strange results when target values are well above existing attainment, as with Duke Energy). Signed-off-by: Michael Tiemann <[email protected]>
Update openpyxl version number. Signed-off-by: Michael Tiemann <[email protected]>
The current benchmark data treats Asia as "Global" but that doesn't mean we cannot properly list Asia as a distinct region for display and aggregation purposes. Accordingly, change POSCO's region to Asia. Signed-off-by: Michael Tiemann <[email protected]>
Convert to base units before calculating magnitude (need to check elsewhere for this error!) and clamp CAGR to a non-positive result. Signed-off-by: Michael Tiemann <[email protected]>
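In pint terms, the fix reads roughly like this (a sketch under assumptions: variable names, example magnitudes, and the surrounding CAGR formula are illustrative, not the actual ITR code):
~~~~
import pint

ureg = pint.UnitRegistry()

def base_magnitude(q):
    # Convert to base units before taking the magnitude, so quantities with
    # different prefixes (e.g. kt vs. t) land on the same scale.
    return q.to_base_units().magnitude

base_year = ureg.Quantity(100.0, "kt")  # hypothetical base-year emissions
target = ureg.Quantity(40000.0, "t")    # hypothetical target emissions
years = 10

# CAGR of the reduction pathway, clamped so a target above current
# attainment cannot produce a positive (growth) rate.
cagr = (base_magnitude(target) / base_magnitude(base_year)) ** (1 / years) - 1
cagr = min(cagr, 0.0)
~~~~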
There was a bug in how EITargetProjector::project_ei_targets projected target data to 2050 when no specific target named 2050 as an explicit target year. These changes fix that bug, as well as enabling the use of the netzero_year field of the input template. Also update the template to use netzero_year instead of netzero_date. Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
…nups) Other cleanups include:
* Let DataWarehouse call _calculate_target_projections so the user doesn't have to worry about it.
* Fix more spellings of emission_ to emissions_.
* When creating the one-row company_sector_region_info DataFrame, don't just initialize with singleton elements; put those elements into lists (so we can pass a Quantity as an ExtensionArray instead of having it be seen as a dict).
* Comment the highly suspect declaration of projected_targets, which are available in the base class of ICompanyAggregates.
Signed-off-by: Michael Tiemann <[email protected]>
Connected to previous checkin: construct the company_sector_region_info DataFrame using a dictionary of lists, not singleton elements, to make Quantity work as an ExtensionArray instead of a dict. Signed-off-by: Michael Tiemann <[email protected]>
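As a minimal illustration of the list-wrapping pattern these two commits describe (column names and units are placeholders, not the actual ITR schema):
~~~~
import pandas as pd
import pint

ureg = pint.UnitRegistry()
q = ureg.Quantity(1000.0, "t")  # placeholder; the real code uses CO2-aware units

# Wrapping each value in a list yields a one-row DataFrame whose cell holds
# the Quantity intact; passing the bare Quantity as a column value invites
# pandas to try to unpack it rather than store it as a single element.
company_sector_region_info = pd.DataFrame(
    {"company_id": ["US00130H1059"], "base_year_production": [q]}
)
~~~~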
Modify the original GUI app to work with the new unitized ITR backend:
* Added unitized JSON files
* Use new initialization procedures for the data Template
* Unitize quantities within the GUI, such as specific temperature score values.
Not fully working: a graph of production output wrongly tries to mix Steel production numbers (Fe_ton) with Electricity production numbers (TWh). It's good that the unit code caught it! Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
There was long-standing confusion about the meaning of GHG_SCOPE12 (which, when looked at through one functional path, seemed to depend first and only on production values, and when looked at other ways, seemed to represent emissions values). It was finally determined that this was, indeed, an emissions-based quantity, and that the production value pathway fed a ratio calculation that resolved to a dimensionless quantity (so it could be calculated just as well from emissions). In any case, these changes principally fix these and some other problems in the way various column names and variable names work and work together. Signed-off-by: Michael Tiemann <[email protected]>
Updated to work with unit-aware code. Signed-off-by: Michael Tiemann <[email protected]>
…hancements Refactor _calculate_target_projections into BaseCompanyDataProvider and reorganize class definition order to accommodate. Also fix some latent unit errors in excel.py and test_excel_provider.py resulting from GHG_SCOPE12 fixes. Update quick example notebooks. Signed-off-by: Michael Tiemann <[email protected]>
One more row of data! Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: David Kroon <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…, Chemicals, Textiles). Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…(using m**2 right now). Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…OECM benchmarks. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
… guide S3 handling. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…enchmark-ingest. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
See #157 for long-term fix. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…. Now ready for Real Estate! Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
… add. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
…ata. Signed-off-by: MichaelTiemann <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Company ids are critical when converting from Excel to ICompanyData. If the input Excel is missing company ids, the subsequent conversion raises a hard-to-debug exception:
~~~~
ValueError: Shape of passed values is (265, 6), indices imply (260, 6)
~~~~
With this commit, we stop the conversion early and raise a more understandable exception:
~~~~
ValueError: Missing company ids
~~~~
Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
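A minimal sketch of the fail-fast check this commit describes (the DataFrame and the column name `company_id` are assumptions, not the exact ITR identifiers):
~~~~
import pandas as pd

def check_company_ids(df: pd.DataFrame) -> None:
    # Fail fast with a clear message instead of letting a later reindex die
    # with "Shape of passed values is (265, 6), indices imply (260, 6)".
    if df["company_id"].isna().any():  # "company_id" is an assumed column name
        raise ValueError("Missing company ids")
~~~~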
If, during the call to `_company_df_to_model`, the input Excel is missing a company name, the error log message says:
~~~~
ERROR - (One of) the input(s) of company <NA> is invalid
~~~~
This message is not very helpful when looking for the invalid row in the Excel. This commit prints the company id instead, to make a problematic row easy to find when the company name is missing:
~~~~
ERROR - (One of) the input(s) of company with ID US00130H1059 is invalid
~~~~
Company ids are used in earlier steps of Excel validation. Also, those earlier steps validate that company ids exist for all rows. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
For this particular benchmark, only S3 matters. In the following commits, we use the absence of S1S2 as an indicator of S3-scope calculations. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
The scope to calculate is usually S1S2, unless the benchmark doesn't specify it; the second candidate is S3. If the scope to calculate is S3, `DataWarehouse` shouldn't merge S3 into S1S2. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
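The selection rule reads roughly as follows (an illustrative sketch; the function name and plain-string scope values are assumptions):
~~~~
def scope_to_calc(benchmark_scopes: set) -> str:
    # Prefer S1S2; fall back to S3 when the benchmark specifies only S3.
    return "S1S2" if "S1S2" in benchmark_scopes else "S3"
~~~~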
Modify production benchmark input to `AnyScope` Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Add support for `scope` argument where necessary. Give benchmark scope_to_calc as argument. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Use the benchmark scope_to_calc for:
* creation of the `TemperatureScore` object
* selection of scope content in the GUI table
Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
This allows us to calculate temperature scores for companies that provided S3 data. Otherwise, we get the exception "The value for S3 is missing for the following companies". Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
The change only touches the GUI, where the term 'scenario' was used for the Emissions Intensity benchmark, which could confuse users. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
A GitHub token was required for this notebook but never used. At the same time, not all ITR users have a GitHub token, which made this notebook not executable for them. When I removed it, its absence didn't affect execution of the notebook, which means it can be removed safely, making the notebook more accessible for ITR users. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
This workaround resolves the exception:
~~~~
KeyError: "[('Europe', 'Construction Buildings', <EScope.S3: 'S3'>)] not in index"
~~~~
TODO: Remove this workaround when the associated EI benchmarks become available. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
The notebook loads EI benchmarks for S1S2 and S3 separately, and calculates temperature scores for the selected scopes separately. The output is two separate tables with independently calculated scores. Signed-off-by: Kirill Marinushkin <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
This also includes an update to the organization name for pip-audit. "trailofbits" redirects to pypa now, and while this is functional for now, having the current name decreases the likelihood of problems down the line. Signed-off-by: Eric Ball <[email protected]> Signed-off-by: Michael Tiemann <[email protected]>
Signed-off-by: Michael Tiemann <[email protected]>
The merge above was done hastily to fix long-standing DCO problems. It almost certainly won't run, but it will create a workable baseline from which the code can be resurrected.
Some of the current errors I was able to resolve by installing some missing packages (added those to requirements.txt). Regarding the credentials, I think a common practice is to include them as GitHub secrets (under the repo settings, in Secrets and variables -> Actions). In the CI/CD testing we could then add a step to the workflow .yaml file such as:
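(A sketch only; the step name and the .env path are assumptions, and only the ENV_FILE secret name is part of this suggestion.)
~~~~
# Hypothetical step: materialize the ENV_FILE secret as a local .env file
# before the tests run. The step name and file path are illustrative.
- name: Create .env from repository secret
  env:
    ENV_FILE: ${{ secrets.ENV_FILE }}
  run: printf '%s' "$ENV_FILE" > .env
~~~~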
The ENV_FILE secret would hold the environment variables in the same format as they are locally. What are your thoughts?
Signed-off-by: David Kroon <[email protected]>
This PR augments the existing ITR prototype by adding a connection to the Data Commons. It preserves/incorporates the Pint Units enhancements recently added to the develop branch. To test, this PR requires access to credentials for certain GitHub user identities. We really need to sort out a better way to integrate credential management with CI/CD testing. Happy to answer any questions...