Create credible temp score using RMI data #16

MichaelTiemannOSC · 2021-12-26T23:40:03Z

If I have read the code, the diagram, and the methodology documentation correctly, there was considerable ambiguity in how emissions vs. production data were being handled, and a variety of other problems hiding therein. A sign that something was wrong was temperature scores that ranged from a fraction of a degree to several hundred degrees(!).

This PR pervasively changes the EScope (emissions scope) type label to PScope (projection scope). This can be further renamed if there's another wording that's preferred (or if there's actually a second branch of logic that we should fully pull apart from the main branch).

Regardless of the EScope/PScope naming, there were several other changes needed to make production projections their own thing (and not confused with emissions projections).

The to_numpy() casts in DataWarehouse were covering up an egregious error in keeping production projections and emissions/intensity projections aligned. The rows are no longer scrambled.
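
To illustrate the kind of misalignment described above, here is a minimal sketch using made-up data (the company labels and values are hypothetical, not taken from the RMI files). Multiplying two pandas Series aligns rows by index label, whereas multiplying their `to_numpy()` arrays pairs rows purely by position, so any difference in row order silently mixes up companies.

```python
import pandas as pd

# Hypothetical toy data: intensity and production for the same companies,
# but stored in a different row order.
intensity = pd.Series({"A": 2.0, "B": 3.0, "C": 5.0})
production = pd.Series({"C": 10.0, "A": 100.0, "B": 1000.0})

# Index-aligned multiplication: pandas matches rows by label.
aligned = intensity * production

# Positional multiplication: to_numpy() discards the labels, so row 0 is
# multiplied with row 0 regardless of which company each row belongs to.
positional = pd.Series(
    intensity.to_numpy() * production.to_numpy(),
    index=intensity.index,
)

print(aligned)
print(positional)
```

In this toy case company A's emissions come out differently depending on the route taken, which is the sort of error that stays hidden until the resulting scores look implausible.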

Included are files based on original portfolio data but also based on real RMI data (using 2019 as a base year). As you can see from the Notebook file, there are some US-based utilities that are not Paris-aligned (or the data we have doesn't show them in a very good light), but most of the temperature scores are in the 2-3 degree range, which is credible.

Comments welcome!

Also disable Steel sector, for which RMI provides no data.

The previous S1+S2 did not correct MMT vs MT as the unit of measure for NOX emissions.

It appeared that there was quite some ambiguity in how emissions vs. production data was being handled. Pervasively changed EScope (emissions scope) to PScope (projection scope). This can be further renamed if there's a better wording for it. Regardless of the EScope/PScope naming, there were several other changes needed to make production projections their own thing (and not confused with emissions projections). The to_numpy() casts in DataWarehouse were covering up an egregious error in keeping production projections and emissions/intensity projections aligned. The rows are no longer scrambled. Included are files based on original portfolio data but also based on real RMI data (using 2019 as a base year). Comments welcome!

MichaelTiemannOSC · 2021-12-27T08:27:44Z

As @BertKramer explained to me, it may be that I was overzealous in terms of preferring company production projections over sectoral benchmark projections. If that's the case, then this pull request can be greatly simplified to just the part that fixes the calculations in get_preprocessed_company_data in the file data_warehouse.py.

If, however, there always was a plan to use company production projections, as an option or as a preference, this pull request lays groundwork for that implementation.

ITR data pipeline now properly uses ISO3166 to put countries in the correct regions (with the help of ESSD's UN region definitions).
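
As a rough sketch of what a country-to-region lookup might look like (this is not the pipeline's actual code): the mapping table, the region names, and the `region_for_country` helper below are all hypothetical, and the real pipeline relies on ESSD's UN region definitions rather than a hand-written dictionary.

```python
# Hypothetical lookup from ISO 3166-1 alpha-2 country codes to benchmark
# regions; the region names and the fallback value are illustrative only.
COUNTRY_TO_REGION = {
    "US": "North America",
    "CA": "North America",
    "GB": "Europe",
    "FR": "Europe",
}


def region_for_country(country_code: str, default: str = "Global") -> str:
    """Return the region for a country code, or the default when unmapped."""
    return COUNTRY_TO_REGION.get(country_code.strip().upper(), default)


print(region_for_country("gb"))
print(region_for_country("ZZ"))
```

Falling back to a catch-all region for unmapped codes is a design choice assumed here for illustration; raising an error instead would surface missing mappings earlier.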

dp90

I also talked to Bert Kramer about the production projections, and it seems best, for now, to not merge that part back, until the relevant methodology parts have been discussed.
Not sure what the easiest way is to merge part of this PR back, while keeping the remainder, but I'd be happy to help out in any way I can.

As I already started, I did finish the review of the entire PR, so hopefully it's still useful for future reference.

dp90 · 2021-12-29T14:42:12Z

ITR/data/base_providers.py

                          self.column_config.GHG_SCOPE12]]
        ei_at_base = self._get_company_intensity_at_year(base_year, company_ids).rename(self.column_config.BASE_EI)
+        # print(f"BA: company_info.loc[] = {company_info.loc['US0185223007']}")


Remove commented code

dp90 · 2021-12-29T14:44:42Z

ITR/data/base_providers.py

-        get the projected productions for list of companies in ghg_scope12
-        :param ghg_scope12: DataFrame with at least the following columns :
-        ColumnsConfig.COMPANY_ID,ColumnsConfig.GHG_SCOPE12, ColumnsConfig.SECTOR and ColumnsConfig.REGION
+        get the projected productions for list of companies (PRODUCTIONS not S1S2)


"(PRODUCTIONS not S1S2)" might be superfluous

dp90 · 2021-12-29T14:46:17Z

ITR/data/data_warehouse.py

@@ -1,4 +1,4 @@
-from abc import ABC
+from abc import ABC # _project


Consider removing "# _project"

dp90 · 2021-12-29T14:47:27Z

ITR/data/data_warehouse.py

        assert pd.Series(company_ids).isin(df_company_data.loc[:, self.column_config.COMPANY_ID]).all(), \
            "some of the company ids are not included in the fundamental data"

        company_info_at_base_year = self.company_data.get_company_intensity_and_production_at_base_year(company_ids)
+        # print(f"DW: company_info_at_base_year.loc[] = {company_info_at_base_year.loc['US0185223007']}")


Remove commented code

dp90 · 2021-12-29T14:48:28Z

ITR/data/data_warehouse.py

+        # print(f"BUDG:\n{df_company_data.loc[df_company_data.index<40,['company_id',self.column_config.CUMULATIVE_BUDGET]]}\n\n")
+        # print(f"CIABY:\n{company_info_at_base_year.loc[df_company_data.index<40,:]}\n\n")
+        # print(f"""SDA:\n{self.benchmarks_projected_emission_intensity.get_SDA_intensity_benchmarks(
+        #         company_info_at_base_year).loc[df_company_data.index<40,:]}\n\n""")


Remove commented code

dp90 · 2021-12-29T14:49:39Z

ITR/data/data_warehouse.py

+        # print(projected_emission_intensity.index[0:3])
+        # print(projected_emission_intensity.iloc[0:3])
+        # print(projected_production.index[0:3])
+        # print(projected_production.iloc[0:3])


Remove commented code

dp90 · 2021-12-29T14:53:42Z

ITR/data/excel.py

@@ -30,7 +30,7 @@ def convert_benchmark_excel_to_model(df_excel: pd.DataFrame, sheetname: str, col
        result.append(bm)
    return IBenchmarks(benchmarks=result)

-
+# ??? This duplicates info from 


I don't think it duplicates: it refers to tabs of an excel file rather than columns. We could consider moving it to configs.py.
Comment should be removed, I think.

dp90 · 2021-12-29T14:54:40Z

ITR/portfolio_aggregation.py

-                use_S1S2 = (data[self.c.COLS.SCOPE] == EScope.S1S2) | (data[self.c.COLS.SCOPE] == EScope.S1S2S3)
-                use_S3 = (data[self.c.COLS.SCOPE] == EScope.S3) | (data[self.c.COLS.SCOPE] == EScope.S1S2S3)
+                use_S1S2 = (data[self.c.COLS.SCOPE] == PScope.S1S2) | (data[self.c.COLS.SCOPE] == PScope.S1S2S3)
+                use_S3 = (data[self.c.COLS.SCOPE] == PScope.S3) | (data[self.c.COLS.SCOPE] == PScope.S1S2S3)


I like what you did on line 95. We could consider doing the same here.
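
One possible way to condense these masks, as a sketch only: `Series.isin` tests membership in a set of scopes in a single call. The `PScope` enum and the `scope` column below are stand-ins rather than the definitions from this PR, and this may differ from the approach taken on line 95.

```python
from enum import Enum

import pandas as pd


class PScope(Enum):
    # Stand-in for the PR's PScope enum; member values are illustrative.
    S1S2 = "S1+S2"
    S3 = "S3"
    S1S2S3 = "S1+S2+S3"


# Hypothetical data frame with one scope per row.
data = pd.DataFrame({"scope": [PScope.S1S2, PScope.S3, PScope.S1S2S3]})

# Each mask is True where the row's scope is any of the listed members,
# which is equivalent to OR-ing the individual equality comparisons.
use_S1S2 = data["scope"].isin([PScope.S1S2, PScope.S1S2S3])
use_S3 = data["scope"].isin([PScope.S3, PScope.S1S2S3])

print(use_S1S2.tolist())
print(use_S3.tolist())
```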

MichaelTiemannOSC added 5 commits December 25, 2021 14:04

Change filenames to use RMI data

3d1a26e

Also disable Steel sector, for which RMI provides no data.

Sample data for RMI ITR analysis

03006c4

Updated with fixed NOX calculation

ae9c44d

The previous S1+S2 did not correct MMT vs MT as the unit of measure for NOX emissions.

Fix typo

3e42e0e

MichaelTiemannOSC requested a review from joriscram December 26, 2021 23:40

Updated with proper region info for National Grid

d2da6c4

ITR data pipeline now properly uses ISO3166 to put countries in the correct regions (with the help of ESSD's UN region definitions).

dp90 reviewed Dec 29, 2021
