Refactor Belgium DAG for template to generalise to other countries #95

sgreenbury · 2024-05-17T15:41:00Z

From discussion with @yongrenjie as part of #92

Currently the individual census tables are filtered through the use of needed datasets and a corresponding partition.

As begun in #92 (see this section), the config for derived columns can be expanded to include:

Geography level
Aggregation column (IMO just define a (DF -> DF) function that gets called to generate the new statistic)

To enable the above, the type for derivation config (currently: dict[str, tuple[str, list[DerivedColumn]]]) can be updated to include the extra required items.

This could be something like:

from dataclasses import dataclass
from typing import Callable

import pandas as pd

# One per derived table
@dataclass
class DerivedColumn:
    hxltag: str
    aggregation_func: Callable[[pd.DataFrame], pd.DataFrame]
    output_column_name: str
    human_readable_name: str

# One per source table
@dataclass
class MetricDerivationInstructions:
    geography_level: str
    geo_id_col_name: str
    derived_columns: list[DerivedColumn]
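
To make the proposal concrete, here is a minimal sketch of how one source table's instructions might be applied. It repeats the two classes (as dataclasses) so that it runs on its own. Everything else is an assumption for illustration only and not taken from the codebase: the `derive_metrics` helper, the two aggregation functions, the `geo_id` / `age` / `population` column names, the `"municipality"` level and the HXL tag strings. It also assumes each aggregation function returns the geography ID column plus exactly one value column.

```python
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class DerivedColumn:
    hxltag: str
    aggregation_func: Callable[[pd.DataFrame], pd.DataFrame]
    output_column_name: str
    human_readable_name: str


@dataclass
class MetricDerivationInstructions:
    geography_level: str
    geo_id_col_name: str
    derived_columns: list[DerivedColumn]


def sum_children(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical (DF -> DF) function: population aged under 18 per geography."""
    children = df[df["age"] < 18]
    return children.groupby("geo_id", as_index=False)["population"].sum()


def sum_all(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical (DF -> DF) function: total population per geography."""
    return df.groupby("geo_id", as_index=False)["population"].sum()


def derive_metrics(
    source: pd.DataFrame, instructions: MetricDerivationInstructions
) -> pd.DataFrame:
    """Hypothetical helper: run every aggregation and join the results by geography ID."""
    geo_col = instructions.geo_id_col_name
    # Start from one row per geography, in order of first appearance.
    result = source[[geo_col]].drop_duplicates().reset_index(drop=True)
    for col in instructions.derived_columns:
        derived = col.aggregation_func(source)
        # Assumption: the function returns the geo ID column plus one value column.
        value_cols = [c for c in derived.columns if c != geo_col]
        derived = derived.rename(columns={value_cols[0]: col.output_column_name})
        result = result.merge(derived, on=geo_col, how="left")
    return result


# Toy source table standing in for a census table.
source = pd.DataFrame(
    {
        "geo_id": ["A", "A", "B", "B"],
        "age": [10, 40, 5, 17],
        "population": [100, 200, 50, 25],
    }
)

instructions = MetricDerivationInstructions(
    geography_level="municipality",  # hypothetical level name
    geo_id_col_name="geo_id",
    derived_columns=[
        DerivedColumn("#population+children", sum_children, "children_0_17", "Children (0-17)"),
        DerivedColumn("#population+total", sum_all, "population_total", "Total population"),
    ],
)

result = derive_metrics(source, instructions)
print(result)
```

With this shape, adding a new statistic for a country would only mean appending another `DerivedColumn` entry, which is the main appeal of passing a plain function rather than encoding the aggregation in config.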

Also see if needed_datasets + source_metrics assets can be skipped entirely.

Following any refactoring, this pattern should be readily applicable to the other countries in the pipeline that are still to be updated (e.g. Scotland, NI, England/Wales, USA), and to any new countries being added whose data is provided in a way that conforms to this DAG pattern.

sgreenbury · 2024-06-19T13:59:48Z

The original aim of this issue has been superseded by the porting of Northern Ireland in #98. Consider whether to keep it open for incorporating all other census tables as metrics (@andrewphilipsmith for reference).

sgreenbury mentioned this issue May 17, 2024

update output types of publishing assets, add cloud sensors #92

Merged

4 tasks

sgreenbury mentioned this issue Jun 7, 2024

Port Northern Ireland #98

Merged

4 tasks

sgreenbury added a commit that referenced this issue Jul 2, 2024

Merge pull request #128 from Urban-Analytics-Technology-Platform/belg…

f731e54

…ium-class Refactor Belgium to use new Country class (#95)
