Currently the individual census tables are filtered through the use of needed datasets and a corresponding partition.
As begun in #92 (see this section) the config for derived columns can be expanded to include:
Geography level
Aggregation column (IMO just define a (DF -> DF) function that gets called to generate the new statistic)
To enable the above, the type for derivation config (currently: dict[str, tuple[str, list[DerivedColumn]]]) can be updated to include the extra required items.
This could be something like:
    # One per derived table
    class DerivedColumn:
        hxltag: str
        aggregation_func: Callable[[pd.DataFrame], pd.DataFrame]
        output_column_name: str
        human_readable_name: str

    # One per source table
    class MetricDerivationInstructions:
        geography_level: str
        geo_id_col_name: str
        derived_columns: list[DerivedColumn]
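To make the proposal concrete, here is a minimal sketch of how these two classes might be consumed. It is illustrative only: the `derive_metrics` helper, the sample column names (`geo_id`, `age`, `count`), the HXL tags, and the `data_zone` level are all hypothetical, and it assumes each `aggregation_func` returns the geography ID column plus exactly one value column.

```python
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class DerivedColumn:
    hxltag: str
    aggregation_func: Callable[[pd.DataFrame], pd.DataFrame]
    output_column_name: str
    human_readable_name: str


@dataclass
class MetricDerivationInstructions:
    geography_level: str
    geo_id_col_name: str
    derived_columns: list[DerivedColumn]


def children(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical statistic: people aged under 18, summed per geography.
    return df[df["age"] < 18].groupby("geo_id", as_index=False)["count"].sum()


def all_people(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical statistic: everyone, summed per geography.
    return df.groupby("geo_id", as_index=False)["count"].sum()


def derive_metrics(
    source: pd.DataFrame, instructions: MetricDerivationInstructions
) -> pd.DataFrame:
    """Hypothetical driver: run every aggregation and join the results."""
    geo_col = instructions.geo_id_col_name
    derived = []
    for column in instructions.derived_columns:
        result = column.aggregation_func(source)
        # Assumption: the result holds geo_col plus exactly one value column.
        value_col = [c for c in result.columns if c != geo_col][0]
        series = result.set_index(geo_col)[value_col]
        derived.append(series.rename(column.output_column_name))
    return pd.concat(derived, axis=1).rename_axis(geo_col).reset_index()


instructions = MetricDerivationInstructions(
    geography_level="data_zone",  # hypothetical level name
    geo_id_col_name="geo_id",
    derived_columns=[
        DerivedColumn("#population+children", children, "children", "Children (0-17)"),
        DerivedColumn("#population+all", all_people, "all_people", "All people"),
    ],
)

source = pd.DataFrame(
    {
        "geo_id": ["A", "A", "B", "B"],
        "age": [10, 40, 12, 15],
        "count": [5, 7, 3, 4],
    }
)
out = derive_metrics(source, instructions)
```

Keeping the geography level and ID column on the per-table instructions, rather than on each column, means the driver needs no country-specific logic beyond the aggregation functions themselves.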
Also see if needed_datasets + source_metrics assets can be skipped entirely.
Following any refactoring, this pattern should be readily applicable both to other countries still to be updated in the pipeline (e.g. Scotland, NI, England/Wales, USA) and to new countries being added, provided they conform to this DAG pattern for how the data is provided.
The original aim of this issue has been superseded by the porting of Northern Ireland in #98. Consider whether to keep it open for incorporating all other census tables as metrics (@andrewphilipsmith for reference).
From discussion with @yongrenjie as part of #92