Dedup transactions #277

Merged Oct 19, 2023 (6 commits)
12 changes: 11 additions & 1 deletion models/staging/recommended_events/README.md
@@ -27,4 +27,14 @@ models:
+enabled: true
```

Not all recommended events have been implemented. If you need a specific event, please consider creating a pull request with the model that you need in the [dbt-ga4 GitHub repository](https://github.com/Velir/dbt-ga4).

## Purchase Event Transaction Deduplication

The `stg_ga4__event_purchase_deduplicated` model builds on the `stg_ga4__event_purchase` model. It is disabled by default, so it must be enabled together with the `stg_ga4__event_purchase` model.
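
As a sketch of what enabling both models might look like in `dbt_project.yml`: the nesting below assumes the package is installed under the project name `ga4` and that these models are configured the same way as the other recommended events, so adjust it to match your own project.

```yaml
# dbt_project.yml (fragment); the project name `ga4` and the nesting are assumptions
models:
  ga4:
    staging:
      recommended_events:
        stg_ga4__event_purchase:
          +enabled: true
        stg_ga4__event_purchase_deduplicated:
          +enabled: true
```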

On incremental runs, the model only processes purchase events that fall within the window defined by `static_incremental_days`. It can only be relied on to deduplicate purchase events that occur on the same day.
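
To make the window concrete, here is roughly what the model's date filter should compile to when `static_incremental_days` is set to 3. The value 3 is only an illustration, and the line breaks are added for readability. On a full refresh the filter is omitted entirely.

```sql
-- Approximate compiled filter, assuming static_incremental_days = 3
where event_date_dt in (
    current_date,
    date_sub(current_date, interval 1 day),
    date_sub(current_date, interval 2 day),
    date_sub(current_date, interval 3 day)
)
```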

The model is a fast, minimum viable implementation of this feature: it returns only the data from the first purchase event for each `transaction_id` within the processing window.

If this MVP is insufficient for your needs, you are encouraged to copy the model into your own project and customize it there.
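
One possible customization, offered only as a sketch: the model below deduplicates across all available purchase events rather than a date window, and it excludes rows with a null `transaction_id`, which would otherwise fall into a single partition and collapse to one row. The file name, the `table` materialization, and the null filter are illustrative choices and not part of the package.

```sql
-- Hypothetical file in your own project: models/stg_purchase_deduplicated_custom.sql
{{ config(materialized = 'table') }}

select
    *
from {{ ref('stg_ga4__event_purchase') }}
-- Assumption: purchases without a transaction_id should not be deduplicated against each other
where transaction_id is not null
qualify row_number() over (
    partition by transaction_id
    order by event_timestamp
) = 1
```

Scanning the full history on every run costs more than the windowed version, so this trade-off is only worthwhile if cross-day duplicates matter for your reporting.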
@@ -0,0 +1,27 @@
{% if not flags.FULL_REFRESH %}
    {% set partitions_to_query = ['current_date'] %}
    {% for i in range(var('static_incremental_days', 1)) %}
        {% do partitions_to_query.append('date_sub(current_date, interval ' + (i+1)|string + ' day)') %}
    {% endfor %}
{% endif %}

{{
    config(
        enabled = false,
    )
}}
with purch as (
    select
        *
    from {{ ref('stg_ga4__event_purchase') }}
    {% if not flags.FULL_REFRESH %}
    where event_date_dt in ({{ partitions_to_query | join(',') }})
    {% endif %}
    qualify row_number() over (
        partition by transaction_id
        order by event_timestamp
    ) = 1
)
select
    *
from purch