[Bug] Fix concatenation issue with null values creating null rows in Snowflake #19

Merged
merged 16 commits into from
Feb 4, 2025

Conversation

fivetran-avinash
Contributor

@fivetran-avinash fivetran-avinash commented Jan 28, 2025

PR Overview

This PR will address the following Issue/Feature: [#20]

This PR will result in the following new package version: v0.1.0-a6

This ensures null fields get populated, but it shouldn't change the schema.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

Bug Fixes (requires --full-refresh)

  • Applied the coalesce_cast macro to all relevant fields that are concatenated into comment_markdown, since any concatenation with a null value in Snowflake returns null. String fields are coalesced to 'UNKNOWN' and timestamp fields to '1970-01-01 00:00:00', so Snowflake returns text chunks for all comments, including those with null components.
  • Fields are now coalesced in these intermediate models:
    • Hubspot
      • int_rag_hubspot__deal_comment_document: email_title and body (string fields), comment_time (timestamp field).
      • int_rag_hubspot__deal_document: title (string field) and created_on (timestamp field).
    • Jira
      • int_rag_jira__issue_comment_document: comment_body (string field) and comment_time (timestamp field).
      • int_rag_jira__issue_document: title (string field) and created_on (timestamp field).
    • Zendesk
      • int_rag_zendesk__ticket_comment_document: comment_body (string field) and comment_time (timestamp field).
      • int_rag_zendesk__ticket_document: title (string field) and created_on (timestamp field).
  • Corrected syntax errors for the default_variable in stg_rag_hubspot__engagement_email and stg_rag_hubspot__engagement_note.
  • Updated joins to ensure engagement_deal is the base in the int_rag_hubspot__deal_comment_document CTEs.
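
For reviewers who want to see the null-concatenation behavior in isolation, here is a minimal sketch. It assumes SQLite (through Python's `sqlite3`) as a stand-in for Snowflake, since both return null from `||` when any operand is null. The `comments` table, its rows, and the concatenation format are hypothetical, and a plain `coalesce` only approximates what the coalesce_cast macro does. This is not the package's compiled SQL.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; `||` returns NULL in both
# when any operand is NULL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table comments (comment_id integer, comment_body text, comment_time text)"
)
conn.executemany(
    "insert into comments values (?, ?, ?)",
    [
        (1, "Thanks for the update", "2025-01-28 10:00:00"),
        (2, None, "2025-01-28 11:00:00"),  # null body
        (3, "Any news?", None),            # null timestamp
    ],
)

# Naive concatenation: a single NULL component nulls out the whole string.
naive = conn.execute(
    """
    select comment_id, comment_time || ': ' || comment_body as comment_markdown
    from comments
    order by comment_id
    """
).fetchall()

# Coalescing each component first keeps every row populated.
# The default values mirror the ones named in the changelog entry.
coalesced = conn.execute(
    """
    select
        comment_id,
        coalesce(comment_time, '1970-01-01 00:00:00')
            || ': ' || coalesce(comment_body, 'UNKNOWN') as comment_markdown
    from comments
    order by comment_id
    """
).fetchall()

for row in naive:
    print("naive:    ", row)
for row in coalesced:
    print("coalesced:", row)
```

In this sketch, rows 2 and 3 come back with a null comment_markdown from the naive query and with populated text from the coalesced one.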

Under the Hood

  • Updated Hubspot seed files to ensure proper joins on end models.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run --full-refresh && dbt test
  • [NA] dbt run (if incremental models are present) && dbt test

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

I was able to reproduce the issue in Snowflake: running the compiled code with a null value in the concatenation returned nulls. Adding the coalesces to the compiled code produced the full set of rows expected.

The lone validation test worked too.
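
A check along these lines could be scripted as a before/after count of null documents. The sketch below is only an illustration under assumptions: it uses SQLite through Python's `sqlite3` rather than Snowflake, and the `ticket_comments` table, its rows, and the `null_document_count` helper are all hypothetical rather than part of the package or its tests.

```python
import sqlite3


def null_document_count(conn, markdown_expr):
    """Return (total_rows, rows_with_null_document) for a markdown expression.

    Hypothetical helper: markdown_expr is trusted, hard-coded SQL in this
    sketch, never user input.
    """
    query = f"""
        select
            count(*),
            coalesce(sum(case when comment_markdown is null then 1 else 0 end), 0)
        from (select {markdown_expr} as comment_markdown from ticket_comments)
    """
    return conn.execute(query).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("create table ticket_comments (comment_body text, comment_time text)")
conn.executemany(
    "insert into ticket_comments values (?, ?)",
    [
        ("First reply", "2025-01-29 13:00:00"),
        (None, "2025-01-29 13:05:00"),  # null body
        ("Second reply", None),         # null timestamp
        (None, None),                   # both components null
    ],
)

# Before the fix: plain concatenation.
before = null_document_count(conn, "comment_time || ' ' || comment_body")

# After the fix: each component coalesced to a default first.
after = null_document_count(
    conn,
    "coalesce(comment_time, '1970-01-01 00:00:00') || ' ' "
    "|| coalesce(comment_body, 'UNKNOWN')",
)

print("before:", before)
print("after: ", after)
```

The total row count is the same in both runs; only the number of null documents should drop to zero once the coalesces are in place.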

[Image: Screenshot 2025-01-29 at 1 58 31 PM]

If you had to summarize this PR in an emoji, which would it be?

🪹

@fivetran-avinash fivetran-avinash marked this pull request as ready for review January 29, 2025 19:31
@fivetran-avinash fivetran-avinash changed the title from "Union Variable Syntax" to "[Bug] Fix concatenation issue with null values creating null rows in Snowflake" Jan 29, 2025
Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-avinash great work on this PR, and thanks for uncovering and addressing the list of bugs in the model. I have a few more comments before approval. Once these are addressed, this will be ready for pre-release.

integration_tests/dbt_project.yml (2 outdated review threads, resolved)
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

@fivetran-joemarkiewicz Changes addressed!

Collaborator

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

LGTM with one last comment to update README

@@ -1,6 +1,28 @@
# dbt_unified_rag v0.1.0-a6
Collaborator

Apologies, I didn't catch this earlier. Be sure to update the install version in the README to the new -a6 version.

Contributor Author

Good catch!

CHANGELOG.md (3 outdated review threads, resolved)
@fivetran-avinash fivetran-avinash merged commit bc4496b into main Feb 4, 2025
8 checks passed