Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/property value as numeric #22

Merged
merged 34 commits into from
Apr 29, 2024

Conversation

fivetran-avinash
Copy link
Contributor

@fivetran-avinash fivetran-avinash commented Apr 24, 2024

PR Overview

This PR will address the following Issue/Feature: [#20]

This PR will result in the following new package version: 0.7.1

Not a breaking change--it should update field values but not the name itself.

Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:

🪲 Bug Fixes 🪛

  • There were errors customers were encountering where numeric values were not being recognized in property_value, leading to database errors.
  • To solve for that, in stg_klaviyo__event, we cast property_value as a string, used a regex_replace function to retain only numerical values in these strings across all destinations (i.e. 0-9 values and .), then cast back to a numeric to ensure numeric_value was of that data type.

🚘 Under the Hood 🚘

  • Cast property_value in the integration_tests/dbt_project.yml to ensure the field was originally being cast as a string or varchar data type for testing purposes.
  • Updated the event seed file to test for values that aren't numerics.
  • Updated the pull request template.

Contributors

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt run –full-refresh && dbt test
  • [NA] dbt run (if incremental models are present)

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked, tagged, and properly assigned
  • All necessary documentation and version upgrades have been applied
  • docs were regenerated (unless this PR does not include any code or yml updates)
  • BuildKite integration tests are passing
  • Detailed validation steps have been provided below

Detailed Validation

Please share any and all of your validation steps:

We tested across all destinations that (1) the seed file is showing values of various types (normal numerical, numeric type with a $ type, null value), and then made sure that the regex_replace removed the $ expression and left only the numeric values.

@fivetran-avinash fivetran-avinash self-assigned this Apr 24, 2024
@fivetran-avinash fivetran-avinash linked an issue Apr 26, 2024 that may be closed by this pull request
1 task
@fivetran-avinash fivetran-avinash marked this pull request as ready for review April 26, 2024 16:30
Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash this looks great and have no concerns regarding the data quality resulting from these changes! I do have one small suggested change before moving forward to make maintaining the codebase a bit easier going forward.

Let me know if you have any questions. Thanks!

Comment on lines 38 to 44
{% if target.type == 'bigquery' %}
cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), r'[^0-9.]*', '') as {{ dbt.type_numeric() }}) as numeric_value,
{% elif target.type == 'postgres' %}
cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), '[^0-9.]*', '', 'g') as {{ dbt.type_numeric() }}) as numeric_value,
{% else %}
cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), '[^0-9.]*', '') as {{ dbt.type_numeric() }}) as numeric_value,
{% endif %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can consider making this a macro in the package (to then possibly move to fivetran_utils) instead of leveraging conditionals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Macro created, let me know if it all looks good!

Copy link
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Requested changes applied!

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash thanks for creating the macro. I do have a comment around changing the framework of the macro to follow our adapter.dispatch standard which we leverage in fivetran_utils. Let me know if you have any questions.

Comment on lines 1 to 11
{% macro remove_string_from_numeric(column_name) %}

{% if target.type == 'bigquery' %}
cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), r'[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% elif target.type == 'postgres' %}
cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '', 'g') as {{ dbt.type_numeric() }})
{% else %}
cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% endif %}

{% endmacro %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash this is the right idea. However, when creating a macro that is cross warehouse compatibile we don't want to leverage the hard target.type conditional coding logic. Especially if there are plans to one day have this integrated into fivetran_utils. Instead, we want to leverage the adapter.dispatch to handle these operations for us.

Would you be able to rework this macro to follow the adapter.dispatch framework of dispatching, as opposed to using the target.type conditionals? You can find an example of an adapter.dispatch framework macro here. Your macro will look similar, but will follow the new macro format for each warehouse. However, you will be able to leverage the default__ macro as the same as your else version. Let me know if you have any questions.

@@ -35,10 +35,9 @@ rename as (
cast(person_id as {{ dbt.type_string() }} ) as person_id,
type,
uuid,
property_value as numeric_value,
{{ remove_string_from_numeric('property_value') }} as numeric_value,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you make the previous comment updates, you will need to change this to the following:

Suggested change
{{ remove_string_from_numeric('property_value') }} as numeric_value,
{{ klaviyo_source.remove_string_from_numeric('property_value') }} as numeric_value,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated!

Copy link
Contributor Author

@fivetran-avinash fivetran-avinash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-joemarkiewicz Good call-out, changes applied!

Copy link
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fivetran-avinash thanks for making these final adjustments. This looks good to go!

Copy link
Contributor

@fivetran-catfritz fivetran-catfritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved for release!

@fivetran-avinash fivetran-avinash merged commit 1faa6df into main Apr 29, 2024
7 checks passed
@fivetran-avinash fivetran-avinash deleted the bugfix/property-value-as-numeric branch April 29, 2024 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Formatting property_value field in stg_klaviyo__event model
4 participants