Bugfix/property value as numeric #22
Use regexp_replace to ensure property_value is of correct format
Fix `property_value` in `stg_klaviyo__event.sql`
…fivetran/dbt_klaviyo_source into bugfix/property-value-as-numeric
@fivetran-avinash this looks great, and I have no concerns regarding the data quality resulting from these changes! I do have one small suggested change before moving forward, to make the codebase a bit easier to maintain.
Let me know if you have any questions. Thanks!
models/stg_klaviyo__event.sql
Outdated
{% if target.type == 'bigquery' %}
    cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), r'[^0-9.]*', '') as {{ dbt.type_numeric() }}) as numeric_value,
{% elif target.type == 'postgres' %}
    cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), '[^0-9.]*', '', 'g') as {{ dbt.type_numeric() }}) as numeric_value,
{% else %}
    cast(regexp_replace(cast(property_value as {{ dbt.type_string() }}), '[^0-9.]*', '') as {{ dbt.type_numeric() }}) as numeric_value,
{% endif %}
Could we consider making this a macro in the package (to then possibly move to fivetran_utils) instead of leveraging conditionals?
Macro created, let me know if it all looks good!
@fivetran-joemarkiewicz Requested changes applied!
@fivetran-avinash thanks for creating the macro. I do have a comment around changing the framework of the macro to follow our `adapter.dispatch` standard, which we leverage in fivetran_utils. Let me know if you have any questions.
{% macro remove_string_from_numeric(column_name) %}

{% if target.type == 'bigquery' %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), r'[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% elif target.type == 'postgres' %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '', 'g') as {{ dbt.type_numeric() }})
{% else %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% endif %}

{% endmacro %}
@fivetran-avinash this is the right idea. However, when creating a macro that is cross-warehouse compatible, we don't want to rely on hard-coded `target.type` conditional logic, especially if there are plans to one day integrate this into fivetran_utils. Instead, we want to leverage `adapter.dispatch` to handle these operations for us.
Would you be able to rework this macro to follow the `adapter.dispatch` framework, as opposed to using the `target.type` conditionals? You can find an example of an adapter.dispatch framework macro here. Your macro will look similar, but will follow the new macro format for each warehouse. However, you will be able to reuse your `else` branch as the `default__` macro. Let me know if you have any questions.
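For readers unfamiliar with the pattern, the rework requested above could look roughly like the sketch below. It is not the macro that was merged: it only rearranges the three branches already shown in this thread into dbt's `adapter.dispatch` layout, and it assumes the package namespace is `klaviyo_source` (the name used in the call-site suggestion later in this review).

```sql
{# Entry point: dispatches to the adapter-specific implementation. #}
{# Assumption: 'klaviyo_source' is the package namespace to search. #}
{% macro remove_string_from_numeric(column_name) -%}
    {{ adapter.dispatch('remove_string_from_numeric', 'klaviyo_source')(column_name) }}
{%- endmacro %}

{# Fallback for warehouses without a dedicated implementation #}
{# (the former "else" branch). #}
{% macro default__remove_string_from_numeric(column_name) %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% endmacro %}

{# BigQuery: same logic, with the pattern written as a raw string literal. #}
{% macro bigquery__remove_string_from_numeric(column_name) %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), r'[^0-9.]*', '') as {{ dbt.type_numeric() }})
{% endmacro %}

{# Postgres: the 'g' flag is needed to replace every match, not just the first. #}
{% macro postgres__remove_string_from_numeric(column_name) %}
    cast(regexp_replace(cast({{ column_name }} as {{ dbt.type_string() }}), '[^0-9.]*', '', 'g') as {{ dbt.type_numeric() }})
{% endmacro %}
```

One design point to check when adopting this layout: dispatch also follows adapter inheritance, so an adapter that inherits from Postgres (Redshift, for example) would likely resolve to `postgres__` before `default__`, which differs from the `target.type == 'postgres'` conditional. Whether a separate `redshift__` variant is needed is not covered by this sketch.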
models/stg_klaviyo__event.sql
Outdated
@@ -35,10 +35,9 @@ rename as (
 cast(person_id as {{ dbt.type_string() }} ) as person_id,
 type,
 uuid,
-property_value as numeric_value,
+{{ remove_string_from_numeric('property_value') }} as numeric_value,
Once you make the previous comment updates, you will need to change this to the following:
-{{ remove_string_from_numeric('property_value') }} as numeric_value,
+{{ klaviyo_source.remove_string_from_numeric('property_value') }} as numeric_value,
Updated!
@fivetran-joemarkiewicz Good call-out, changes applied!
@fivetran-avinash thanks for making these final adjustments. This looks good to go!
approved for release!
PR Overview
This PR will address the following Issue/Feature: [#20]
This PR will result in the following new package version: 0.7.1
Not a breaking change: it should update field values but not the field name itself.
Please provide the finalized CHANGELOG entry which details the relevant changes included in this PR:
🪲 Bug Fixes 🪛
[…] `property_value`, leading to database errors.
[…] `stg_klaviyo__event`, we cast `property_value` as a string, used a `regexp_replace` function to retain only numerical values in these strings across all destinations (i.e. 0-9 values and .), then cast back to a numeric to ensure `numeric_value` was of that data type.
🚘 Under the Hood 🚘
[…] `property_value` in the `integration_tests/dbt_project.yml` to ensure the field was originally being cast as a string or varchar data type for testing purposes.
[…] `event` seed file to test for values that aren't numerics.
Contributors
PR Checklist
Basic Validation
Please acknowledge that you have successfully performed the following commands locally:
Before marking this PR as "ready for review" the following have been applied:
Detailed Validation
Please share any and all of your validation steps:
We tested across all destinations that (1) the seed file contains values of various types (plain numeric, numeric with a $ sign, null), and (2) regexp_replace removes the $ sign and leaves only the numeric values.
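As an illustration of the kind of check described above, a standalone query along these lines could be run by hand. This is a hypothetical sketch in Postgres syntax, not the package's actual test: the inline `sample_values` rows stand in for the seed data, whose real contents are not shown in this PR.

```sql
-- Hypothetical sample rows: a plain numeric, a value with a $ sign, and a null.
with sample_values as (
    select '12.50' as property_value
    union all
    select '$12.50'
    union all
    select cast(null as text)
)

select
    property_value,
    -- Postgres form from this PR: the 'g' flag strips every non-numeric run.
    cast(regexp_replace(property_value, '[^0-9.]*', '', 'g') as numeric) as numeric_value
from sample_values;
```

A value containing no digits at all would be reduced to an empty string, and casting an empty string to numeric raises an error in Postgres, so it may be worth checking how such values behave on each destination.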