add prefix and suffix arguments to star macro #436

Merged
merged 3 commits into from
Nov 26, 2021


fivetran-jamie
Contributor

This is a:

  • bug fix PR with no breaking changes — please ensure the base branch is master
  • new functionality — please ensure the base branch is the latest dev/ branch
  • a breaking change — please ensure the base branch is the latest dev/ branch

Description & motivation

Just added optional prefix and suffix arguments to the star macro, so that the output fields will all be renamed as prefix ~ name ~ suffix (very similar to the pivot macro).

This is helpful to me in that I can more easily tell where columns came from. Moreover, I have a particular case where I am using star twice in the same CTE, and the columns included in each source relation that I'm joining/selecting from can have custom names. If there's any overlap between the two sources, we'll see an ambiguous column error unless a prefix or suffix distinguishes them.
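
For readers following along, here is a rough Python model of the renaming described above. It is only a sketch: `star_select` is a hypothetical stand-in for the macro, the column and table names are made up, and double-quote quoting is an assumption (the real macro delegates quoting to the adapter).

```python
def star_select(columns, relation_alias=None, excluded=(), prefix="", suffix=""):
    """Model the select list: each kept column is aliased as prefix + name + suffix."""
    excluded_lower = {name.lower() for name in excluded}
    kept = [c for c in columns if c.lower() not in excluded_lower]
    qualifier = f"{relation_alias}." if relation_alias else ""
    # Assumption: identifiers are quoted with double quotes.
    return ",\n  ".join(
        f'{qualifier}"{c}" as "{prefix}{c}{suffix}"' for c in kept
    )


# Two sources share the column name "id"; distinct prefixes keep the
# output column names unique within the same select.
orders = star_select(["id", "status"], relation_alias="o", prefix="order_")
users = star_select(["id", "email"], relation_alias="u", prefix="user_")
print(f"select\n  {orders},\n  {users}\nfrom orders o join users u on o.user_id = u.id")
```

Without the prefixes, both `id` columns would come out under the same name, which is the ambiguity the description refers to.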

Checklist

  • I have verified that these changes work locally on the following warehouses (Note: it's okay if you do not have access to all warehouses, this helps us understand what has been covered)
    • BigQuery
    • Postgres
    • Redshift
    • Snowflake
  • I have "dispatched" any new macro(s) so non-core adapters can also use them (e.g. the star() source). Question: is this just adding the arguments to the dispatch line?
  • I have updated the README.md (if applicable)
  • I have added tests & descriptions to my models (and macros if applicable)
  • I have added an entry to CHANGELOG.md

@joellabes
Contributor

Hey @fivetran-jamie, this makes sense to me! Happy to welcome it into the fold.

Could you add another test in the same vein as the existing one (model.yml, model, seed) to verify that they are behaving the way you expect?

dbt 1.0.0-rc1 is imminent, and so we'll be cutting utils 0.7.4 and 0.8.0 in the next week or so. Would love to sneak this in as well!

@joellabes joellabes linked an issue Nov 10, 2021 that may be closed by this pull request
@fivetran-jamie
Contributor Author

hey @joellabes happy to do so! i'm a little confused by the test files though (this is my first dbt-utils PR 😅 ) -- are there guidelines somewhere on what to add?

@joellabes
Contributor

@fivetran-jamie There aren't (yet) sorry! In short, there's another dbt project inside of this repo called integration_tests. It has a dependency on this folder and Circle CI runs the integration tests project on each of the core 4 DWHs to check everything's OK.

In this case, you'd want to do something along these lines:

  • Make a model called test_star_prefix_suffix.sql which also selects from {{ ref('data_star') }} - basically the same as test_star
  • Instead of doing an exclude as the existing one does, it would pass prefix/suffix args in.
  • Then add a seed file called data_star_prefix_suffix_expected along the same lines as data_star_expected
  • And add another test in the schema.yml file:
  - name: test_star_prefix_suffix
    tests:
      - dbt_utils.equality:
          compare_model: ref('data_star_prefix_suffix_expected')

Then when you push it up, the CI will check everything for you. If it fails, you can't actually see why 😢 so just @ me and I'll check it out for you.

Hope that gets you started!

@joellabes
Contributor

@fivetran-jamie three outta four ain't bad!

The CI error is:

Database Error in test dbt_utils_equality_test_star_prefix_suffix_ref_data_star_prefix_suffix_expected_ (models/sql/schema.yml)
  000904 (42000): SQL compilation error: error line 47 at position 11
  invalid identifier '"prefix_FIELD_1_suffix"'
  compiled SQL at target/run/dbt_utils_integration_tests/models/sql/schema.yml/dbt_utils_equality_test_star_p_451b32d440e658c73dd0373a328e9b28.sql
Encountered an error:
FailFast Error in test dbt_utils_equality_test_star_prefix_suffix_ref_data_star_prefix_suffix_expected_ (models/sql/schema.yml)
  Failing early due to test failure or runtime error

Looks like a casing issue to me? (shakes fist at Snowflake)
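
A minimal sketch of why this reads as a casing issue, assuming the expected seed was created with unquoted column names. Snowflake folds unquoted identifiers to upper case and treats quoted identifiers as case-sensitive; `snowflake_resolve` is a hypothetical helper that models only that rule (it ignores escaped quotes inside identifiers).

```python
def snowflake_resolve(identifier: str) -> str:
    """Return the name Snowflake would look up for an identifier as written in SQL."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Quoted identifier: case is preserved exactly.
        return identifier[1:-1]
    # Unquoted identifier: folded to upper case.
    return identifier.upper()


# Column as written in the lowercase seed header (assumed unquoted on load).
seed_column = snowflake_resolve("prefix_field_1_suffix")
# Column as emitted by the quoted alias in the failing test.
model_column = snowflake_resolve('"prefix_FIELD_1_suffix"')

print(seed_column)
print(model_column)
print(seed_column == model_column)
```

The two names differ, so a quoted reference to the mixed-case alias cannot find a matching column in the expected seed.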

@fivetran-jamie
Contributor Author

darn you snowflake...trying out the target-conditional casing used in https://github.com/dbt-labs/dbt-utils/blob/main/integration_tests/models/sql/test_pivot.sql
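
The idea of target-conditional casing can be sketched in Python as below. This does not reproduce test_pivot.sql; `affix_for_target` and the column names are hypothetical and only illustrate choosing the case of the prefix and suffix from the target type so the alias matches what each warehouse stores.

```python
def affix_for_target(affix: str, target_type: str) -> str:
    """Upper-case the affix on Snowflake only; leave it unchanged elsewhere."""
    return affix.upper() if target_type == "snowflake" else affix


for target_type in ("postgres", "snowflake"):
    prefix = affix_for_target("prefix_", target_type)
    suffix = affix_for_target("_suffix", target_type)
    # Assumption: the seed column is stored upper-case on Snowflake, lower-case elsewhere.
    column = "FIELD_1" if target_type == "snowflake" else "field_1"
    print(target_type, prefix + column + suffix)
```

The conditional lives in the test model here, not in the macro, so the macro's behaviour stays the same on every warehouse.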

@fivetran-jamie
Contributor Author

huzzah!

@joellabes
Contributor

> trying out the target-conditional casing used in https://github.com/dbt-labs/dbt-utils/blob/main/integration_tests/models/sql/test_pivot.sql

I hate it but I love it 💀 it feels very backwards to change the test to pass the reality, but the fact that there's prior art puts my mind a bit more at ease.

I'll give it a test locally before I properly merge it in, but this is looking good! thank you 🌟

@joellabes joellabes removed the pending label Nov 17, 2021
@fivetran-jamie
Contributor Author

yah it didn't feel good to add that lol -- happy to investigate other solutions if needed, just came across this one first 🤷‍♂️

also thanks for laying out everything so clearly!

@fivetran-jamie
Contributor Author

hm would it be best to automatically capitalize the prefix + suffix on snowflake warehouses? ie

{% macro star(from, relation_alias=False, except=[], prefix='', suffix='') -%}
    {{ return(adapter.dispatch('star', 'dbt_utils')(from, relation_alias, except, prefix, suffix)) }}
{% endmacro %}

{% macro default__star(from, relation_alias=False, except=[], prefix='', suffix='') -%}
    {%- do dbt_utils._is_relation(from, 'star') -%}
    {%- do dbt_utils._is_ephemeral(from, 'star') -%}

    {#-- Prevent querying of db in parsing mode. This works because this macro does not create any new refs. #}
    {%- if not execute -%}
        {{ return('') }}
    {% endif %}

    {%- set include_cols = [] %}
    {%- set cols = adapter.get_columns_in_relation(from) -%}
    {%- set except = except | map("lower") | list %}
    {%- for col in cols -%}

        {%- if col.column|lower not in except -%}
            {% do include_cols.append(col.column) %}

        {%- endif %}
    {%- endfor %}

    {#- Proposed change: upper-case the prefix and suffix on Snowflake only -#}
    {% set prefix_cased = prefix | upper if target.type == 'snowflake' else prefix %}
    {% set suffix_cased = suffix | upper if target.type == 'snowflake' else suffix %}

    {%- for col in include_cols %}

        {%- if relation_alias %}{{ relation_alias }}.{% else %}{%- endif -%}{{ adapter.quote(col)|trim }} as {{ adapter.quote(prefix_cased ~ col ~ suffix_cased)|trim }}
        {%- if not loop.last %},{{ '\n  ' }}{% endif %}

    {%- endfor -%}
{%- endmacro %}

@joellabes
Contributor

> hm would it be best to automatically capitalize the prefix + suffix on snowflake warehouses

@fivetran-jamie I don't think so - in some circumstances (with explicit quoting for example) Snowflake can support lowercase. The hack in the tests is because the test suite doesn't behave that way, but the seed is lowercase. It doesn't feel good to change the behaviour in the macro itself

One day I'd like to flesh out the test suite to cover more variations of quoting behaviour, but that would require another warehouse connection etc so I'm not in a rush to do that 😬

@joellabes joellabes changed the base branch from main to next/minor November 26, 2021 01:23
Contributor

@joellabes joellabes left a comment


🎉 Thanks for the contribution!

@DataGuru2023

DataGuru2023 commented Jun 7, 2024

Hi,
I used the macro as shown below, and it generates SQL that the database rejects: the alias and the trim() call are reversed.
Actual behaviour -> BOOKINGCMB2 as trim(BOOKINGCMB2)
Desired behaviour -> trim(BOOKINGCMB2) as BOOKINGCMB2
The intent was to apply trim() to every source column to remove any spaces.

//Macro usage
select
{{ dbt_utils.star(from=ref('saviom_outboxbooking'),prefix='trim(', suffix=')',quote_identifiers=False) }}
from {{ ref('dev_table') }}
where DBT_VALID_TO is null

//output
SELECT
BOOKINGCMB2 as trim(BOOKINGCMB2),
BOOKINGCMB3 as trim(BOOKINGCMB3),
BOOKINGCMB4 as trim(BOOKINGCMB4),
BOOKINGCMB5 as trim(BOOKINGCMB5),
USERCMB1 as trim(USERCMB1),
PROJECTACTIVECLOSEIND as trim(PROJECTACTIVECLOSEIND)
from DEV.TABLE
where DBT_VALID_TO is null
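
The output above is consistent with how the arguments were introduced in this PR: prefix and suffix decorate the output alias, not the column expression. A small Python model of both shapes follows; `alias_with_affixes` mirrors the behaviour seen in the output, while `wrap_with_function` is a hypothetical helper (not part of dbt_utils) that produces the desired form.

```python
def alias_with_affixes(columns, prefix="", suffix=""):
    """Observed behaviour: prefix and suffix are applied to the alias only."""
    return ",\n".join(f"{c} as {prefix}{c}{suffix}" for c in columns)


def wrap_with_function(columns, function="trim"):
    """Hypothetical helper: apply a function to each column and keep its original name."""
    return ",\n".join(f"{function}({c}) as {c}" for c in columns)


columns = ["BOOKINGCMB2", "USERCMB1"]
print(alias_with_affixes(columns, prefix="trim(", suffix=")"))
print(wrap_with_function(columns))
```

In a dbt model, the second shape would likely need an explicit loop over the column names (for example a list from dbt_utils.get_filtered_columns_in_relation, if your dbt-utils version provides it) instead of star with prefix and suffix.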

Development

Successfully merging this pull request may close these issues.

Add prefix param to the star macro
3 participants