Generalized and extensible pattern for incremental_strategy #2366

Closed
jtcohen6 opened this issue Apr 28, 2020 · 3 comments
Labels
enhancement New feature or request incremental Incremental modeling with dbt stale Issues that have gone stale

Comments

@jtcohen6
Contributor

Describe the feature

Based on the way we've implemented incremental_strategy in Snowflake, BigQuery, and Spark, users who wish to add their own incremental strategies need to override both:

  • the dbt_[adapter]_validate_get_incremental_strategy macro
  • the materialization itself, where it consumes the strategy and calls the requisite DML macro (get_merge_sql, get_delete_insert_merge_sql, get_insert_overwrite_sql, or get_insert_overwrite_merge_sql).

What would be really cool is a centralized macro that:

  • enumerates the available strategies, which can then be used to validate the model config
  • maps each strategy to a "build_sql" macro, to be called by the materialization

For example, it could look something like:

{% macro dbt_bigquery_get_incremental_strategy() %}
  {% set strategies = [
    { 'strategy': 'merge', 'build_sql_macro_name': 'get_merge_sql' },
    { 'strategy': 'insert_overwrite', 'build_sql_macro_name': 'bq_insert_overwrite' }
  ] %}
  {{ return(strategies) }}
{% endmacro %}

Which I could then override locally to be:

{% macro dbt_bigquery_get_incremental_strategy() %}
  {% set strategies = [
    { 'strategy': 'merge', 'build_sql_macro_name': 'get_merge_sql' },
    { 'strategy': 'insert_overwrite', 'build_sql_macro_name': 'bq_insert_overwrite' },
    { 'strategy': 'my_custom_strategy', 'build_sql_macro_name': 'my_custom_merge_macro' }
  ] %}
  {{ return(strategies) }}
{% endmacro %}
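
One way a materialization might consume such a list is to validate the configured strategy against it and then look up the matching macro name. Below is a minimal sketch of that lookup in plain Python, for illustration rather than as a dbt macro. The helper name resolve_build_sql_macro_name and the use of ValueError are assumptions, not anything dbt provides.

```python
# Illustrative only: a plain-Python model of the strategy list from the macros above.
# The helper name and the error type are assumptions, not part of dbt.
from typing import Dict, List

Strategy = Dict[str, str]

DEFAULT_STRATEGIES: List[Strategy] = [
    {"strategy": "merge", "build_sql_macro_name": "get_merge_sql"},
    {"strategy": "insert_overwrite", "build_sql_macro_name": "bq_insert_overwrite"},
]


def resolve_build_sql_macro_name(strategies: List[Strategy], configured: str) -> str:
    """Validate the configured strategy and return its build_sql macro name."""
    by_name = {entry["strategy"]: entry["build_sql_macro_name"] for entry in strategies}
    if configured not in by_name:
        valid = ", ".join(sorted(by_name))
        raise ValueError(
            f"Invalid incremental strategy '{configured}'. Expected one of: {valid}"
        )
    return by_name[configured]
```

Because the same list drives both validation and dispatch, adding a custom strategy would mean overriding one list rather than two separate pieces.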

I'm not sure exactly how we could handle the different keyword arguments each "build_sql" macro needs. Possibly one of:

  • The macro above also declares the full signature of each "build_sql" macro, which the materialization uses to figure out which keyword arguments to pass along (feels fragile)
  • We establish a common contract for the class of incremental "build_sql" macros (hard but straightforward)
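
To make the second option concrete, here is a rough sketch, again in plain Python rather than dbt Jinja, of what a common contract could look like: every builder accepts the same argument bundle and ignores the fields it does not need. The BuildSqlArgs fields, the builder names, and the placeholder strings they return are all hypothetical; real "build_sql" macros would emit actual DML.

```python
# Illustrative only: one shared argument bundle for every "build_sql" builder.
# All names and fields here are assumptions made for the sake of the sketch.
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class BuildSqlArgs:
    target: str
    source: str
    dest_columns: Tuple[str, ...]
    unique_key: Optional[str] = None


BuildSql = Callable[[BuildSqlArgs], str]


def merge_placeholder(args: BuildSqlArgs) -> str:
    # Uses unique_key; a real builder would render a merge statement.
    return f"-- merge {args.source} into {args.target} on {args.unique_key}"


def insert_overwrite_placeholder(args: BuildSqlArgs) -> str:
    # Ignores unique_key, but still accepts the same bundle.
    return f"-- overwrite {args.target} from {args.source}"


BUILDERS: Dict[str, BuildSql] = {
    "merge": merge_placeholder,
    "insert_overwrite": insert_overwrite_placeholder,
}


def build_sql(strategy: str, args: BuildSqlArgs) -> str:
    """Dispatch to the builder for the strategy with the shared argument bundle."""
    return BUILDERS[strategy](args)
```

With a single signature, the materialization would never need to know which arguments a given strategy uses, which is what makes this option less fragile than declaring signatures in the registry.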

Describe alternatives you've considered

This works today! The introduction of incremental_strategy has empowered several analysts and data engineers who want to write custom DML to do so in a more dbtonic way. It just requires a lot more copy-paste than it should.

Additional context

Relevant databases: Snowflake, BigQuery, Spark (so far)

Who will this benefit?

  • More advanced users
  • Organizations with bigger or more complex datasets, for whom the builtin incremental strategies are lacking in precise ways
@visch

visch commented May 19, 2021

Love this, and wanted to add an idea based on an issue I ran into.

Incremental strategy not implemented: if a downstream adapter (in my case msftsql) doesn't implement an incremental strategy (in my case merge), there should be some indication that this is the case. Not sure if a compiler exception is warranted, but at a minimum there should be a WARN-level log!
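
A rough sketch of that check, in plain Python for illustration only: warn by default when the adapter lacks the requested strategy, and raise only in a strict mode. The function name, the strict flag, and the set of implemented strategies are hypothetical and do not correspond to any existing dbt API.

```python
# Illustrative only: warn (or fail in strict mode) on an unimplemented strategy.
# The function, its parameters, and the exception type are assumptions.
import warnings
from typing import Set


def check_strategy_supported(
    adapter_name: str, strategy: str, implemented: Set[str], strict: bool = False
) -> bool:
    """Return True if the adapter implements the strategy; otherwise warn or raise."""
    if strategy in implemented:
        return True
    message = (
        f"Adapter '{adapter_name}' does not implement "
        f"incremental strategy '{strategy}'"
    )
    if strict:
        raise NotImplementedError(message)
    warnings.warn(message, stacklevel=2)
    return False
```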

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Apr 27, 2022
@github-actions
Contributor

github-actions bot commented May 4, 2022

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

@github-actions github-actions bot closed this as completed May 4, 2022