Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use common columns for incremental schema changes #4170

Merged
merged 4 commits into from
Nov 7, 2021
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,12 @@
### Under the hood
- Bump artifact schema versions for 1.0.0: manifest v4, run results v4, sources v3. Notable changes: schema test + data test nodes are renamed to generic test + singular test nodes; freshness threshold default values ([#4191](https://github.com/dbt-labs/dbt-core/pull/4191))
- Speed up node selection by skipping `incorporate_indirect_nodes` if not needed ([#4213](https://github.com/dbt-labs/dbt-core/issues/4213), [#4214](https://github.com/dbt-labs/dbt-core/issues/4214))
- [SF, BQ] When on_schema_change is set, pass common columns as dest_columns in incremental merge macros ([#4144](https://github.com/dbt-labs/dbt-core/issues/4144))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- [SF, BQ] When on_schema_change is set, pass common columns as dest_columns in incremental merge macros ([#4144](https://github.com/dbt-labs/dbt-core/issues/4144))
- When on_schema_change is set, pass common columns as dest_columns in incremental merge macros ([#4144](https://github.com/dbt-labs/dbt-core/issues/4144), [#4170](https://github.com/dbt-labs/dbt-core/pull/4170))


Contributors:
- [@kadero](https://github.com/kadero) ([3955](https://github.com/dbt-labs/dbt-core/pull/3955))
- [@frankcash](https://github.com/frankcash) ([4136](https://github.com/dbt-labs/dbt-core/pull/4136))
- [@Kayrnt](https://github.com/Kayrnt) ([4136](https://github.com/dbt-labs/dbt-core/pull/4170))

## dbt-core 1.0.0b2 (October 25, 2021)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,15 +15,15 @@

{% endmacro %}

{% macro diff_columns(source_columns, target_columns) %}
{% macro compare_columns(source_columns, target_columns, should_include) %}

{% set result = [] %}
{% set source_names = source_columns | map(attribute = 'column') | list %}
{% set target_names = target_columns | map(attribute = 'column') | list %}

{# --check whether the name attribute exists in the target - this does not perform a data type check #}
{% for sc in source_columns %}
{% if sc.name not in target_names %}
{% if (sc.name in target_names) == should_include %}
{{ result.append(sc) }}
{% endif %}
{% endfor %}
Expand All @@ -32,6 +32,14 @@

{% endmacro %}

{% macro diff_columns(source_columns, target_columns) %}
{{ return(compare_columns(source_columns, target_columns, false) ) }}
{% endmacro %}

{% macro intersect_columns(source_columns, target_columns) %}
{{ return(compare_columns(source_columns, target_columns, true) ) }}
{% endmacro %}

{% macro diff_column_data_types(source_columns, target_columns) %}

{% set result = [] %}
Expand All @@ -57,7 +65,8 @@
{%- set target_columns = adapter.get_columns_in_relation(target_relation) -%}
{%- set source_not_in_target = diff_columns(source_columns, target_columns) -%}
{%- set target_not_in_source = diff_columns(target_columns, source_columns) -%}

{%- set in_target_and_source = intersect_columns(target_columns, source_columns) -%}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add in source_columns here, as that's what we'll want to use for dest_columns after process_schema_changes has processed


{% set new_target_types = diff_column_data_types(source_columns, target_columns) %}

{% if source_not_in_target != [] %}
Expand All @@ -72,6 +81,8 @@
'schema_changed': schema_changed,
'source_not_in_target': source_not_in_target,
'target_not_in_source': target_not_in_source,
'in_target_and_source': in_target_and_source,
'target_columns': target_columns,
'new_target_types': new_target_types
} %}

Expand Down Expand Up @@ -132,7 +143,11 @@

{% macro process_schema_changes(on_schema_change, source_relation, target_relation) %}

{% if on_schema_change != 'ignore' %}
{% if on_schema_change == 'ignore' %}

{{ return({}) }}

{% else %}

{% set schema_changes_dict = check_for_schema_changes(source_relation, target_relation) %}

Expand All @@ -158,6 +173,8 @@
{% endif %}

{% endif %}

{{ return(schema_changes_dict) }}

{% endif %}

Expand Down