## Column order matters

In Spark, tables store their partition columns last. In the scenario featured in our integration test, given a seed file `seed` and an incremental model `incremental_relation` partitioned on `id`, the resulting table moves the partition column `id` to the end of its schema. In subsequent incremental runs, dbt would attempt to insert the rows of `seed` into `incremental_relation`; since the columns in `seed` are in a different order from the columns in `incremental_relation`, the result is mismatched data.
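A minimal sketch of the reordering behavior, with a hypothetical two-column schema standing in for the real one:

```sql
-- Hypothetical minimal reproduction of Spark moving partition columns last.
create table dbt_jcohen.seed (id int, first_name string) using parquet;

create table dbt_jcohen.incremental_relation
using parquet
partitioned by (id)
as select * from dbt_jcohen.seed;

-- describe dbt_jcohen.incremental_relation now lists:
--   first_name  string
--   id          int     -- partition column, moved to the end

-- A positional insert, as dbt issues on incremental runs, therefore misaligns:
insert into table dbt_jcohen.incremental_relation
select * from dbt_jcohen.seed;  -- (id, first_name) values land in (first_name, id)
```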
## Why hasn't the integration test been failing?

The `equality` test between `seed` and `incremental_relation` has been passing because we didn't have the right quoting character defined for Spark. `"` is the default quoting character in dbt-core; in Spark, `"` encloses a string literal, not a column name.

Therefore, a query like
```sql
-- setup
with a as (
    select * from dbt_jcohen.incremental_relation
),
b as (
    select * from dbt_jcohen.seed
),
a_minus_b as (
    select "first_name", "last_name", "email", "gender", "ip_address", "id", "# Partition Information", "# col_name", "id" from a
    except
    select "first_name", "last_name", "email", "gender", "ip_address", "id", "# Partition Information", "# col_name", "id" from b
),
b_minus_a as (
    select "first_name", "last_name", "email", "gender", "ip_address", "id", "# Partition Information", "# col_name", "id" from b
    except
    select "first_name", "last_name", "email", "gender", "ip_address", "id", "# Partition Information", "# col_name", "id" from a
),
unioned as (
    select * from a_minus_b
    union all
    select * from b_minus_a
),
final as (
    select
        (select count(*) from unioned) +
        (select abs(
            (select count(*) from a_minus_b) -
            (select count(*) from b_minus_a)
        ))
        as count
)
select count from final
```
Looks okay prima facie. There are some metadata/comment column names included (`# Partition Information`, `# col_name`), which weirdly isn't erroring. I thought to run just the snippet
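Presumably this was the quoted select list from the query above, something like:

```sql
select "first_name", "last_name", "email", "gender", "ip_address", "id",
       "# Partition Information", "# col_name", "id"
from dbt_jcohen.incremental_relation
```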
Which returns
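Given that Spark reads each double-quoted name as a string literal, this would presumably be the same constant row repeated once per row scanned, roughly:

```
first_name  last_name  email  gender  ip_address  id  # Partition Information  # col_name  id
first_name  last_name  email  gender  ip_address  id  # Partition Information  # col_name  id
...
```

Both `a` and `b` yield these identical literal rows, so `a_minus_b` and `b_minus_a` are always empty and the equality test can never fail.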
Yeah.
## Solutions
- Use `` ` `` instead of `"` as the quoting character (handled in Pull the owner from the DESCRIBE EXTENDED #39)
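For reference, the difference between the two quoting characters in Spark SQL:

```sql
select "id" from dbt_jcohen.seed;  -- returns the literal string 'id', once per row
select `id` from dbt_jcohen.seed;  -- returns the values of the id column
```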