Remove `columns` schema redundancy for external sources #47
Thanks! Looks good, just a small comment on the README but ok to merge once that's changed
Looks good. Integration tests passing locally. I am working on another bug that appeared in a new release but hopefully can release this in the next few days. Thanks for the contribution @bokhi 🎉
Hey @bokhi, sorry for the delay. This has been released in v0.7.3. Thanks again!
* Add support for INTERVAL column type * Add support for JSON column type * Add integration test for column types * Improve schema parsing from BigQuery client (autotraderuk#29) * Update release process * Release v0.6.3 * Support for dbt v1.4 (autotraderuk#30) * Upgrade to dbt 1.4 * Update CHANGES.md * Release v0.6.4 * Add --skip-not-compiled flag (autotraderuk#32) * Add --skip-not-compiled flag * Release v0.6.5 * Add --version parameter to print dry run version * Support python 3.10 * Update poetry.lock * Ignore more deprecation warnings * Ignore invalid escape sequences * Integration tests for incremental models * Support Python 3.11 (autotraderuk#37) Had to ignore deprecation warning of "cgi" core module being use by google-cloud-storage (transitive dependency) Integration and unit tests ran and are passing * Add extra-check-columns-metadata-key option (autotraderuk#36) * Release 0.6.6 Minor improvements and new CLI options * Support dbt 1.5 (autotraderuk#40) * Upgrade dbt to 1.5 and fix failing tests * target-path in project has been deprecated * Add --threads override option * Release v0.6.7 * Add compatibility with dbt 1.6rc1 * Update to 1.6.0 * Release 0.6.8 * --full-refresh and --target-path CLI flags support (autotraderuk#44) * add support for cli flag --full-refresh expose it as a global flag get predicted/model schema for full-refresh nodes * wire dbt --target-path cli flag allows integration tests to have multiple project contexts running at the same time without conflicting targets * add full_refresh support derived from dbt model spec as well * test full refresh precedence between cli flag and model config * verify and update readme * rename integration tests to make it clearer * refactor full refresh precedence to match dbt docs definition * update lock file and changes.md * Release 0.7.0 * Refactor model runner to split by materialization * Check incremental data types are compatible (autotraderuk#45) * Extra dry run to verify type compatibility * 
Refactor incremental runner unit tests * Struct integration test * Add changelog * Release v0.7.1 * Fix run-integration.sh writing to wrong target * Use column_types config for seeds (autotraderuk#46) * Use adapter to convert agate types for seeds * Print schema if node success when failure expected * Load `column_types` when dry running seeds * Add changelog * Release v0.7.2 * Remove `columns` schema redundancy for external sources (autotraderuk#47) * Respect existing column ordering for incremental models (autotraderuk#50) * Don't run merge if incremental has recursive CTES (autotraderuk#51) * Collate changes for 0.7.3 * Release v0.7.3 * fix false failure when require partition filter (autotraderuk#56) fix filtered_partition_date Co-authored-by: Maliek Borwin <maliek.borwin@autotrader.co.uk> * Changes for v0.7.4 * Release v0.7.4 * Fix problem where sql_header interacts with merge * Release v0.7.5 * merge origin dbt-dry-run updated code into migo dbt-dry-run * modify pyproject.toml pydantic dependency version to at least 1.10.8 --------- Co-authored-by: Philippa Main <philippa.main@autotrader.co.uk> Co-authored-by: connor-charles <75633736+connor-charles@users.noreply.github.com> Co-authored-by: Connor Charles <Connor.Charles@autotrader.co.uk> Co-authored-by: zachary-povey <64191599+zachary-povey@users.noreply.github.com> Co-authored-by: Connor Charles <ccharles.gb@gmail.com> Co-authored-by: Angelos Georgiadis <a2gelosgeo@gmail.com> Co-authored-by: Angelos Georgiadis <Angelos.Georgiadis@autotrader.co.uk> Co-authored-by: bokhi <martin.boissier@gmail.com> Co-authored-by: malik016 <63663632+malik9153@users.noreply.github.com> Co-authored-by: Maliek Borwin <maliek.borwin@autotrader.co.uk> Co-authored-by: bruce_huang <bruce_huang@migocorp.com>
Description
For some external sources, such as Google Spreadsheets, the schema can be explicitly defined.
In the current implementation, if the `columns` schema is defined for an external source, it also has to be duplicated in the `dry_run_columns` section.

This PR removes that redundancy: if the `columns` schema is already defined for an external source, we simply rely on it instead of expecting it to also be defined in `dry_run_columns`.
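
The fallback described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the function name `resolve_external_columns` and the plain-dict shape of the source definition (a top-level `columns` list and an `external` block holding `dry_run_columns`) are assumptions made for the example.

```python
from typing import Dict, List

Column = Dict[str, str]


def resolve_external_columns(source: dict) -> List[Column]:
    """Pick the schema to use for an external source.

    Hypothetical helper: prefer the explicitly defined `columns`,
    and only fall back to `dry_run_columns` when `columns` is absent.
    """
    columns = source.get("columns") or []
    if columns:
        # `columns` is already defined, so no duplication is required.
        return [{"name": c["name"], "data_type": c["data_type"]} for c in columns]

    # Assumed location of the fallback schema for this sketch.
    external = source.get("external") or {}
    fallback = external.get("dry_run_columns") or []
    if not fallback:
        raise ValueError(
            "External source defines neither `columns` nor `dry_run_columns`"
        )
    return [{"name": c["name"], "data_type": c["data_type"]} for c in fallback]


if __name__ == "__main__":
    sheet_source = {
        "name": "my_sheet",  # hypothetical source name
        "columns": [{"name": "id", "data_type": "INT64"}],
        "external": {"location": "placeholder"},
    }
    print(resolve_external_columns(sheet_source))
```

With this ordering, a source that declares `columns` never needs a matching `dry_run_columns` entry, while sources that only declare `dry_run_columns` keep working as before.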
Checklist:
- Ran `make verify` and fixed any linting or test errors
- Ran `make integration` against a BigQuery instance