Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid argument (full_refresh) passed to DbtTestAwsEksOperator (and others) #1175

Merged
merged 6 commits into from
Sep 21, 2024

Conversation

johnhoran
Copy link
Contributor

@johnhoran johnhoran commented Aug 23, 2024

#590 added a fix to consume the kwargs full_refresh_ignore if it wasn't consumed by a higher class as it was preventing the use of test in the DbtTaskGroup if full_refresh_ignore was set. The previous patch fixed this by consuming the variable for the DbtLocalBaseOperator, leaving a bug in kubernetes and docker operator. Since AbstractDbtBaseOperator has been added as a base of DbtDockerBaseOperator, DbtKubernetesBaseOperator and DbtLocalBaseOperator, moving the code there will fix all three.

Fixes #1062

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Aug 23, 2024
Copy link

netlify bot commented Aug 23, 2024

Deploy Preview for sunny-pastelito-5ecb04 canceled.

Name Link
🔨 Latest commit f5d2708
🔍 Latest deploy log https://app.netlify.com/sites/sunny-pastelito-5ecb04/deploys/66ee97b2a4da0c00085f7ae1

@dosubot dosubot bot added the execution:local Related to Local execution environment label Aug 23, 2024
@johnhoran johnhoran changed the title move param consumer Move full_refresh_ignore param consumer Aug 23, 2024
@johnhoran
Copy link
Contributor Author

johnhoran commented Aug 23, 2024

Fixes #1062

Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnhoran thank you very much for making this contribution!

Could you add a unittest that covers that scenario that stopped working - so we make sure we don't have any regressions in this area in future? Perhaps it could be on an EKS or K8s operator, since that's where the issue was first spotted.

Also, could you please update the PR description to include "Fixes #1062" or "Closes #1062", so when we merge the PR the related issue is closed automagically?

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Aug 23, 2024
@johnhoran
Copy link
Contributor Author

@tatiana sure, I've added a test. My strategy there was to have an exception thrown after the kubernetes manifest is created and then examine the manifest to check the correct arguments are being passed to DBT.

@pankajkoti
Copy link
Contributor

hi @johnhoran could you please rebase the PR. We just merged #1182 that should fix the breaking CI jobs.

@johnhoran
Copy link
Contributor Author

Thanks @pankajkoti

Copy link

codecov bot commented Aug 30, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.95%. Comparing base (948c3fc) to head (f5d2708).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1175   +/-   ##
=======================================
  Coverage   95.95%   95.95%           
=======================================
  Files          64       64           
  Lines        3656     3656           
=======================================
  Hits         3508     3508           
  Misses        148      148           
Flag Coverage Δ
95.95% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tatiana tatiana changed the title Move full_refresh_ignore param consumer Fix "invalid arguments were passed to DbtTestAwsEksOperator" error (full_refresh) Sep 21, 2024
@tatiana tatiana changed the title Fix "invalid arguments were passed to DbtTestAwsEksOperator" error (full_refresh) Fix invalid argument (full_refresh) passed to DbtTestAwsEksOperator" Sep 21, 2024
@tatiana tatiana changed the title Fix invalid argument (full_refresh) passed to DbtTestAwsEksOperator" Fix invalid argument (full_refresh) passed to DbtTestAwsEksOperator (and others) Sep 21, 2024
Copy link
Collaborator

@tatiana tatiana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, @johnhoran , thank you very much for reporting and fixing the issue! We'll release this as part of Cosmos 1.7

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 21, 2024
@tatiana tatiana added this to the Cosmos 1.7.0 milestone Sep 21, 2024
@tatiana tatiana merged commit 11de5ba into astronomer:main Sep 21, 2024
65 checks passed
ags-de pushed a commit to ags-de/astronomer-cosmos that referenced this pull request Sep 24, 2024
… (and others) (astronomer#1175)

astronomer#590 added a fix to
consume the kwargs `full_refresh_ignore` if it wasn't consumed by a
higher class as it was preventing the use of test in the DbtTaskGroup if
`full_refresh_ignore` was set. The previous patch fixed this by
consuming the variable for the `DbtLocalBaseOperator`, leaving a bug in
kubernetes and docker operator. Since `AbstractDbtBaseOperator` has been
added as a base of `DbtDockerBaseOperator`, `DbtKubernetesBaseOperator`
and `DbtLocalBaseOperator`, moving the code there will fix all three.

Fixes astronomer#1062

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
@tatiana tatiana mentioned this pull request Oct 2, 2024
@johnhoran johnhoran deleted the full_refresh_ignore branch October 2, 2024 13:45
tatiana added a commit that referenced this pull request Oct 4, 2024
New Features

* Introduction of experimental support to run dbt BQ models using Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in #1224 #1230.
  This is a first step in this journey and we would really appreciate feedback from the community.

  For more information, check the documentation: https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental

  This work has been inspired by the talk "Airflow at Monzo: Evolving our data platform as the bank scales" by
  @jonathanrainer @ed-sparkes given at Airflow Summit 2023: https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/.

* Support using ``DatasetAlias`` and fix orphaning unreferenced dataset by @tatiana in #1217 #1240

  Documentation: https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling

* Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153

  Learn more about it: https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html

Enhancements

* Create single virtualenv when ``DbtVirtualenvBaseOperator`` has ``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by @kesompochy in #1200
* Consistently handle build and imports in ``cosmos/__init__.py`` by @tatiana in #1215
* Add enum constants to init for direct import by @fabiomx in #1184

Bug fixes

* URL encode dataset names to support multibyte characters by @t0momi219 in #1198
* Fix invalid argument (``full_refresh``) passed to DbtTestAwsEksOperator (and others) by @johnhoran in #1175
* Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by @jessicaschueler in #1191
* Fix task owner fallback by @jmaicher in #1195

Docs

* Add scarf to readme and docs for website analytics by @cmarteepants in #1221
* Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by @pankajkoti in #1173
* Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174
* Refactor docs for async mode execution by @pankajkoti in #1241

Others

* Remove PR branch added for testing a change in CI in #1224 by @pankajkoti in #1233
* Fix CI wrt broken coverage upload artifact @pankajkoti in #1210
* Fix CI issues - Upgrade actions/upload-artifact & actions/download-artifact to v4 and set min version for packaging by @pankajkoti in #1208
* Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in #1182
* CI: Update GCP manifest file path based on new secret update by @pankajkoti in #1237
* Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231
tatiana added a commit that referenced this pull request Oct 4, 2024
New Features

* Introduction of experimental support to run dbt BQ models using Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in #1224 #1230.
  This is a first step in this journey and we would really appreciate feedback from the community.

  For more information, check the documentation: https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental

  This work has been inspired by the talk "Airflow at Monzo: Evolving our data platform as the bank scales" by
  @jonathanrainer @ed-sparkes given at Airflow Summit 2023: https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/.

* Support using ``DatasetAlias`` and fix orphaning unreferenced dataset by @tatiana in #1217 #1240

  Documentation: https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling

* Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153

  Learn more about it: https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html

Enhancements

* Create single virtualenv when ``DbtVirtualenvBaseOperator`` has ``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by @kesompochy in #1200
* Consistently handle build and imports in ``cosmos/__init__.py`` by @tatiana in #1215
* Add enum constants to init for direct import by @fabiomx in #1184

Bug fixes

* URL encode dataset names to support multibyte characters by @t0momi219 in #1198
* Fix invalid argument (``full_refresh``) passed to DbtTestAwsEksOperator (and others) by @johnhoran in #1175
* Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by @jessicaschueler in #1191
* Fix task owner fallback by @jmaicher in #1195

Docs

* Add scarf to readme and docs for website analytics by @cmarteepants in #1221
* Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by @pankajkoti in #1173
* Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174
* Refactor docs for async mode execution by @pankajkoti in #1241

Others

* Remove PR branch added for testing a change in CI in #1224 by @pankajkoti in #1233
* Fix CI wrt broken coverage upload artifact @pankajkoti in #1210
* Fix CI issues - Upgrade actions/upload-artifact & actions/download-artifact to v4 and set min version for packaging by @pankajkoti in #1208
* Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in #1182
* CI: Update GCP manifest file path based on new secret update by @pankajkoti in #1237
* Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231
tatiana added a commit that referenced this pull request Oct 4, 2024
**New Features**

* Support using ``DatasetAlias`` and fix orphaning unreferenced dataset
by @tatiana in #1217 #1240

Documentation:
https://astronomer.github.io/astronomer-cosmos/configuration/scheduling.html#data-aware-scheduling

* Add GCP_CLOUD_RUN_JOB execution mode by @ags-de #1153

Learn more about it:
https://astronomer.github.io/astronomer-cosmos/getting_started/gcp-cloud-run-job.html

* Introduction of experimental support to run dbt BQ models using
Airflow deferrable operators by @pankajkoti @pankajastro @tatiana in
#1224 #1230.

This is the first step in the journey of running dbt resources with
native Airflow, and we would appreciate feedback from the community.

For more information, check the documentation:
https://astronomer.github.io/astronomer-cosmos/getting_started/execution-modes.html#airflow-async-experimental

This work has been inspired by the talk "Airflow at Monzo: Evolving our
data platform as the bank scales" by
@jonathanrainer @ed-sparkes given at Airflow Summit 2023:
https://airflowsummit.org/sessions/2023/airflow-at-monzo-evolving-our-data-platform-as-the-bank-scales/.


**Enhancements**

* Create single virtualenv when ``DbtVirtualenvBaseOperator`` has
``virtualenv_dir=None`` and ``is_virtualenv_dir_temporary=True`` by
@kesompochy in #1200
* Consistently handle build and imports in ``cosmos/__init__.py`` by
@tatiana in #1215
* Add enum constants to init for direct import by @fabiomx in #1184

**Bug fixes**

* URL encode dataset names to support multibyte characters by @t0momi219
in #1198
* Fix invalid argument (``full_refresh``) passed to
DbtTestAwsEksOperator (and others) by @johnhoran in #1175
* Fix ``printer_width`` arg type in ``DbtProfileConfigVars`` by
@jessicaschueler in #1191
* Fix task owner fallback by @jmaicher in #1195

**Docs**

* Add scarf to readme and docs for website analytics by @cmarteepants in
#1221
* Add ``virtualenv_dir`` param to ``ExecutionConfig`` docs by
@pankajkoti in #1173
* Give credits to @LennartKloppenburg in CHANGELOG.rst by @tatiana #1174
* Refactor docs for async mode execution by @pankajkoti in #1241

Others

* Remove PR branch added for testing a change in CI in #1224 by
@pankajkoti in #1233
* Fix CI wrt broken coverage upload artifact @pankajkoti in #1210
* Fix CI issues - Upgrade actions/upload-artifact &
actions/download-artifact to v4 and set min version for packaging by
@pankajkoti in #1208
* Resolve CI failures for Apache Airflow 2.7 jobs by @pankajkoti in
#1182
* CI: Update GCP manifest file path based on new secret update by
@pankajkoti in #1237
* Pre-commit hook updates in #1176 #1186, #1186, #1201, #1219, #1231

---------

Co-authored-by: Pankaj Koti <pankajkoti699@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
execution:local Related to Local execution environment lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Test task doesn't accept full_refresh kwarg
3 participants