Fix Dataplex Data Quality partial update #44262

Merged
merged 7 commits into from
Nov 22, 2024

Conversation

amirmor1
Contributor

Fix a failure when attempting a partial update of a Dataplex Data Quality task: the operator always raised AirflowException because mandatory fields were missing.



amirmor1 and others added 5 commits November 14, 2024 16:18
When we update a Dataplex data quality task using the DataplexCreateOrUpdateDataQualityScanOperator, the operator first tries to create the task and attempts an update only if creation fails with an AlreadyExists exception. If you provide only partial parameters for the update (rather than replacing the entire data scan properties), creation fails with AirflowException `Error creating Data Quality scan` because mandatory parameters are missing from the DataScan, so the task is never updated.

I've added a check: if update_mask is not None, the operator tries the update first; otherwise it tries to create the task.
I also moved the update section into a private function so it can be reused by this check and by the later path that performs a full update of the task.
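
The control flow described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `FakeHook`, `create_or_update`, `_update`, and the local `AlreadyExists` class are hypothetical stand-ins (the real operator uses the Dataplex hook and the Google API exception), and the in-memory merge is an assumption about how a field mask behaves.

```python
class AlreadyExists(Exception):
    """Hypothetical stand-in for the AlreadyExists error raised by the API."""


class FakeHook:
    """Hypothetical in-memory hook; NOT the real Dataplex hook."""

    def __init__(self, existing=None):
        self.scans = dict(existing or {})
        self.calls = []  # records which API calls were attempted

    def create_data_scan(self, scan_id, body):
        self.calls.append("create")
        if scan_id in self.scans:
            raise AlreadyExists(scan_id)
        self.scans[scan_id] = dict(body)

    def update_data_scan(self, scan_id, body, update_mask=None):
        self.calls.append("update")
        # Assumption: with a mask, only the listed fields are overwritten;
        # without one, every field present in the body is overwritten.
        fields = update_mask if update_mask is not None else list(body)
        for field in fields:
            self.scans[scan_id][field] = body[field]


def _update(hook, scan_id, body, update_mask):
    """Shared update helper, reused by both branches below."""
    hook.update_data_scan(scan_id, body, update_mask=update_mask)


def create_or_update(hook, scan_id, body, update_mask=None):
    if update_mask is not None:
        # Partial update requested: skip creation, which would fail
        # because the body lacks the mandatory fields.
        _update(hook, scan_id, body, update_mask)
        return
    try:
        hook.create_data_scan(scan_id, body)
    except AlreadyExists:
        # Full update of an existing scan.
        _update(hook, scan_id, body, None)


hook = FakeHook({"dq": {"description": "old", "labels": "a"}})
create_or_update(hook, "dq", {"description": "new"}, update_mask=["description"])
print(hook.calls, hook.scans["dq"])
```

With a mask set, only the update call is attempted and fields outside the mask are left untouched; without a mask, the create-then-fall-back-to-update behaviour is unchanged.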
boring-cyborg bot added the area:providers and provider:google labels Nov 21, 2024
Member

@potiuk potiuk left a comment

Nice catch!

@potiuk potiuk merged commit b22e3c1 into apache:main Nov 22, 2024
62 checks passed
LefterisXefteris pushed a commit to LefterisXefteris/airflow that referenced this pull request Jan 5, 2025
* 44012 - Update index.rst

* Fix Dataplex Data Quality Task partial update

* add empty line for lint

* add test to verify update when update_mask is not none

---------

Co-authored-by: Amir Mor <amir.mor26@gmail.com>
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025