-
Notifications
You must be signed in to change notification settings - Fork 14.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OpenLineage] fix: Fix parent id macro and remove unused utils #37877
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
LGTM - if this PR extends mine shall I proceed with closing the other one? |
This one is for the provider package, and yours is for |
blacklight
added a commit
to blacklight/OpenLineage
that referenced
this pull request
Mar 5, 2024
- Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com>
mobuchowski
approved these changes
Mar 5, 2024
blacklight
added a commit
to blacklight/OpenLineage
that referenced
this pull request
Apr 4, 2024
- Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com> Signed-off-by: Fabio Manganiello <fabio@manganiello.tech>
blacklight
added a commit
to blacklight/OpenLineage
that referenced
this pull request
Apr 4, 2024
…arent_id`. - Returned format: `<namespace>/<name>/<run_id>`. - `name` should be `<dag_id>.<task_id>`, not a UUID. - `run_id` should be a UUID, not `<run_timestamp>.<try_number>`. - Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Airflow-Reference: apache/airflow#37877 Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com>
blacklight
added a commit
to blacklight/OpenLineage
that referenced
this pull request
Apr 4, 2024
…arent_id`. - Returned format: `<namespace>/<name>/<run_id>`. - `name` should be `<dag_id>.<task_id>`, not a UUID. - `run_id` should be a UUID, not `<run_timestamp>.<try_number>`. - Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Airflow reference: apache/airflow#37877 Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com>
blacklight
added a commit
to blacklight/OpenLineage
that referenced
this pull request
Apr 4, 2024
…arent_id`. - Returned format: `<namespace>/<name>/<run_id>`. - `name` should be `<dag_id>.<task_id>`, not a UUID. - `run_id` should be a UUID, not `<run_timestamp>.<try_number>`. - Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Airflow reference: apache/airflow#37877 Signed-off-by: Fabio Manganiello <fabio@manganiello.tech>
kacpermuda
pushed a commit
to OpenLineage/OpenLineage
that referenced
this pull request
Apr 5, 2024
#2578) * [#2488] Fixed format returned by `airflow.macros.lineage_parent_id`. - Returned format: `<namespace>/<name>/<run_id>`. - `name` should be `<dag_id>.<task_id>`, not a UUID. - `run_id` should be a UUID, not `<run_timestamp>.<try_number>`. - Both `lineage_run_id` and `lineage_parent_id` should expose the same interface - only a `TaskInstance` object is now required as argument. - Import `_DAG_NAMESPACE` instead of inferring it again. Airflow reference: apache/airflow#37877 Signed-off-by: Fabio Manganiello <fabio@manganiello.tech> * Fixed redefinition of `get_unknown_source_attribute_run_facet` introduced upon merge. Signed-off-by: Fabio Manganiello <fabio@manganiello.tech> * Addressed #2578 (comment) Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com> * Fixed failing macro test Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com> --------- Signed-off-by: Fabio Manganiello <fabio@manganiello.tech> Signed-off-by: Fabio Manganiello <fabio.manganiello@booking.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Source of this PR is here, thanks @blacklight.
When using lineage_parent_id, a user should receive all the information needed to create ParentRunFacet. Now we have this as
lineage_parent_id
function:airflow/airflow/providers/openlineage/plugins/macros.py
Lines 60 to 66 in 30f7b2a
So as a job name, we get the UUID instead of
dag_id.task_id
. After the change, we will be able to easily create ParentRunFacet:Also, we are extracting the namespace again in macros module, when we have one already extracted in adapter module, and this one is used in events so we should use the same one here.
Important
I am not sure if removing the
run_id
from the macro is a breaking change. We could probably leave it there, but there is no real use in it i think.^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.