Remove logical_date from DAG Run APIs and Functions, transition to run_id as sole identifier for Airflow 3.0 #42404

Merged
merged 7 commits into from
Nov 20, 2024

Conversation

sunank200
Collaborator

@sunank200 sunank200 commented Sep 23, 2024

After renaming execution_date to logical_date in #43902, this PR removes the logical_date arguments from functions and APIs that are used to retrieve DAG runs, aligning with the broader changes introduced in Airflow 2.2 and preparing for Airflow 3.0. The functions now use run_id as the sole identifier for DAG runs, simplifying the process and eliminating deprecated behaviour.

Motivation:

In Airflow, execution_date has historically been used to distinguish different DAG run instances. However, the introduction of run_id and the DAG run concept in Airflow 2.2 shifts away from using execution_date as an identifier. Continuing to rely on execution_date introduces limitations, such as the inability to handle multiple DAG runs at the same logical time, especially in cases like TriggerDagRunOperator when dynamic runs are generated.

This PR eliminates these limitations by removing execution_date and logical_date in favor of run_id.

Key Changes:

  1. API and Function Changes:

    • The logical_date arguments have been removed from all public APIs and Python functions related to DAG run lookups.
    • run_id is now the exclusive identifier for DAG runs in these contexts.
  2. Database Migration:

    • The unique constraint on execution_date in the database has been dropped as part of #41818, since run_id now ensures the uniqueness of DAG runs.
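
The effect of these two changes can be sketched with a small, self-contained example. The schema below is a deliberately simplified stand-in, not Airflow's actual dag_run table or migration, and the helper names (make_db, add_run, get_run, runs_at) are hypothetical. It assumes uniqueness is enforced on (dag_id, run_id) only, so several runs may share one logical date while each run is still looked up by its run_id.

```python
import sqlite3
from datetime import datetime, timezone


def make_db() -> sqlite3.Connection:
    """Create an in-memory table that loosely mimics a DAG run table (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE dag_run (
            id INTEGER PRIMARY KEY,
            dag_id TEXT NOT NULL,
            run_id TEXT NOT NULL,
            logical_date TEXT NOT NULL,
            -- Assumption: uniqueness is on (dag_id, run_id); logical_date is not unique.
            UNIQUE (dag_id, run_id)
        )
        """
    )
    return conn


def add_run(conn: sqlite3.Connection, dag_id: str, run_id: str, logical_date: datetime) -> None:
    """Insert a run; raises sqlite3.IntegrityError if (dag_id, run_id) already exists."""
    conn.execute(
        "INSERT INTO dag_run (dag_id, run_id, logical_date) VALUES (?, ?, ?)",
        (dag_id, run_id, logical_date.isoformat()),
    )


def get_run(conn: sqlite3.Connection, dag_id: str, run_id: str):
    """Look up exactly one run by its identifier, run_id."""
    return conn.execute(
        "SELECT dag_id, run_id, logical_date FROM dag_run WHERE dag_id = ? AND run_id = ?",
        (dag_id, run_id),
    ).fetchone()


def runs_at(conn: sqlite3.Connection, dag_id: str, logical_date: datetime) -> list[str]:
    """Use logical_date only as a filter; it may match zero, one, or many runs."""
    rows = conn.execute(
        "SELECT run_id FROM dag_run WHERE dag_id = ? AND logical_date = ? ORDER BY run_id",
        (dag_id, logical_date.isoformat()),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = make_db()
    when = datetime(2024, 11, 20, tzinfo=timezone.utc)
    # Two runs at the same logical date are accepted because run_id differs.
    add_run(conn, "example_dag", "manual__a", when)
    add_run(conn, "example_dag", "manual__b", when)
    print(runs_at(conn, "example_dag", when))
    print(get_run(conn, "example_dag", "manual__b"))
```

In this sketch, inserting a second row with the same dag_id and run_id fails with sqlite3.IntegrityError, whereas a repeated logical date does not.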

Rationale:

Removing execution_date is necessary to enable more flexible DAG run management. For example, dynamic runs created by TriggerDagRunOperator can now be correctly identified and managed without awkward workarounds as discussed in this doc. This change makes subsequent DAG run lookups easier and more robust, while also simplifying the database schema by removing the unique constraint on execution_date.

How execution_date and logical_date Work

  1. Logical date is equivalent to execution date: The two are just different names for the same value.
  2. Timetable controls logical date: The logical date can be set to any value, not necessarily tied to the data interval's start or end.
  3. Schedules dictate behavior: For value-based schedules (like cron), the logical date is set by the timetable class used.
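
As a rough illustration of point 2, the sketch below shows two ways a timetable-like policy could derive the logical date from the same data interval. The names DataInterval, daily_interval, logical_date_at_interval_start, and logical_date_at_interval_end are hypothetical and do not correspond to Airflow's Timetable API; the only point is that the logical date is a value chosen by the policy, not a fixed property of the interval.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataInterval:
    """Hypothetical stand-in for the time range a run covers."""
    start: datetime
    end: datetime


def daily_interval(moment: datetime) -> DataInterval:
    """Return the calendar-day interval containing `moment`."""
    start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    return DataInterval(start=start, end=start + timedelta(days=1))


def logical_date_at_interval_start(interval: DataInterval) -> datetime:
    """One possible policy: the logical date is the start of the interval."""
    return interval.start


def logical_date_at_interval_end(interval: DataInterval) -> datetime:
    """Another possible policy: the logical date is the end of the interval."""
    return interval.end


if __name__ == "__main__":
    interval = daily_interval(datetime(2024, 11, 20, 15, 30, tzinfo=timezone.utc))
    # Same interval, different logical dates depending on the chosen policy.
    print(logical_date_at_interval_start(interval).isoformat())
    print(logical_date_at_interval_end(interval).isoformat())
```

Because two policies can yield different logical dates for one interval, and different runs can yield the same one, the logical date is poorly suited as an identifier, which is consistent with the move to run_id.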

Additionally, users will still be able to view the value formerly called execution_date, now renamed logical_date, in the web UI, where it is paired with run_id to make DAG runs easier to distinguish.

Testing

  • Updated unit tests to reflect the changes.

closes: #42339, #42340 and #42338



@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:CLI area:db-migrations PRs with DB migration area:providers area:Scheduler including HA (high availability) scheduler area:Triggerer area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues provider:cncf-kubernetes Kubernetes provider related issues provider:databricks labels Sep 23, 2024
@sunank200 sunank200 force-pushed the rename-execution-date branch 15 times, most recently from 3e729bc to efa47f6 Compare September 30, 2024 13:30
@sunank200 sunank200 force-pushed the rename-execution-date branch from 9212ef8 to 24a1673 Compare November 19, 2024 19:44
@sunank200 sunank200 changed the title Remove execution_date and logical_date from DAG Run APIs and Functions, transition to run_id as sole identifier for Airflow 3.0 Remove logical_date from DAG Run APIs and Functions, transition to run_id as sole identifier for Airflow 3.0 Nov 19, 2024
@sunank200 sunank200 force-pushed the rename-execution-date branch 5 times, most recently from 14dac57 to 9f3adfc Compare November 20, 2024 11:46
sunank200 and others added 6 commits November 20, 2024 17:31
Remove execution_date and logical_date from arguments where function/API is used to look up a DAG run

- Resolve compatibility issues and refactor `execution_date` to `logical_date`
- Resolve compatibility tests
- Correct import paths after rebase
- Address static checks
- Refactor `execution_date` to `logical_date`
- Add missing DAG files for tests
- Enhance GCP ML and CloudBuild tests using helpers for compatibility
- Fix mypy errors
- Miscellaneous fixes and removals of `execution_date`
- Drop execution_date unique constraint on DagRun
- The column has also been renamed to logical_date, although the Python
model is not changed. This allows us to not need to fix all the Python
code at once (we'll do that later), but still do the two changes in one
migration instead of two.
- Use test helpers in GCP MLEngine tests for compat
- Using Airflow internals directly presents a problem when dealing with
compatibility in tests (since the same tests must run against Airflow 2
and 3). The helpers already handle this well, so we should use them.
- Use test helpers in GCP CloudBuild tests for compat
- Using Airflow internals directly presents a problem when dealing with
compatibility in tests (since the same tests must run against Airflow 2
and 3). The helpers already handle this well, so we should use them.
- Mark db tests
- Some compat code to make DAG.clear() still work
- Remove unneeded test cases for compat code
- Use test helpers in GCS-BQ tests for compat
- Using Airflow internals directly presents a problem when dealing with
compatibility in tests (since the same tests must run against Airflow 2
and 3). The helpers already handle this well, so we should use them.
Fix tests and schema
@sunank200 sunank200 force-pushed the rename-execution-date branch from 9f3adfc to 76b21d2 Compare November 20, 2024 11:46
Member

@uranusjr uranusjr left a comment

Assuming green

@uranusjr uranusjr merged commit aa7a3b2 into apache:main Nov 20, 2024
62 checks passed
@uranusjr uranusjr deleted the rename-execution-date branch November 20, 2024 14:01
kandharvishnu pushed a commit to kandharvishnu/airflow that referenced this pull request Nov 20, 2024
@rawwar
Collaborator

rawwar commented Nov 20, 2024

@sunank200, will there be another PR to remove logical_date from FastAPI endpoints as well?

@sunank200
Collaborator Author

> @sunank200, will there be another PR to remove logical_date from FastAPI endpoints as well?

@rawwar I think logical_date would be there to filter as part of FastAPI endpoints. But we solely identify DagRun by run_id.

@uranusjr
Member

We should remove it if the frontend can work without it.

@sunank200 sunank200 added AIP-83 Remove Execution Date Unique Constraint from DAG Run airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes labels Nov 25, 2024
LefterisXefteris pushed a commit to LefterisXefteris/airflow that referenced this pull request Jan 5, 2025
got686-yandex pushed a commit to got686-yandex/airflow that referenced this pull request Jan 30, 2025
Labels
AIP-83 Remove Execution Date Unique Constraint from DAG Run airflow3.0:breaking Candidates for Airflow 3.0 that contain breaking changes area:API Airflow's REST/HTTP API area:CLI area:db-migrations PRs with DB migration area:providers area:Scheduler including HA (high availability) scheduler area:Triggerer area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues legacy api Whether legacy API changes should be allowed in PR legacy ui Whether legacy UI change should be allowed in PR provider:cncf-kubernetes Kubernetes provider related issues provider:databricks
Development

Successfully merging this pull request may close these issues.

Remove execution_date from API, CLI, web UI return values
5 participants