ExternalTaskSensor doesn't timeout if external DAG doesn't exist - dag.test() #34497
Comments
Some further testing shows this doesn't seem to affect the
@cbuffett Can you please share your testing script? Is it something like the script below?
@utkarsharma2 Yes, that script should easily reproduce the infinite rescheduling.
@utkarsharma2 Did you reproduce the issue? If so, could you remove the
@cbuffett I'm not able to reproduce this locally with the latest Airflow code. I ran the script below:
Can you please specify which Airflow version you are on?
I'm on Airflow 2.6.1. I ran your test script, but it had some errors due to value being passed to
@cbuffett Thanks for the response. I'm now able to reproduce this on the latest Airflow main branch and on 2.6.1 as well. I'll look into it and get back to you.
@cbuffett The issue is fixed in Airflow 2.7.1 by PR #33401. There was another issue in my script which led me to think the issue still persisted. The script below should work without halting the debugging flow on Airflow 2.7.1.
Unfortunately I'm still seeing the infinite rescheduling issue after upgrading to Airflow 2.7.2. I believe the issue still revolves around the following code:

```python
task_reschedules = TaskReschedule.find_for_task_instance(
    context["ti"], try_number=first_try_number
)  # <-- This call is returning an empty list
if not task_reschedules:
    start_date = timezone.utcnow()
else:
    start_date = task_reschedules[0].start_date
```

Because this list is empty, `start_date` is reset to the current time on every reschedule, so the following timeout check never fires:

```python
if run_duration() > self.timeout:
    # If sensor is in soft fail mode but times out raise AirflowSkipException.
    message = (
        f"Sensor has timed out; run duration of {run_duration()} seconds exceeds "
        f"the specified timeout of {self.timeout}."
    )
    if self.soft_fail:
        raise AirflowSkipException(message)
    else:
        raise AirflowSensorTimeout(message)
```
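To make the failure mode concrete, here is a minimal standard-library sketch of the behaviour described above. It is a simplified model, not Airflow's actual code: `SensorTimeout`, `poke_once`, and `simulate` are hypothetical names, and the reschedule history is modelled as a plain list of start dates.

```python
from datetime import datetime, timedelta


class SensorTimeout(Exception):
    """Stand-in for AirflowSensorTimeout in this simplified model."""


def poke_once(reschedule_starts, now, timeout_s):
    """Mimic one rescheduled execution of the timeout check.

    reschedule_starts: start dates of earlier reschedules, oldest first.
    """
    # Same branch as the quoted snippet: fall back to "now" when no history exists.
    start_date = reschedule_starts[0] if reschedule_starts else now
    run_duration = (now - start_date).total_seconds()
    if run_duration > timeout_s:
        raise SensorTimeout(
            f"run duration of {run_duration} seconds exceeds timeout of {timeout_s}"
        )
    return run_duration


def simulate(record_reschedules, pokes=5, interval_s=60, timeout_s=120):
    """Run several pokes; return the index of the poke that timed out, or None."""
    t0 = datetime(2023, 9, 20, 12, 0, 0)
    history = []
    for i in range(pokes):
        now = t0 + timedelta(seconds=i * interval_s)
        try:
            poke_once(history, now, timeout_s)
        except SensorTimeout:
            return i
        if record_reschedules:
            # Assumption: a normal scheduler run persists each reschedule.
            history.append(now)
    return None


print(simulate(record_reschedules=True))   # history is kept, so the timeout is reached
print(simulate(record_reschedules=False))  # history stays empty, so it never times out
```

With an empty history the measured run duration is always zero, which matches the observation that the sensor keeps rescheduling itself regardless of `timeout`.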
Did some more testing and the infinite rescheduling doesn't happen if I set
This issue has been automatically marked as stale because it has been open for 365 days without any activity. There have been several Airflow releases since the last activity on this issue. Kindly recheck the report against the latest Airflow version and let us know if the issue is reproducible. The issue will be closed in the next 30 days if no further activity occurs from the issue author.
This issue has been closed because it has not received a response from the issue author.
Apache Airflow version
Other Airflow 2 version (please specify below)
What happened
My DAG has a number of tasks, the first of which is an ExternalTaskSensor. This sensor functions correctly when the external DAG exists (normal operation/deployment). However, when using `dag.test()` to debug the DAG, the ExternalTaskSensor never terminates, rescheduling itself indefinitely. I believe this happens because in this situation, the external DAG doesn't exist.

Using `check_existence` isn't an option as this immediately throws an exception and terminates the debugger. Using `soft_fail` and/or `silent_fail` results in the exception being logged instead of thrown, but the ExternalTaskSensor continues to reschedule itself.

After some debugging, what I noticed is that the `start_date` keeps being reset to the current time, because `task_reschedules` is always empty.

What you think should happen instead
A way to ignore/skip ExternalTaskSensors when using dag.test(). At the very least, the ExternalTaskSensor should respect the timeout value provided.
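One way to picture "respecting the timeout" is to remember the first start time independently of the reschedule history. The sketch below is only an illustration of that idea under stated assumptions: `FirstStartRegistry` and `has_timed_out` are hypothetical helpers, not part of Airflow, and the task key is an arbitrary string.

```python
from datetime import datetime, timedelta


class FirstStartRegistry:
    """Hypothetical helper: remembers when each sensor first started poking."""

    def __init__(self):
        self._first_start = {}

    def start_date_for(self, task_key, now):
        # setdefault stores `now` only on the first call for this key,
        # so later pokes keep measuring from the original start.
        return self._first_start.setdefault(task_key, now)


def has_timed_out(registry, task_key, now, timeout_s):
    """Return True once more than timeout_s seconds have passed since the first poke."""
    start = registry.start_date_for(task_key, now)
    return (now - start).total_seconds() > timeout_s


registry = FirstStartRegistry()
t0 = datetime(2023, 9, 20, 12, 0, 0)
for offset in (0, 60, 120, 180):
    now = t0 + timedelta(seconds=offset)
    print(offset, has_timed_out(registry, "wait_for_external", now, timeout_s=120))
```

Because the first start time survives across pokes, the elapsed time keeps growing and the check eventually trips, even when no reschedule records are available.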
How to reproduce
Running a DAG with the following ExternalTaskSensor using `dag.test()`
Operating System
Ubuntu 22.04
Versions of Apache Airflow Providers
Deployment
Other
Deployment details
No response
Anything else
Log entry showing the DAG continuing to reschedule itself well past the timeout period
Are you willing to submit PR?
Code of Conduct