OpenLineage Provider issue: scheduler shuts down after pickle of OpenLineageListener fails #40309

Closed
2 tasks done
merobi-hub opened this issue Jun 18, 2024 · 5 comments · Fixed by #40353 or #40402
Labels: area:core, kind:bug, provider:openlineage, AIP-53

Comments

@merobi-hub
Contributor

merobi-hub commented Jun 18, 2024

Apache Airflow version

2.9.2

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When running Airflow locally with the latest version of the OpenLineage provider (1.8.0), the scheduler shuts down after the attempt to pickle the OpenLineageListener initializer fails. I confirmed that this does not happen with OpenLineage provider versions 1.5.0, 1.6.0 and 1.7.0.

What you think should happen instead?

scheduler  | [2024-06-18T13:57:05.429-0400] {scheduler_job_runner.py:860} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
scheduler  | Traceback (most recent call last):
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/jobs/scheduler_job_runner.py", line 843, in _execute
scheduler  | self._run_scheduler_loop()
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/jobs/scheduler_job_runner.py", line 975, in _run_scheduler_loop
scheduler  | num_queued_tis = self._do_scheduling(session)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/jobs/scheduler_job_runner.py", line 1051, in _do_scheduling
scheduler  | self._start_queued_dagruns(session)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/jobs/scheduler_job_runner.py", line 1391, in _start_queued_dagruns
scheduler  | dag_run.notify_dagrun_state_changed()
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/models/dagrun.py", line 980, in notify_dagrun_state_changed
scheduler  | get_listener_manager().hook.on_dag_run_running(dag_run=self, msg=msg)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/pluggy/_hooks.py", line 513, in __call__
scheduler  | return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/pluggy/_manager.py", line 120, in _hookexec
scheduler  | return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/pluggy/_callers.py", line 139, in _multicall
scheduler  | raise exception.with_traceback(exception.__traceback__)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/pluggy/_callers.py", line 103, in _multicall
scheduler  | res = hook_impl.function(*args)
scheduler  | File "/Users/michael/Library/Python/3.9/lib/python/site-packages/airflow/providers/openlineage/plugins/listener.py", line 323, in on_dag_run_running
scheduler  | self.executor.submit(
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/process.py", line 697, in submit
scheduler  | self._adjust_process_count()
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/concurrent/futures/process.py", line 675, in _adjust_process_count
scheduler  | p.start()
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/process.py", line 121, in start
scheduler  | self._popen = self._Popen(self)
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/context.py", line 284, in _Popen
scheduler  | return Popen(process_obj)
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 32, in __init__
scheduler  | super().__init__(process_obj)
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_fork.py", line 19, in __init__
scheduler  | self._launch(process_obj)
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
scheduler  | reduction.dump(process_obj, fp)
scheduler  | File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
scheduler  | ForkingPickler(file, protocol).dump(obj)
scheduler  | AttributeError: Can't pickle local object 'OpenLineageListener.executor.<locals>.initializer'
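
The last line of the traceback can be reproduced outside Airflow. The sketch below is not the provider's code; `make_initializer`, `module_level_initializer`, and `can_pickle` are hypothetical names used only for illustration. It relies on the fact that `pickle` serializes functions by reference to their qualified name, so a function defined inside another function cannot be pickled. The `popen_spawn_posix` and `reduction.dump` frames above show the worker process object being pickled at start-up, which is where that limitation bites.

```python
import pickle


def make_initializer():
    # A function defined inside another function gets a qualified name
    # containing "<locals>", e.g. "make_initializer.<locals>.initializer".
    def initializer():
        pass

    return initializer


def module_level_initializer():
    # Defined at module top level, so it can be looked up by name on import.
    pass


def can_pickle(obj):
    """Return (True, "") if obj pickles, else (False, error message)."""
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError) as exc:
        # The exception type for local objects differs between Python versions.
        return False, str(exc)
    return True, ""


if __name__ == "__main__":
    print(can_pickle(make_initializer()))
    print(can_pickle(module_level_initializer))
```

The exact error text depends on the Python version, so the sketch reports it rather than matching on it.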

How to reproduce

Run airflow standalone using the latest versions of Airflow and the OpenLineage provider.

Operating System

MacOS Sonoma 14.5

Versions of Apache Airflow Providers

apache-airflow-providers-celery==3.7.2
apache-airflow-providers-common-io==1.3.2
apache-airflow-providers-common-sql==1.14.0
apache-airflow-providers-fab==1.1.1
apache-airflow-providers-ftp==3.9.1
apache-airflow-providers-google==10.19.0
apache-airflow-providers-http==4.11.1
apache-airflow-providers-imap==3.6.1
apache-airflow-providers-openlineage==1.8.0
apache-airflow-providers-smtp==1.7.1
apache-airflow-providers-sqlite==3.8.1

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

merobi-hub added the area:core, kind:bug, and needs-triage labels on Jun 18, 2024
merobi-hub changed the title from "[Providers][OpenLineage] Scheduler shuts down after pickle of OpenLineageListener fails" to "OpenLineage Provider issue: scheduler shuts down after pickle of OpenLineageListener fails" on Jun 18, 2024
@JDarDagran
Contributor

In general, airflow standalone doesn't really work with OpenLineage, even with earlier versions (at least that's what I recall).
More importantly, this is not reproducible in any environment other than airflow standalone: I checked with Breeze and a set of executors, Google Composer, and Astro Cloud.

@tatiana
Contributor

tatiana commented Jun 18, 2024

@JDarDagran I understand there may be limitations with running OL in Airflow standalone, but we should do better than shutting down the scheduler (if not in 2.9, in future versions). Would it be a possibility to capture these errors and raise warnings?
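
One way to read this suggestion, as a minimal sketch rather than the provider's implementation: wrap the submission in a guard that logs a warning and carries on. `submit_or_warn` is a hypothetical helper, and catching a broad `Exception` is an assumption about what a listener should be allowed to swallow.

```python
import logging

log = logging.getLogger(__name__)


def submit_or_warn(executor, fn, *args, **kwargs):
    """Submit fn to the executor; on failure, log a warning and return None.

    Hypothetical helper: the caller keeps running instead of seeing the
    exception raised by executor.submit().
    """
    try:
        return executor.submit(fn, *args, **kwargs)
    except Exception:
        # exc_info=True keeps the original traceback in the log output.
        log.warning("Could not submit %r to the executor; skipping.", fn, exc_info=True)
        return None
```

With a guard like this, a failed submission would cost the lineage event but would not propagate into the scheduler loop. Whether the executor is still usable after such a failure is a separate question that this sketch does not address.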

@vatsrahul1001
Collaborator

Totally agree with @tatiana. Maybe we should raise a warning instead of letting the scheduler crash.

vatsrahul1001 removed the needs-triage label on Jun 19, 2024
@potiuk
Member

potiuk commented Jun 19, 2024

Agree. "Can't pickle" error in standalone does not say much :)

@kacpermuda
Contributor

Instead of a warning, maybe we can avoid the pickling error completely. For now the initializer is just a single call to another function, so we can pass that function directly and avoid using a local function (#40353). Does that make sense?
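
The change described here can be sketched in isolation. This illustrates the pattern only and is not the code from #40353; `configure_worker`, `initializer_before`, `initializer_after`, and `build_executor` are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor


def configure_worker():
    """Hypothetical per-process setup that the initializer should run."""


def initializer_before():
    # Before: a local wrapper whose only job is to call another function.
    # Its qualified name contains "<locals>", so pickle cannot reference it.
    def initializer():
        configure_worker()

    return initializer


def initializer_after():
    # After: hand over the module-level function itself, no wrapper needed.
    return configure_worker


def build_executor():
    # ProcessPoolExecutor takes the callable through its `initializer` argument.
    return ProcessPoolExecutor(max_workers=1, initializer=initializer_after())
```

Because the wrapper adds nothing beyond the call it forwards, dropping it keeps the behavior the same while giving the process pool a callable it can serialize by name.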
