
Celery Executor is not working with redis-py 5.0.0 #33744

Closed
alexbegg opened this issue Aug 25, 2023 · 11 comments · Fixed by #33773
Labels
area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet

Comments

@alexbegg
Contributor

alexbegg commented Aug 25, 2023

Apache Airflow version

2.7.0

What happened

After upgrading to Airflow 2.7.0 in my local environment, my Airflow DAGs won't run with Celery Executor using Redis, even after changing the celery_app_name configuration in the celery section from airflow.executors.celery_executor to airflow.providers.celery.executors.celery_executor.

I see the error is actually unrelated to the recent Airflow Celery provider changes and is instead related to Celery's Redis support. What is happening is that Airflow fails to send jobs to the worker because the kombu module is not compatible with redis-py 5.0.0 (released last week). It gives this error (I will update this with the full traceback once I can reproduce the error one more time):

AttributeError: module 'redis' has no attribute 'client'
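As a speculative illustration of this failure mode (not a diagnosis of this specific environment): importing a package does not bind its submodules as attributes unless the package's `__init__.py` (or some earlier import) loads them, so an expression like `redis.client.Pipeline` can raise `AttributeError` even when `client.py` exists on disk. The package name `demo_pkg` below is made up for illustration:

```python
# Sketch: `import pkg` does not automatically make `pkg.submodule` available.
import importlib
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "demo_pkg")
os.makedirs(pkg_dir)
# An empty __init__.py: the package imports none of its own submodules.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "client.py"), "w") as f:
    f.write("class Pipeline:\n    pass\n")

sys.path.insert(0, tmp)
demo_pkg = importlib.import_module("demo_pkg")

# The submodule exists on disk but is not yet an attribute of the package,
# so demo_pkg.client.Pipeline would raise AttributeError at this point.
print(hasattr(demo_pkg, "client"))  # False

# Importing the submodule binds it onto the parent package.
importlib.import_module("demo_pkg.client")
print(hasattr(demo_pkg, "client"))  # True
```

In the real traceback, kombu accesses `redis.client.Pipeline` at module import time, which is why the error surfaces as soon as the transport is resolved.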

Celery is limiting redis-py to 4.x in an upcoming 5.3.x release (the change was merged to main on August 17, 2023 but is not yet released: celery/celery#8442; the latest release is v5.3.1 from June 18, 2023).

Kombu is also going to match Celery and limit redis-py to 4.x in an upcoming release (the PR is still a draft; I assume they are waiting for the Celery change to be released: celery/kombu#1776).

For now there is not really a way to fix this unless we can add a redis constraint to avoid 5.x. Or, once the next Celery 5.3.x release limits redis-py to 4.x, we could possibly limit the Celery provider to that version of Celery.

What you think should happen instead

Airflow should be able to send jobs to workers when using Celery Executor with Redis

How to reproduce

  1. Start Airflow 2.7.0 with Celery Executor; redis-py 5.0.0 is installed by default (at the time of this writing)
  2. Run a DAG task
  3. The scheduler fails to send the job to the worker

Workaround:

  1. Limit redis-py to 4.x the same way the upcoming Celery 5.3.x release does, by adding this to requirements.txt: redis>=4.5.2,<5.0.0,!=4.5.5
  2. Start Airflow 2.7.0 with Celery Executor
  3. Run a DAG task
  4. The task runs successfully
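The pin in step 1 can also be expressed as a runtime check, so a deployment fails fast if an incompatible redis-py slips in. The helper below is hypothetical (not part of Airflow); it implements the same rule as the requirements line redis>=4.5.2,<5.0.0,!=4.5.5:

```python
# Hypothetical sketch: mirror the requirements.txt pin as a version check.
def redis_version_ok(version: str) -> bool:
    # Compare only the numeric major.minor.patch components.
    parts = tuple(int(p) for p in version.split(".")[:3])
    return (4, 5, 2) <= parts < (5, 0, 0) and parts != (4, 5, 5)

print(redis_version_ok("4.6.0"))   # True  - allowed by the pin
print(redis_version_ok("5.0.0"))   # False - the version that broke kombu
print(redis_version_ok("4.5.5"))   # False - explicitly excluded
```

In a real deployment one could feed it `importlib.metadata.version("redis")`; for production use, the `packaging` library's specifier parsing is the more robust choice.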

Operating System

Debian GNU/Linux 11 (bullseye)

Versions of Apache Airflow Providers

apache-airflow-providers-celery==3.3.2

Deployment

Docker-Compose

Deployment details

I am using bitnami/airflow:2.7.0 image in Docker Compose when I first encountered this issue, but I will test with Breeze as well shortly and then update this issue.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct

@alexbegg alexbegg added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Aug 25, 2023
@alexbegg
Contributor Author

I am not able to reproduce it a second time. After I applied the workaround I mentioned and downgraded the Python redis package to 4.x (redis>=4.5.2,<5.0.0,!=4.5.5) in my environment, the tasks started running again; and when I removed that pip requirement and redis-py went back to 5.0.0, the tasks still continued to run. I am going to investigate further using Breeze to see what is happening.

@alexbegg
Contributor Author

I tried with Breeze with

breeze start-airflow --use-airflow-version 2.7.0 --executor CeleryExecutor --airflow-extras celery,postgres --backend postgres

and I also can't reproduce the issue I was having.

It was a recurring issue after upgrading to 2.7.0, even after multiple restarts of my Docker Compose environment, and it was only resolved after downgrading redis-py to 4.x.

The issue was strange because the error says module 'redis' has no attribute 'client', but I confirmed that the installed redis Python package at version 5.0.0 still had a client module inside it...

If nobody else can figure out what might have happened, maybe close this issue (since the Celery package will soon restrict redis-py to <5.0.0 anyway).

@potiuk
Member

potiuk commented Aug 26, 2023

Thanks @alexbegg -> yes, we should limit the redis dependency in this case. PR #33773 created. We will also cherry-pick it to 2.7.1.

@potiuk potiuk assigned alexbegg and unassigned alexbegg Aug 26, 2023
potiuk added a commit to potiuk/airflow that referenced this issue Aug 26, 2023

Redis 5 released last week breaks Celery. Celery is limiting it for
now and will resolve it later; we should similarly limit redis on
our side for users who will not upgrade to the Celery release
coming shortly.

Fixes: apache#33744

potiuk added a commit that referenced this issue Aug 26, 2023

Fixes: #33744

potiuk added a commit that referenced this issue Aug 26, 2023

Fixes: #33744
(cherry picked from commit 3ba994d)
@alexbegg
Contributor Author

@potiuk just an FYI that Celery has now released v5.3.3, which includes the change to limit the redis client to 4.x, and Kombu's PR is now open and approved, so it will likely be merged soon.

@rdjouder

Hello,
I think this issue has been closed too early; redis-py 4.6.0 has this problem too.

Python 3.8.6

Name: kombu
Version: 5.3.2

Name: apache-airflow
Version: 2.7.2

Name: redis
Version: 4.6.0

Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/celery/executors/celery_executor_utils.py", line 199, in send_task_to_executor
result = task_to_run.apply_async(args=[command], queue=queue)
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/task.py", line 594, in apply_async
return app.send_task(
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/base.py", line 794, in send_task
with self.producer_or_acquire(producer) as P:
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/base.py", line 929, in producer_or_acquire
producer, self.producer_pool.acquire, block=True,
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/base.py", line 1344, in producer_pool
return self.amqp.producer_pool
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/amqp.py", line 590, in producer_pool
self.app.connection_for_write()]
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/base.py", line 826, in connection_for_write
return self._connection(url or self.conf.broker_write_url, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/base.py", line 877, in _connection
return self.amqp.Connection(
File "/home/airflow/.local/lib/python3.8/site-packages/kombu/connection.py", line 201, in __init__
if not get_transport_cls(transport).can_parse_url:
File "/home/airflow/.local/lib/python3.8/site-packages/kombu/transport/__init__.py", line 90, in get_transport_cls
_transport_cache[transport] = resolve_transport(transport)
File "/home/airflow/.local/lib/python3.8/site-packages/kombu/transport/__init__.py", line 75, in resolve_transport
return symbol_by_name(transport)
File "/home/airflow/.local/lib/python3.8/site-packages/kombu/utils/imports.py", line 59, in symbol_by_name
module = imp(module_name, package=package, **kwargs)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 783, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/airflow/.local/lib/python3.8/site-packages/kombu/transport/redis.py", line 281, in <module>
class PrefixedRedisPipeline(GlobalKeyPrefixMixin, redis.client.Pipeline):
AttributeError: module 'redis' has no attribute 'client'

[airflow@xxxxx-scheduler-1 airflow]$ ls /home/airflow/.local/lib/python3.8/site-packages/redis
__init__.py __pycache__ asyncio backoff.py client.py cluster.py commands compat.py connection.py crc.py credentials.py exceptions.py lock.py ocsp.py retry.py sentinel.py typing.py utils.py
[airflow@xxxx-scheduler-1 airflow]$ vi /home/airflow/.local/lib/python3.8/site-packages/kombu/transport/redis.py

@potiuk
Member

potiuk commented Oct 30, 2023

I think you have something messed up in your environment. I recommend looking more closely at it, and if you need help, open a new discussion or a Slack troubleshooting query with more details of your environment and your findings.

There is no problem with importing redis.client in 4.6.0:

root@ad23900f4a58:/opt/airflow# pip freeze | grep redis
google-cloud-redis==2.13.2
redis==4.6.0
types-redis==4.6.0.8
root@ad23900f4a58:/opt/airflow# python
Python 3.8.18 (default, Oct 11 2023, 23:57:43) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import redis
>>> from redis import client
>>> 

Also, maybe somewhere on your PYTHONPATH you have another redis package that is discovered first by Python.
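One quick way to test the PYTHONPATH-shadowing hypothesis is to ask Python where it would resolve a module from, without importing it. This diagnostic sketch is not from the thread; the helper name is made up, and `json` is used in the demo call only so the snippet runs anywhere (in an affected environment one would check `module_origin("redis")`):

```python
# Sketch: report which file a module name actually resolves to on sys.path.
import importlib.util

def module_origin(name: str) -> str:
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else "<not found>"

# If this prints a path outside the expected site-packages directory
# (e.g. a stray redis.py earlier on sys.path), that copy shadows the real one.
print(module_origin("json"))
```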

Anyhow - this is not the same issue. Please start a new discussion about it.

@rdjouder

rdjouder commented Nov 2, 2023

Thanks a lot for your answer.
I checked with a small Python script calling symbol_by_name from kombu with the same argument, and it worked.
But the whole stack (Airflow 2.7.2) only worked once I downgraded to redis-py 4.5.2.
Import behavior also seems to depend on the Python version.

@potiuk
Member

potiuk commented Nov 2, 2023

Thanks a lot for your answer. I checked with a small Python script calling symbol_by_name from kombu with the same argument, and it worked. But the whole stack (Airflow 2.7.2) only worked once I downgraded to redis-py 4.5.2. Import behavior also seems to depend on the Python version.

That's what I mean - something is messed up in your env (but I have no idea why, and only you can investigate it). Our CI and development environment already does the "whole stack". When you install Airflow in a given version, you are supposed to use constraints - see the docs: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html - to achieve a reproducible installation with a "known good" set of dependencies. All of those are tested, verified, and known to work, and only that set is "guaranteed" to work. You can downgrade or upgrade dependencies within the requirement limits that Airflow has (and in most cases you will be fine) - but then some dependencies might cause problems (it's impossible to test the matrix of combinations of 670 dependencies).

In the Airflow 2.7.2 constraints, for example for Python 3.8, we know that Airflow works with the dependencies described here: https://github.com/apache/airflow/blob/constraints-2.7.2/constraints-3.8.txt

For example there:

  • celery==5.3.4
  • redis==4.6.0
  • kombu==5.3.2

I just ran Airflow 2.7.2 with that set of constraints and there is no problem - everything imports fine in Celery, everything works. You can check it yourself (and our CI does it automatically) by running the development tool called breeze on the main version of the Airflow source code:

breeze start-airflow --use-airflow-version 2.7.2 --executor CeleryExecutor

There you have running Airflow with Celery and Redis, and everything works fine - no import problem.

That's why I think something is broken in your environment. What it is, hard to say, but I recommend starting with the constraint list, comparing your installed dependency versions against it, and working it out from there. Maybe some other dependency you have installed is breaking things. What we can do is provide you with "known good" constraints; if you diverge from them, you have all the means to investigate what's causing a problem like this (or you can stick to our constraints, which are tested, verified, and provide a reproducible installation). Unless you have a reason to diverge from them, I recommend following the installation process we have, as it is the only one that provides a reproducible, working installation that we had a chance to test.

@ybryan

ybryan commented Dec 6, 2023

This issue is also showing up in 2.7.3, but I'm using the Docker image apache/airflow:2.7.3-python3.8. Interestingly enough, we have multiple Airflow envs but the problem only manifests in one of them.

@potiuk
Member

potiuk commented Dec 6, 2023

@ybryan. I suggest you open a separate discussion describing your problem with all the details: what you do, and how you modify the image or what you do there. If you think you do not modify the image, double-check it.

The apache/airflow:2.7.3-python3.8 image has both celery and redis in good versions. I am not sure what "this issue" is for you and how you trigger it, but the original issue has this:

[jarek:~/code/airflow] [airflow-3.11] do-not-mount-sources-for-compatibility-check+ 7s 2 ± docker run -it apache/airflow:2.7.3-python3.8 bash

airflow@16ad5d130a4f:/opt/airflow$ pip freeze | grep redis
apache-airflow-providers-redis==3.4.0
google-cloud-redis==2.13.2
redis==4.6.0
airflow@16ad5d130a4f:/opt/airflow$ pip freeze | grep celery
apache-airflow-providers-celery==3.4.1
celery==5.3.4
airflow@16ad5d130a4f:/opt/airflow$

So, if you need help, please start a new discussion where you describe your circumstances and refer to this as a "similar issue". Commenting on a closed issue is almost always a bad idea because it is, well, closed. Creating a new issue (or a discussion, if unsure) where you describe your problem usually has a much better chance that something happens.

@ybryan

ybryan commented Dec 6, 2023

Thanks Jarek. New discussion here
