[Redis] Connection not released after DNS issue #36853
Comments
/cc @cescoffier (redis), @gsmet (redis), @machi1990 (redis)
@Ladicek didn't you fix the connection pool recently?
I fixed the Redis clustered client to actually use a connection pool instead of creating new connections all the time. This seems different. @lucaspouzac Do you have any indication of what the error might be? Otherwise it's gonna be a blind search.
@Ladicek Two seconds earlier, a thread was blocked on an HTTP call. Maybe related?
The application is compiled natively with Mandrel 23.1.
The stack trace comes from a slow DNS resolution. Do you see the same in JVM mode?
I have not tried deploying this application in JVM mode. Other Spring Boot applications running in JVM mode in the same k8s cluster and namespace do not seem to have this problem. We also have another natively compiled Quarkus application which does not seem to have any problems. It has not yet been migrated to Quarkus 3 (still on 2.8) and makes only HTTP calls, no Redis.
After checking, I found some DNS resolution problems in the cluster. They are sporadic: we observed very few such logs over one week. These cases do not cause visible problems, because a retry is performed and the connection is correctly released. I suspect that in certain cases of Redis connection errors, the connection is not correctly released back to the connection pool.
We may not release the connection after a DNS failure (that would obviously be a bug).
We got another issue reported where connections are not released in case of an exception. See #37041
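The suspected failure mode (a pooled connection that is returned on success but not on error) can be illustrated without any Redis dependency. The following is a minimal sketch in plain Java, not the Vert.x Redis client or Quarkus code. `TinyPool`, `runLeaky` and `runSafe` are hypothetical names made up for this illustration, and the pool only counts leases.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical stand-in for a connection pool: it only counts active leases.
class TinyPool {
    private final AtomicInteger active = new AtomicInteger();

    void acquire() { active.incrementAndGet(); }
    void release() { active.decrementAndGet(); }
    int active() { return active.get(); }

    // Leaky variant: thenApply only runs when the command succeeds,
    // so a failed command never gives its lease back.
    <T> CompletableFuture<T> runLeaky(Supplier<CompletableFuture<T>> command) {
        acquire();
        return command.get().thenApply(result -> {
            release();
            return result;
        });
    }

    // Safe variant: whenComplete runs on both success and failure.
    <T> CompletableFuture<T> runSafe(Supplier<CompletableFuture<T>> command) {
        acquire();
        return command.get().whenComplete((result, error) -> release());
    }
}

class PoolReleaseSketch {
    public static void main(String[] args) {
        TinyPool pool = new TinyPool();
        // Simulated command that fails, e.g. because the channel was closed.
        Supplier<CompletableFuture<String>> failing =
            () -> CompletableFuture.failedFuture(new IOException("closed channel"));

        pool.runLeaky(failing);
        System.out.println("active after leaky failure: " + pool.active());

        pool.runSafe(failing);
        System.out.println("active after safe failure: " + pool.active());
    }
}
```

In this sketch the leaky variant leaves one lease outstanding after a failure, while the safe variant returns to the count it started with. Whether the real client has a comparable gap on some error path is exactly the open question in this issue.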
Looking again at
This message seems to be logged by this handler: https://github.com/vert-x3/vertx-redis-client/blob/4.4/src/main/java/io/vertx/redis/client/impl/RedisConnectionManager.java#L46, which is installed as the connection's exception handler here: https://github.com/vert-x3/vertx-redis-client/blob/4.4/src/main/java/io/vertx/redis/client/impl/RedisConnectionManager.java#L208 The connection's exception handler is called on several places. All of them but one pass an Unless I'm seriously totally absolutely wrong and mistaken, that's where the log message in this issue comes from. However, that's the end of the
This means that:
Item 1 seems like an exceptional, but expected situation -- errors do occur in the wild. From the "stacktrace" in the log message, the error seems to be a "closed channel" and it occurred when writing [a command to Redis, probably?]. Item 2 is where it gets weird. If my analysis above is correct and the log message indeed comes from the
Is it possible that using Mutiny for the reactive part could induce this behavior?
Hard to say. In general it actually does the opposite and captures more failures. However, we may have forgotten an exception handler (so a failure is not propagated downstream, but reported separately).
OK. Moreover, the monitoring graph shows the number of connections increasing up to 22, while the configured maximum is 6 (the threshold is not exceeded while things work normally).
That looks like a pool issue in native. How do you get the number of connections? Metrics?
Yes, with the Redis metrics collected by Prometheus: https://quarkus.io/guides/redis-reference#enable-metrics
We deployed a version without native compilation, and we see the same problem.
Since Quarkus 3.7.x, the problem is no longer reproducible. Thanks
Describe the bug
When a connection is closed prematurely, it is not always released. We used redis-cli with Uni for reactive programming.
At the time of the problem, we see several log traces of this type:
quarkus redis configuration
The redis_pool_active metric reaches its maximum, and after that redis_pool_queue_size fills up as well.
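The order described here (active connections saturate first, then the wait queue) is what a bounded pool with a bounded wait queue would show when leases are never returned. Here is a minimal sketch of that behavior in plain Java. `BoundedPoolModel` is a hypothetical model, not the Vert.x pool implementation, and the wait-queue limit of 3 is an assumed value chosen only for the demo (the pool size of 6 is the maximum mentioned in the comments).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of a bounded pool with a bounded wait queue.
class BoundedPoolModel {
    private final int maxSize;
    private final int maxWaiting;
    private int active;
    private final Deque<Runnable> waiting = new ArrayDeque<>();

    BoundedPoolModel(int maxSize, int maxWaiting) {
        this.maxSize = maxSize;
        this.maxWaiting = maxWaiting;
    }

    // Returns false when both the pool and the wait queue are full.
    boolean acquire(Runnable onLease) {
        if (active < maxSize) {
            active++;
            onLease.run();
            return true;
        }
        if (waiting.size() < maxWaiting) {
            waiting.addLast(onLease);
            return true;
        }
        return false;
    }

    // Hands the lease to the oldest waiter if there is one; otherwise frees a slot.
    void release() {
        Runnable next = waiting.pollFirst();
        if (next != null) {
            next.run(); // lease is transferred, so the active count is unchanged
        } else {
            active--;
        }
    }

    int active() { return active; }
    int queued() { return waiting.size(); }
}

class PoolSaturationSketch {
    public static void main(String[] args) {
        // 6 is the configured maximum from the discussion; 3 is an assumed queue limit.
        BoundedPoolModel pool = new BoundedPoolModel(6, 3);
        int rejected = 0;
        // Ten requests whose leases are never released, as if every one of them leaked.
        for (int i = 0; i < 10; i++) {
            if (!pool.acquire(() -> { })) {
                rejected++;
            }
        }
        System.out.println("active=" + pool.active()
            + " queued=" + pool.queued()
            + " rejected=" + rejected);
    }
}
```

With these assumed limits, the first six requests take the pool slots, the next three wait, and the tenth is rejected. The model does not explain how the connection count could exceed the configured maximum, which is a separate observation in this thread.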
We use redis 7.2.2
Expected behavior
The Redis connection must be correctly released after an error.
Actual behavior
No response
How to Reproduce?
No response
Output of uname -a or ver
No response
Output of java -version
21
Quarkus version or git rev
3.5.0
Build tool (ie. output of mvnw --version or gradlew --version)
No response
Additional information
No response