consumer disconnect during join -- hang #2615
Comments
I am repeating the test on version 1.2.1 and it has been stable so far. Looking at the changelog, I wonder if this bug has already been fixed. I will report back if the hang happens; otherwise I will close this issue once the test has been stable for 24 hours of running.
Unfortunately, this condition also happens in v1.2.1. I will enable debug "protocol" and post logs here soon.
The logs for v1.2.1 look like the same problem: disconnect was called during message consumption, then a heartbeat failure caused a rejoin, and the disconnect callback never got called.
After reading through old issue tickets, I found issue 196 in node-rdkafka. I can confirm that without specifying rebalance_cb, the test has been running fine for the past 6 hours. So it appears that the hang still exists when rebalance_cb is used.
Could you reproduce this with the upcoming v1.4.0 release (wait two weeks or grab v1.4.0-RC4 or librdkafka master) and set
Description
I have 2 consumer clients that disconnect from the broker during a graceful shutdown. The issue arises when the first client disconnects: the second client receives a heartbeat failure (the broker is rebalancing) and rejoins. If the second client disconnects during this process, the disconnect callback never gets called, leaving the client hung.
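One way to make this hang visible from the application side is to put a watchdog around the disconnect call. The sketch below is not part of node-rdkafka or librdkafka: `disconnectWithTimeout` and its timeout are hypothetical, and the only assumption is that the client exposes a callback-style `disconnect(cb)`, as the node-rdkafka consumer does. It does not fix the hang; it only reports it.

```javascript
// Hypothetical helper: reject if the disconnect callback is not invoked in time.
// Assumes only that `client.disconnect(cb)` exists with a Node-style callback.
function disconnectWithTimeout(client, timeoutMs) {
  return new Promise((resolve, reject) => {
    let settled = false;

    // Watchdog: fires if the disconnect callback never arrives.
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      reject(new Error(`disconnect callback not called within ${timeoutMs} ms`));
    }, timeoutMs);

    client.disconnect((err, data) => {
      if (settled) return; // callback arrived after the watchdog already fired
      settled = true;
      clearTimeout(timer);
      if (err) reject(err);
      else resolve(data);
    });
  });
}
```

A shutdown path could await this helper and log or force-exit on rejection, instead of waiting indefinitely for a callback that may never come.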
How to reproduce
I created a test harness with 2 clients; every 30 seconds, each client disconnects from the broker and reconnects. The issue can be reproduced within 30-60 minutes of running the harness.
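The harness itself is not shown in this issue, so the following is only a rough sketch of what such a connect/disconnect loop might look like. `runCycles`, `makeClients` and `call` are hypothetical names, and the clients are assumed to be thin wrappers exposing callback-style `connect(cb)` and `disconnect(cb)`; adapting them to the real node-rdkafka consumer is left out.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: turn a Node-style callback method into a promise.
const call = (obj, method) =>
  new Promise((resolve, reject) =>
    obj[method]((err, res) => (err ? reject(err) : resolve(res))));

// Hypothetical harness: `makeClients` returns fresh client wrappers
// (for example two consumers in the same group) for each cycle.
async function runCycles(makeClients, cycles, intervalMs) {
  let completed = 0;
  for (let i = 0; i < cycles; i++) {
    const clients = makeClients();
    await Promise.all(clients.map((c) => call(c, 'connect')));
    await sleep(intervalMs); // stay connected for a while, e.g. 30000 ms

    // Disconnect one after the other, so a later client can still be
    // rejoining when its own disconnect is requested.
    for (const c of clients) {
      await call(c, 'disconnect');
    }
    completed++;
  }
  return completed;
}
```

If a disconnect callback is never invoked, the loop simply stops making progress at that cycle, which is how a hang like the one described here would show up.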
Checklist
librdkafka version: 1.1.0
Apache Kafka version: 2.12
librdkafka client configuration: default values
Operating system: rhel7
Provide logs (with debug=.. as necessary) from librdkafka
From the logs of the second consumer above, it can be seen that it is rejoining on a rebalance (after the first consumer disconnected). After this point, the consumer never gets a callback for the disconnect and still thinks it is connected. Any call to consume messages returns "KafkaConsumer is not connected".
Note: I am using node-rdkafka, but looking at the source, it calls the consumer's close() method in librdkafka.
I can track this further if required; I just need some guidance on which part of the code to focus on.