Forced connection close on Bunny with ConnectionPool can't recover #589
Thank you for your time. Team RabbitMQ uses GitHub issues for specific actionable items engineers can work on. GitHub issues are not used for questions, investigations, root cause analysis, discussions of potential issues, etc (as defined by this team). We get at least a dozen questions through various venues every single day, often light on details. Please post this to rabbitmq-users. Thank you.
I'd only investigate issues that can be reproduced without 3rd party gems. Connection pooling can make all kinds of assumptions about the resources it pools, and pooling by definition is caching, which introduces a significant layer of complexity. According to the log, Bunny did begin to recover, hit an exception and, because you have …
That exception looks interesting. It happened when Bunny tried to shut down its consumer work pool. I have no idea what might have caused it, but since this is during shutdown, we can only try to catch/rescue this exception and ignore it… but the root cause is not this obscure assertion violation, it's the fact that …
I wonder if what we did in 7b8f24e for the heartbeat sender thread should be done to the I/O thread. No strong opinion here, but it's largely a similar problem in a different place. A PR with a similar change to the main loop thread would be considered.
Thanks for your time and elaboration @michaelklishin. After further testing I've been able to reduce the issue to the gem `connection_pool`.
@fuegas if you can test with Bunny from a local repo, consider setting …
TIL (from mperham/connection_pool@306fd1b) that …
@michaelklishin: I've been able to make a minimal version to reproduce the issue:
Steps to reproduce:
Versions:
I've tried to dive deeper into what is happening. What I see as a difference in flow is that the unpooled version raises a …. Both pooled and unpooled are threaded and have …. The test file I load into IRB looks as follows:

```ruby
require 'bunny'
require 'connection_pool'

AMQP_URL = 'amqp://guest:guest@localhost/dev'.freeze

# Pooled variant: a single-slot ConnectionPool that hands out a channel
# on a started Bunny connection.
P1 = ConnectionPool.new(size: 1) do
  connection = Bunny.new(
    AMQP_URL,
    # log_level: ::Logger::DEBUG,
  ).tap(&:start)
  connection.create_channel
end.freeze

# Unpooled variant: same `with` interface, but simply memoizes one channel.
class BunnyPool
  def with
    yield channel
  end

  def channel
    @channel ||= begin
      connection = Bunny.new(
        AMQP_URL,
        # log_level: ::Logger::DEBUG,
      ).tap(&:start)
      connection.create_channel
    end
  end
end

P2 = BunnyPool.new

def dump_threads
  Thread.list.each { |t| puts t.inspect }
  true
end

def pooled; P1.with(&:status); end
def unpooled; P2.with(&:status); end
```

A run from IRB consists of dumping threads using `dump_threads`. In the logs of the runs below I've replaced the gem path with a placeholder.

The unpooled version recovers from a forced disconnect:
The pooled version does not recover from a forced disconnect:
If I change `join` to:

```ruby
def join
  puts "[#{::File.basename(__FILE__)}:#{__LINE__}] Joining @thread: #{@thread.inspect} from #{Thread.current.inspect}"
  if @thread == Thread.current
    puts "[#{::File.basename(__FILE__)}:#{__LINE__}] Not joining current Thread"
    return
  end
  @thread.join if @thread
end
```

Then the recovery works as intended:
@michaelklishin: What is your opinion of changing

```ruby
def join
  @thread.join if @thread
end
```

into

```ruby
def join
  @thread.join if @thread && @thread != Thread.current
end
```

?
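For context on why the guard matters: a Ruby thread may not join itself, so when recovery ends up calling `join` from the reader loop's own thread, the unguarded version blows up. A minimal sketch (not Bunny code) of the underlying behaviour:

```ruby
# Joining the current thread is illegal in Ruby and raises ThreadError;
# this is the failure mode the `@thread != Thread.current` guard avoids.
begin
  Thread.current.join
rescue ThreadError => e
  puts e.message # e.g. "Target thread must not be current thread"
end
```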
@fuegas yes, in Ruby it matters in what thread the exception originated. In addition, I'd be happy to consider and test the change you are proposing, but it would be great to get to the bottom of this. Semantically it makes sense to not join a thread with itself. I wonder if that could cause a previously unhandled exception to be re-raised at the moment the code absolutely does not expect it? While we don't need it here, researching the state of Ruby concurrency primitives in 2020 led me to discover Thread.handle_interrupt.
After further investigation, I've found (https://github.com/mperham/connection_pool/blob/master/lib/connection_pool.rb#L58):

```ruby
def with(options = {})
  Thread.handle_interrupt(Exception => :never) do
    conn = checkout(options)
    begin
      Thread.handle_interrupt(Exception => :immediate) do
        yield conn
      end
    ensure
      checkin
    end
  end
end
```

This means that the connection is made while inside `Thread.handle_interrupt(Exception => :never)`, i.e. with asynchronous exceptions deferred.
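To see the effect in isolation, here is a minimal sketch (not ConnectionPool or Bunny code): under `Exception => :never`, an asynchronous exception delivered via `Thread#raise` is held back and only fires once the block is left.

```ruby
worker = Thread.new do
  Thread.handle_interrupt(RuntimeError => :never) do
    sleep 1 # an interrupt raised during this sleep is deferred, not delivered
    puts 'still inside the :never block'
  end
  puts 'never reached: the pending interrupt fires as the block exits'
end

sleep 0.2
worker.raise RuntimeError, 'simulated connection failure'

begin
  worker.join
rescue RuntimeError => e
  puts "worker died with: #{e.message}" # => worker died with: simulated connection failure
end
```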
Oh, great find. Man, that's a bold decision to put something like that in a generic library. OK, since we have a sensible workaround and enough understanding, please submit a PR and I will test it a bit. Assuming no issues pop up, we can merge it and produce a release 👏 Thank you so much for sticking with the problem!
Thank you as well for giving valuable pointers that helped me get to the core of the issue 👍
Hi,
I'm running into an issue using Bunny in a Rails project. I'm using the gem `connection_pool` to avoid setting up a connection every time I want to send a message. The initializer for the pool looks as follows, and I have a health check route which roughly does the following:
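A minimal sketch of what this setup looks like, assuming a hypothetical `RABBITMQ_POOL` constant; the pool shape follows the reproduction code earlier in this thread, and the health check details are assumptions as well:

```ruby
# config/initializers/rabbitmq.rb -- hypothetical sketch, not the original code
require 'bunny'
require 'connection_pool'

RABBITMQ_POOL = ConnectionPool.new(size: 5, timeout: 5) do
  connection = Bunny.new(ENV.fetch('AMQP_URL')).tap(&:start)
  connection.create_channel
end

# Hypothetical sketch of the health check: create a reply queue, publish a
# ping to it, and read it back (simplified; assumes the message has been
# routed by the time we poll).
def rabbitmq_health
  RABBITMQ_POOL.with do |channel|
    reply_queue = channel.temporary_queue # server-named, exclusive queue
    channel.default_exchange.publish('ping', routing_key: reply_queue.name)
    _delivery_info, _properties, payload = reply_queue.pop
    payload == 'ping'
  end
end
```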
The `rabbitmq_health` method creates a reply queue and sends a ping message to it, nothing special here. However, when I restart RabbitMQ or close the connection from the RabbitMQ management interface, I get the following error:
It tries to recover from a closed connection but then tries to join a thread which it is not allowed to join, resulting in an exception and breaking that connection. From that moment on, that connection will return:
To resolve the broken connection, the web service (Puma) needs to be restarted.
Do I have a flaw in the way I'm using Bunny? Is this caused by the usage of ConnectionPool?
(Gem) Versions: