Addressing IllegalStateException due to double free of Connection reference by the Transport #34122
The problem - IllegalStateException
Some users reported seeing IllegalStateException (ISE) at random.
The reason for IllegalStateException
While there is no consistent repro, we were able to hit the ISE intermittently by disconnecting the network. This section documents the cause, as derived from the stack trace and code analysis.
The Qpid Connection object is a reference counted endpoint. As each referring Qpid resource (Reactor, Transport, Session) using it gets freed, the reference count gets decremented.
The association (binding) of Qpid Transport with the Qpid Connection increases the Connection's reference count by one.
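At the end of the Transport's life, two things (of many) need to happen: the Transport must be unbound from the Connection, and the Connection's reference count must be reduced by one.
The Transport::unbind API takes care of both, i.e., in addition to unbinding, the API reduces the Connection's reference count.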
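The bind/unbind accounting described above can be sketched with a toy model. The classes `ToyConnection` and `ToyTransport` below are hypothetical stand-ins written for illustration; they are not the Qpid Proton-J types, and the starting count of one is an assumption.

```java
// Toy model only: hypothetical classes, NOT the Qpid Proton-J implementation.
final class ToyConnection {
    private int refCount = 1;          // assumed: one reference held by some other resource
    private ToyTransport transport;    // null whenever no transport is bound

    void incref() { refCount++; }

    void decref() {
        refCount--;
        if (refCount < 0) {
            throw new IllegalStateException("reference count dropped below zero");
        }
    }

    int refCount() { return refCount; }
    ToyTransport getTransport() { return transport; }
    void setTransport(ToyTransport t) { transport = t; }
}

final class ToyTransport {
    private ToyConnection connection;

    // Binding associates the transport with the connection and adds one reference.
    void bind(ToyConnection c) {
        connection = c;
        c.setTransport(this);
        c.incref();
    }

    // Unbinding clears the association AND releases the reference in one call.
    // Nothing here guards against a second call, which mirrors the hazard
    // this description is about.
    void unbind() {
        connection.setTransport(null);
        connection.decref();
    }
}
```

Binding raises the count from 1 to 2, and a single unbind returns it to 1 and leaves `getTransport()` returning null.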
The code flow leading to IllegalStateException
The 'ConnectionHandler' has a transport_error event callback performing Transport unbind.
The Qpid 'Default Global IO Handler' also performs Transport unbind on the transport_closed event.
While Transport is only ONE logical reference to the Connection, two terminal (transport_error, transport_closed) events of the same Transport may reduce the reference count by TWO. When the stack runs into this undesired reference count reduction, it MAY lead to IllegalStateException (ISE).
The ISE on the transport_closed event is described as something that "MAY" happen, which is what makes it random, because it depends on one more circumstance: all other referring resources must already have released their references, so that the count is zero by the time the 'Default Global IO Handler' gets to handle the transport_closed event.
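To illustrate why the failure depends on timing, here is a small simulation. It is a sketch under assumptions: `Conn`, `secondUnbindThrows`, and its parameters are hypothetical names, and the counter only imitates the behavior described above rather than reproducing the Qpid code.

```java
// Toy simulation only: hypothetical names, not the Qpid Proton-J implementation.
final class DoubleUnbindDemo {
    static final class Conn {
        int refCount;

        void decref() {
            if (--refCount < 0) {
                throw new IllegalStateException("reference count dropped below zero");
            }
        }
    }

    /**
     * Returns true if the unbind performed on transport_closed throws.
     *
     * @param otherRefs            references held by resources other than the Transport
     * @param releasedBeforeClosed how many of those were released before transport_closed
     */
    static boolean secondUnbindThrows(int otherRefs, int releasedBeforeClosed) {
        if (releasedBeforeClosed < 0 || releasedBeforeClosed > otherRefs) {
            throw new IllegalArgumentException("releasedBeforeClosed out of range");
        }
        Conn c = new Conn();
        c.refCount = otherRefs + 1;      // +1 for the Transport binding
        c.decref();                      // transport_error callback: first unbind
        for (int i = 0; i < releasedBeforeClosed; i++) {
            c.decref();                  // other referring resources being freed
        }
        try {
            c.decref();                  // transport_closed handler: second unbind
            return false;                // count is now wrong, but nothing was thrown
        } catch (IllegalStateException e) {
            return true;                 // count was already zero
        }
    }
}
```

With two other references, `secondUnbindThrows(2, 2)` throws because the count has already reached zero, while `secondUnbindThrows(2, 1)` does not throw even though the count is one lower than it should be.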
The expected code behavior for correctness
One may ask: why is it required to unbind on transport_error when transport_closed already does the unbind? Digging through the history indicates that Qpid IO may trigger only transport_error, without transport_closed.
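For correctness, the unbind needs to happen at least once (else memory leak) and at most once (else undesired reference count reduction) in each of the following 3 mutually exclusive cases: only transport_error is triggered, only transport_closed is triggered, or both are triggered.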
The transport_closed handling in TransportHandler
The Azure Amqp-Core has a type TransportHandler. This handler has a callback for the transport_closed event, with logic that checks whether it is safe to unbind.
The safety check is via 'connection.getTransport'; if it returns null, the unbind has already been done, so the handler skips a second unbind that could cause an undesired reference count reduction.
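The null check described above can be sketched as follows. This is a minimal illustration, not the TransportHandler source: `Conn`, `unbindIfBound`, and the starting count of two are assumptions made for the example.

```java
// Toy sketch only: hypothetical names, not the Azure Amqp-Core or Qpid code.
final class GuardedUnbindDemo {
    static final class Conn {
        private int refCount = 2;                 // assumed: one other holder + the Transport binding
        private Object transport = new Object();  // non-null while a transport is bound

        Object getTransport() { return transport; }
        int refCount() { return refCount; }

        // Stands in for Transport::unbind: clears the binding and drops one reference.
        void unbindTransport() {
            transport = null;
            refCount--;
        }
    }

    // The safety check: unbind only if the connection still reports a transport.
    static void unbindIfBound(Conn c) {
        if (c.getTransport() != null) {
            c.unbindTransport();
        }
    }

    // Delivers both terminal events to the guarded unbind and reports the final count.
    static int refCountAfterTerminalEvents() {
        Conn c = new Conn();
        unbindIfBound(c);   // transport_error: performs the unbind
        unbindIfBound(c);   // transport_closed: sees null and does nothing
        return c.refCount();
    }
}
```

With the guard in place, two terminal events reduce the count exactly once, so `refCountAfterTerminalEvents()` returns 1.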
But why is the TransportHandler not preventing the ISE today? The TransportHandler is "added" as a Global handler after the Qpid 'Default Global IO Handler'. Since TransportHandler is the second Global Handler, the first handler, i.e. the Qpid 'Default Global IO Handler', runs before it and performs the unbind without any safety check; when the reference count is already zero, this causes the IllegalStateException.
Addressing the problem
To address the problem, change TransportHandler to inherit from the current Qpid 'Default Global IO Handler' and make it the "only" Global Handler, so that the transport_closed event always goes through the safety check before any unbind.
For better readability, rename the 'TransportHandler' to 'GlobalIOHandler'.
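The difference between the two handler layouts can be sketched as below. This is a toy model under assumptions: `DefaultIoHandler` and `GuardedIoHandler` are hypothetical stand-ins for the Qpid default handler and the renamed GlobalIOHandler, and the event dispatch is reduced to a simple loop over a list.

```java
import java.util.List;

// Toy model only: hypothetical classes, not the Qpid or Azure Amqp-Core code.
final class HandlerOrderDemo {
    static final class Conn {
        int refCount = 1;        // assumed: only the Transport binding is left
        boolean bound = true;    // stands in for getTransport() != null

        void unbind() {
            bound = false;
            if (--refCount < 0) {
                throw new IllegalStateException("reference count dropped below zero");
            }
        }
    }

    /** Stand-in for the default IO handler: unbinds without any check. */
    static class DefaultIoHandler {
        void onTransportClosed(Conn c) {
            c.unbind();
        }
    }

    /** Stand-in for the single global handler: inherits, but guards the unbind. */
    static final class GuardedIoHandler extends DefaultIoHandler {
        @Override
        void onTransportClosed(Conn c) {
            if (c.bound) {
                super.onTransportClosed(c);
            }
        }
    }

    /**
     * Simulates an unbind on transport_error followed by transport_closed
     * being delivered to the global handlers in order.
     * Returns true if handling transport_closed throws.
     */
    static boolean closeThrows(List<DefaultIoHandler> globalHandlers) {
        Conn c = new Conn();
        c.unbind();              // transport_error callback: refCount is now 0
        try {
            for (DefaultIoHandler h : globalHandlers) {
                h.onTransportClosed(c);
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

With an unguarded handler ahead of a guarded one, the first handler throws before the guard is ever reached; with the guarded subclass as the only handler, the second unbind is skipped.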
Additional notes