Race condition where the HystrixMetricsPublisherThreadPool isn't referencing the correct ThreadPoolExecutor instance #270
A race condition exists when constructing a HystrixThreadPool: the ThreadPoolExecutor used by HystrixThreadPool.Factory may not be the same instance as the one associated with the HystrixMetricsPublisherThreadPool.
This causes a disconnect between the ThreadPoolExecutor's actual metrics and those supplied by HystrixMetricsThreadPool.
Now the ThreadPoolExecutor itself is retrieved from the HystrixMetricsThreadPool object, because that object is protected behind the ConcurrentMap#putIfAbsent pattern, so every caller gets the same instance. Doing this through the ConcurrencyStrategy seemed much more difficult, since any further custom implementations would not have similar concurrency guarantees.
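
The putIfAbsent pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual Hystrix code: `PoolMetricsSketch` and `ThreadPoolSketch` are hypothetical stand-ins for the metrics object and the thread pool wrapper, and shutting down the losing executor is an assumption of this sketch rather than something this change states.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the metrics object (not the real Hystrix class).
final class PoolMetricsSketch {
    // One metrics instance per pool key, guarded by putIfAbsent.
    private static final ConcurrentMap<String, PoolMetricsSketch> METRICS =
            new ConcurrentHashMap<String, PoolMetricsSketch>();

    private final ThreadPoolExecutor executor;

    private PoolMetricsSketch(ThreadPoolExecutor executor) {
        this.executor = executor;
    }

    static PoolMetricsSketch getInstance(String key, ThreadPoolExecutor candidate) {
        PoolMetricsSketch existing = METRICS.get(key);
        if (existing != null) {
            return existing;
        }
        PoolMetricsSketch created = new PoolMetricsSketch(candidate);
        // If another thread won the race, putIfAbsent returns its instance.
        PoolMetricsSketch raced = METRICS.putIfAbsent(key, created);
        return raced != null ? raced : created;
    }

    ThreadPoolExecutor getThreadPool() {
        return executor;
    }
}

// Hypothetical stand-in for the thread pool wrapper.
final class ThreadPoolSketch {
    private final ThreadPoolExecutor executor;

    ThreadPoolSketch(String key) {
        ThreadPoolExecutor candidate = new ThreadPoolExecutor(
                1, 1, 1L, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
        PoolMetricsSketch metrics = PoolMetricsSketch.getInstance(key, candidate);
        // Use the executor held by the metrics object, not the local candidate,
        // so the pool and its metrics always refer to the same instance.
        this.executor = metrics.getThreadPool();
        if (this.executor != candidate) {
            candidate.shutdown(); // assumption of this sketch: discard the loser
        }
    }

    ThreadPoolExecutor getExecutor() {
        return executor;
    }
}
```

Because the wrapper reads the executor back from the metrics object instead of keeping its own candidate, two wrappers built for the same key end up sharing one executor even if both constructors run concurrently.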
A test has been added to show that the ThreadPoolExecutor that gets constructed is indeed the same instance as the one associated with the metrics publisher. It was difficult to write a single test that proves the bad case and the good case at the same time, so once the solution was in place I turned the failing test case into a validation test case.
In addition, this exposed some static state that was kept around between tests and needed to be cleaned up. In particular, the HystrixMetricsPublisherFactory has a singleton object that had to be reset (even though it's a singleton) in order to validate the test. I'm not particularly happy with this approach, so I'd welcome any help here.