Pool workers to handle incoming deliveries, ACKs etc. #906

Closed
stebet opened this issue Jul 8, 2020 · 7 comments

Comments

@stebet
Contributor

stebet commented Jul 8, 2020

After reading through a few issues (#876, #874) and doing some profiling I think I've found what the underlying issue is. Would love to get input from @danielmarbach, @bollhals and @bording on this.

Creating a new IModel currently requires creating a new ConsumerDispatcher, which creates a WorkPool to handle incoming deliveries, ACKs etc. This is all backed by a Channel and a Task that reads from that channel and executes the work items. The problem is that every time a new IModel is created, a new Task is created as well, which requires scheduling, and that doesn't happen immediately, especially under high churn. This effectively makes it fairly expensive to both create and dispose of an IModel instance. You can easily see this by profiling a program that does nothing more than create and dispose a lot of IModel instances on the latest versions.
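For reference, the per-model WorkPool shape being described is roughly "a Channel plus one long-running reader Task". This is only a simplified sketch to illustrate the per-model cost, not the client's actual implementation; the type and member names here are made up:

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Simplified, illustrative sketch of the "one Channel + one reader Task per IModel"
// shape described above. The real ConsumerDispatcher/WorkPool differs in detail.
internal sealed class WorkPoolSketch
{
    private readonly Channel<Func<Task>> _channel =
        Channel.CreateUnbounded<Func<Task>>(new UnboundedChannelOptions { SingleReader = true });
    private Task _readerLoop;

    // Every new IModel ends up starting a loop task like this one,
    // which is what makes model churn comparatively expensive.
    public void Start() => _readerLoop = Task.Run(LoopAsync);

    public void Enqueue(Func<Task> work) => _channel.Writer.TryWrite(work);

    public void Stop() => _channel.Writer.Complete();

    private async Task LoopAsync()
    {
        while (await _channel.Reader.WaitToReadAsync().ConfigureAwait(false))
        {
            while (_channel.Reader.TryRead(out Func<Task> work))
            {
                await work().ConfigureAwait(false);
            }
        }
    }
}
```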

I think it would be very beneficial to keep a pool of WorkPool-like worker tasks on the Connection itself. The IConsumerDispatchers could grab a worker from that pool and hand it the Work items to execute, instead of scheduling a new Task for every model. The new concurrent processing configuration by @danielmarbach might need a little extra thought to handle parallel processing of deliveries per IModel using a shared pool, but if the Semaphores are still kept on the dispatchers (which are per model), and those in turn decide how many tasks to hand over to the shared worker pool in parallel, I think that should still work fine.

So basically I'm proposing something like this:

  • Connection will have a shared pool of workers (initial size = Environment.ProcessorCount?, max size = something sane) to handle incoming actions/promises.
  • Model has a dispatcher which knows how many actions/promises it can process concurrently.
  • When a model has work to do (BasicDeliver, BasicAck etc.) it acquires its Semaphore, grabs a worker from the connection worker pool (or creates a new one if none is available, perhaps?), and schedules the work, handing over its Semaphore as well. When the worker is done executing, the Semaphore is released and the worker is returned to the pool. That means that if a new worker was created, the pool effectively grows larger as needed, and unless a lot of deliveries are taking place we should seldom need to spin up a new worker task; in the worst case, if we have emptied the pool, the task is scheduled to the ThreadPool just like it was before. (A rough sketch follows this list.)
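A rough sketch of what that could look like, assuming the shared pool is simply a Channel drained by a fixed number of worker tasks, and each dispatcher keeps a SemaphoreSlim to bound its in-flight work. All type names here (SharedWorkerPool, ModelDispatcherSketch) are hypothetical, not part of the client, and growing the pool on demand is omitted for brevity:

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical sketch only: a connection-wide pool of worker tasks plus a
// per-model SemaphoreSlim that decides how many work items a given model may
// have in flight at once.
internal sealed class SharedWorkerPool
{
    private readonly Channel<Func<Task>> _work = Channel.CreateUnbounded<Func<Task>>();

    public SharedWorkerPool(int workerCount)
    {
        // Initial size = Environment.ProcessorCount in the proposal above.
        for (int i = 0; i < workerCount; i++)
        {
            _ = Task.Run(async () =>
            {
                await foreach (Func<Task> item in _work.Reader.ReadAllAsync().ConfigureAwait(false))
                {
                    await item().ConfigureAwait(false);
                }
            });
        }
    }

    public void Schedule(Func<Task> work) => _work.Writer.TryWrite(work);
}

internal sealed class ModelDispatcherSketch
{
    private readonly SharedWorkerPool _pool;
    private readonly SemaphoreSlim _concurrency;

    public ModelDispatcherSketch(SharedWorkerPool pool, int consumerDispatchConcurrency)
    {
        _pool = pool;
        _concurrency = new SemaphoreSlim(consumerDispatchConcurrency);
    }

    // Called for BasicDeliver, BasicAck etc.; the semaphore is released once the
    // shared worker has finished executing the work item.
    public async Task DispatchAsync(Func<Task> work)
    {
        await _concurrency.WaitAsync().ConfigureAwait(false);
        _pool.Schedule(async () =>
        {
            try { await work().ConfigureAwait(false); }
            finally { _concurrency.Release(); }
        });
    }
}
```

Usage would then be something like one `new SharedWorkerPool(Environment.ProcessorCount)` per connection, with each model creating its own `ModelDispatcherSketch` over that shared pool.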

This makes sense in my head at least, and could also increase throughput for high-churn scenarios.

@michaelklishin
Member

Note that at some point in the past we had a dispatch pool per connection. I don't have the issue where this was changed at hand right now, but the reasoning was to improve concurrency for some workloads. We expected that it would be problematic for high channel churn workloads, but those workloads are problematic in many other ways. Channels are not meant to be very short-lived.

@stebet
Contributor Author

stebet commented Jul 8, 2020

Completely understand. In that case, this might be totally unnecessary, but at least it's food for thought :)

@bollhals
Contributor

bollhals commented Jul 8, 2020

Channels are not meant to be very short lived.

Also remember that since PR #878 you can share the channel from multiple threads, so opening / closing channels should also be needed less than before.

@stebet Do you have any data that shows where we spend how much time? (Just curious)

@bojanz27

Also remember that since PR #878 you can share the channel from multiple threads, so opening / closing channels should also be needed less than before.

If that is so, why not state it explicitly in the changelog? The phrasing that it's now "safer but should be avoided" only adds to the confusion.

@bollhals
Contributor

Also remember that since PR #878 you can share the channel from multiple threads, so opening / closing channels should also be needed less than before.

If that is so, why not state it explicitly in the changelog? The phrasing that it's now "safer but should be avoided" only adds to the confusion.

There is still the part about publish acknowledgments, which is not thread-safe yet. But I agree the documentation could be updated to reflect this more accurately.

@stebet
Contributor Author

stebet commented Oct 18, 2021

Channels are not meant to be very short lived.

Also remember that since PR #878 you can share the channel from multiple threads, so opening / closing channels should also be needed less than before.

@stebet Do you have any data that shows where we spend how much time? (Just curious)

Not recent data, no. I'll take a look and get up to speed soon. I've been busy with a bunch of other projects recently, but I haven't forgotten you guys :)

@lukebakken
Contributor

Closing due to lack of activity and my wish to get as many issues as possible closed prior to the v7 release.
