Allow extensions of IndexSearcher to provide custom SliceExecutor and slices computation #12347
Comments
I wonder if we should make SliceExecutor configurable, or if we should instead fix it? Having both the Executor and the SliceExecutor be user-configurable feels like too much to me: we should either make the executor configurable, or, if we think that abstraction is not sophisticated enough, make a more complex abstraction configurable whose concrete implementations would likely wrap an executor. SliceExecutor is based on the assumption that it is sometimes a good idea to run in the current thread instead of delegating to the executor. I have become increasingly doubtful about this because it makes it very hard to reason about the total number of threads that may run searches, since the threadpool that would call …
I agree that we should either have the executor or the slice executor configurable. On fixing the slice executor without making it configurable, my gut feeling is that it is going to be difficult to guarantee a default behaviour that is good for all consumers. I agree that executing slices on either the caller thread or the executor makes it difficult to reason about what happens where and to appropriately size thread pools. Though to have a clear distinction between a coordinator thread pool and a collection thread pool, sequential execution should also be offloaded to the executor, which is somewhat controversial. I am a bit nervous about making this the default, hence the thought of exposing a …
Thanks @jpountz and @javanna for the discussion. I also agree that we should have either of the two (the executor or the slice executor) configurable.
I agree here that exposing …
I am not sure on this, as currently …
heya @sohami, while I suggested making SliceExecutor pluggable, I also understand the implications that Adrien brought up. In particular, it does not feel like configuring slices is the right reason to make the slice executor pluggable at this stage. We may still do it in the future. In order to unblock you, I'll make a counter proposal: would it work to provide the slices as a constructor argument instead, for cases where the default slicing does not satisfy your requirements? Assuming that you have access to the …
@javanna I am looking to provide both custom …
My suggestion would be to take a …
The default SliceExecutor (i.e. QueueSizeBasedExecutor) applies backpressure during concurrent execution based on a limiting factor of 1.5 times the passed-in threadpool's maxPoolSize: if the queue size goes beyond that, it ends up executing all the tasks on the caller thread. In OpenSearch, there is a search threadpool which serves the search requests for all the Lucene indices (or OpenSearch shards) assigned to a node. Each node can receive requests for some or all of the indices on that node.
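As a rough illustration of that heuristic (this is not the actual QueueSizeBasedExecutor source, and the class and method names below are made up), the behaviour described above amounts to something like:

```java
import java.util.concurrent.ThreadPoolExecutor;

// Illustration only: offload work unless the executor's queue already holds
// more than 1.5x its maximum pool size, in which case run on the caller thread.
final class QueueSizeBackpressureSketch {
  private final ThreadPoolExecutor executor;

  QueueSizeBackpressureSketch(ThreadPoolExecutor executor) {
    this.executor = executor;
  }

  void execute(Runnable task) {
    int limit = (int) Math.ceil(executor.getMaximumPoolSize() * 1.5);
    if (executor.getQueue().size() >= limit) {
      task.run(); // queue saturated: fall back to the caller thread
    } else {
      executor.execute(task);
    }
  }
}
```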
heya @sohami thanks a lot for sharing more context.
Good point, agreed. Also, QueueSizeBasedExecutor is quite opinionated and not configurable, and it gets applied based on an instanceof check on the provided executor, which is not fantastic. Another thought on my end: executing sometimes on the caller thread and sometimes on the executor makes things hard to reason about: how do you size the two thread pools if you can't easily tell what load they are subjected to? Instead of making the slice executor configurable, then, I would consider removing it entirely and forcing the collection to always happen on the separate thread pool. I think we'll need to figure out how to handle rejections from the executor thread pool, as today the collection happens on the caller thread whenever there's a rejection, which I don't think is a behaviour we want to keep. We could also leave this to the executor implementation that is provided. I believe that the QueueSizeBasedExecutor was contributed by OpenSearch: would the approach suggested above be feasible for you folks? I am thinking it would simplify things and provide a better user experience for Lucene users.
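Purely as a sketch of the stricter policy described here (the helper below is hypothetical, not a Lucene API), always offloading and surfacing rejections might look like:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical sketch: collection is always offloaded, and a rejection becomes
// a visible failure rather than a silent caller-thread fallback.
final class StrictOffloadSketch {
  static Future<?> submitSlice(ExecutorService searchPool, Runnable sliceTask) {
    try {
      return searchPool.submit(sliceTask);
    } catch (RejectedExecutionException e) {
      // No sliceTask.run() fallback here: fail (or retry) the search instead,
      // so each pool's load stays predictable.
      throw new RejectedExecutionException("search pool saturated; rejecting slice", e);
    }
  }
}
```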
@javanna Thanks for your input.
As you mentioned earlier as well (and I agree), it is hard to come up with a default that works best for all usages. So providing a way to customize it gives users the flexibility to adapt it to their use cases. That way we can see which custom mechanisms work well across users and then change the default later as needed. I would also like to try to just remove the limiting factor but keep the mechanism of executing the last slice on the caller thread. So for now, can we split the issue into 2? We can potentially make the change for the 1st one now and follow up with the 2nd one. Thoughts?
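A minimal sketch of that second part, assuming plain java.util.concurrent types rather than the actual Lucene code: offload every slice except the last one, which the caller thread runs itself, with no queue-size limiting factor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.FutureTask;

// Sketch: all slices but the last go to the executor unconditionally;
// the last slice runs on the caller thread while the others are in flight.
final class LastSliceOnCallerSketch {
  static void executeAll(List<Runnable> sliceTasks, Executor executor)
      throws InterruptedException, ExecutionException {
    List<FutureTask<Void>> futures = new ArrayList<>();
    for (int i = 0; i < sliceTasks.size(); i++) {
      FutureTask<Void> future = new FutureTask<>(sliceTasks.get(i), null);
      futures.add(future);
      if (i == sliceTasks.size() - 1) {
        future.run(); // last slice: caller thread
      } else {
        executor.execute(future); // other slices: search executor, no limiting factor
      }
    }
    for (FutureTask<Void> future : futures) {
      future.get(); // wait for the offloaded slices to finish
    }
  }
}
```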
Sounds good, I'll happily review that change.
OK with discussing it, but I do think that making things pluggable is a change that's difficult to revert in terms of backwards compatibility, and I think we should put some effort into changing the current behaviour before we add new public abstractions.
Strong -1 to replacing the interface. I think it has worked well for many users for a while, and replacing it would break backwards compatibility to serve a specific use case. I am just catching up on this thread -- why does the current SliceExecutor not work for extension in this case?
@atris To summarize, there are 2 separate pieces of functionality I am looking to add:
@atris which interface do you mean? It's a package-private class, so there is no breaking change associated with it. I am talking about improving the default behaviour around concurrent execution of slices, as opposed to making the execution code a public interface. Flexibility can be nice, but it requires consumers to implement their own strategy, while I am hoping that we can come up with a default behaviour that all our users can benefit from, out of the box.
Description
For concurrent segment search, Lucene uses the slices method to compute the work units (slices) which can be processed concurrently.
a) It calculates slices in the constructor of IndexSearcher with default thresholds for document count and segment count (see the sketch after this list).
b) It provides an implementation of SliceExecutor (i.e. QueueSizeBasedExecutor), chosen based on the executor type, which applies backpressure in concurrent execution based on a limiting factor of 1.5 times the passed-in threadpool's maxPoolSize.
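As a rough, self-contained sketch of that kind of slicing (the thresholds and grouping order here are illustrative, not Lucene's actual defaults):

```java
import java.util.ArrayList;
import java.util.List;

// Illustration: group per-segment doc counts into slices until either a
// doc-count or a segment-count threshold would be exceeded.
final class SliceComputationSketch {
  static List<List<Integer>> slices(
      List<Integer> segmentDocCounts, int maxDocsPerSlice, int maxSegmentsPerSlice) {
    List<List<Integer>> slices = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    int docsInCurrent = 0;
    for (int docCount : segmentDocCounts) {
      boolean full =
          !current.isEmpty()
              && (docsInCurrent + docCount > maxDocsPerSlice
                  || current.size() >= maxSegmentsPerSlice);
      if (full) {
        slices.add(current);
        current = new ArrayList<>();
        docsInCurrent = 0;
      }
      current.add(docCount);
      docsInCurrent += docCount;
    }
    if (!current.isEmpty()) {
      slices.add(current);
    }
    return slices; // each inner list is one concurrent work unit
  }
}
```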
In OpenSearch, there is a search threadpool which serves the search requests for all the Lucene indices (or OpenSearch shards) assigned to a node. Each node can receive requests for some or all of the indices on that node.
I am exploring a mechanism to dynamically control the max slices for each Lucene index search request. For example, search requests to some indices on a node could have at most 4 slices each, and others at most 2 slices each. The threadpool shared to execute these slices then does not need any limiting factor: in this model, the top-level search threadpool limits the number of active search requests, which in turn limits the number of work units in the SliceExecutor threadpool.
For this, the derived implementation of IndexSearcher can take an input value in the constructor to control the slice count computation. Even though the slices method is protected, it gets called from the constructor of the base IndexSearcher class, which prevents the derived class from using the passed-in input. To achieve this, I am making changes along the lines suggested on the discussion thread in the dev mailing list, and would like to get some feedback.
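To make the limitation concrete, here is a sketch of the subclass shape being described (the class name and field are hypothetical; the signatures assume the IndexSearcher API of the Lucene 9.x line):

```java
import java.util.List;
import java.util.concurrent.Executor;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.IndexSearcher;

// Hypothetical subclass that wants to cap the slice count per request.
class BoundedSliceSearcher extends IndexSearcher {
  private final int maxSlices; // still unset when the base constructor runs

  BoundedSliceSearcher(IndexReader reader, Executor executor, int maxSlices) {
    super(reader, executor); // the base constructor computes the slices here
    this.maxSlices = maxSlices;
  }

  @Override
  protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
    // Because this override is invoked from super(...), maxSlices is still 0
    // at that point, so the value passed by the caller cannot influence the
    // computation -- which is exactly the limitation described above.
    return super.slices(leaves);
  }
}
```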