Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make SliceExecutor extensible and include method to computeSlices. This will allow different custom implementation to be plugged in with IndexSearcher to compute and execute slices on provided executor #12348

Closed
wants to merge 1 commit into from

Conversation

sohami
Copy link
Contributor

@sohami sohami commented Jun 5, 2023

Description

As mentioned in the issue #12347, this PR is allowing to extend SliceExecutor and move the slice computation in it. This will allow the extensions of IndexSearcher to provide custom SliceExecutor and also custom logic to compute and execute the slices for an index.

…is will allow different custom implementation

to be plugged in with IndexSearcher to compute and execute slices on provided executor
Copy link
Contributor

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, I left some comments. I am curious to hear what others think about it as well, for instance @jpountz .

*
* @lucene.experimental
*/
public IndexSearcher(IndexReaderContext context, Executor executor, SliceExecutor sliceExecutor) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should make this constructor public, or instead make the getSliceExecutionControlPlane method public. Would this second option work for what you need or would you still run into issues as it's called from the IndexSearcher constructor a bit like slices?

Otherwise I'd think we should look at having a single way, either provide an Executor or a SliceExecutor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should make this constructor public, or instead make the getSliceExecutionControlPlane method public. Would this second option work for what you need or would you still run into issues as it's called from the IndexSearcher constructor a bit like slices?

Yes that would result in same issue

Otherwise I'd think we should look at having a single way, either provide an Executor or a SliceExecutor.

There are already constructors which takes in Executor. So was unable to modify this one to just take in SliceExecutor. I also added an assertion for this constructor that executor instance is same as one in SliceExecutor

@@ -55,4 +74,8 @@ boolean shouldExecuteOnCallerThread(int index, int numTasks) {
// Execute last task on caller thread
return index == numTasks - 1;
}

public Executor getExecutor() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there would be no need for this method if we made getSliceExecutionControlPlane public


/**
* Executor which is responsible for execution of slices based on the current status of the system
* and current system load
*/
class SliceExecutor {
public class SliceExecutor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking this class should be marked experimental if it's made public

* @return computed slices
*/
public LeafSlice[] computeSlices(List<LeafReaderContext> leaves) {
return IndexSearcher.slices(leaves, MAX_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it'd make sense to move the slices static method from IndexSearcher to SliceExecutor, so that we don't have to call IndexSearcher.slices from here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't moved it to avoid breaking consumers as static slices method is public

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could still have the existing method call the one exposed by the SliceExecutor.

@@ -319,11 +323,10 @@ public QueryCachingPolicy getQueryCachingPolicy() {

/**
* Expert: Creates an array of leaf slices each holding a subset of the given leaves. Each {@link
* LeafSlice} is executed in a single thread. By default, segments with more than
* MAX_DOCS_PER_SLICE will get their own thread
* LeafSlice} is executed in a single thread.
*/
protected LeafSlice[] slices(List<LeafReaderContext> leaves) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the point of overriding this if we are making SliceExecutor public in order for it to be extended. I would remove this method from IndexSearcher.

Copy link
Contributor Author

@sohami sohami Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it like this to avoid breaking extensions already overriding this method. I wanted to backport this change to 9.x branch as well so wanted to avoid any breaking change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good thinking, but it may be ok to make this breaking change, depending on how the discussion progresses in the issue, and then make this backwards compatible when backporting to 9.x

Collections.sort(leavesList, Comparator.comparingInt(l -> l.docBase));
this.leaves = leavesList.toArray(new LeafReaderContext[0]);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's necessary to move this class to be an outer class. It is marked experimental, so we should be free to move it, yet I would tend to avoid breaking consumers if possible.

Copy link
Contributor Author

@sohami sohami Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack I can move it back since it was getting used by IndexSearcher and SliceExecutor, I thought of moving it out. But you are right it can break the consumers.

return IndexSearcher.slices(leaves, MAX_DOCS_PER_SLICE, MAX_SEGMENTS_PER_SLICE);
}

public void invokeAll(Collection<? extends Runnable> tasks) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add javadocs now that the method is public?

@@ -55,4 +74,8 @@ boolean shouldExecuteOnCallerThread(int index, int numTasks) {
// Execute last task on caller thread
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make shouldExecuteOnCallerThread also public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't see a need for that as it is internal to the implementation of SliceExecutor. A custom SliceExecutor can define its own implementation of invokeAll as needed.

@sohami
Copy link
Contributor Author

sohami commented Jun 8, 2023

Thanks for working on this, I left some comments. I am curious to hear what others think about it as well, for instance @jpountz .

@javanna Thanks for taking a look, have responded to your feedback.

@jpountz
Copy link
Contributor

jpountz commented Jun 9, 2023

I am curious to hear what others think about it as well, for instance @jpountz .

FYI I dropped a comment on the linked issue as I have questions on the general approach we should take.

@sohami
Copy link
Contributor Author

sohami commented Jul 11, 2023

Closing this PR as it is solved by #12374

@sohami sohami closed this Jul 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants