Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UnwrappingReuseStrategy for AnalyzerWrapper #14154

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mayya-sharipova
Copy link
Contributor

Add a new reuse strategy WrappingReuseStrategy that consults
he wrapped analyzer's strategy to decide if components can be
reused or need to be updated.

Add an optional constructor to CompletionAnalayzer that uses this strategy.

Add a new reuse strategy WrappingReuseStrategy that consults
the wrapped analyzer's strategy to decide if components can be
reused or need to be updated.

Add an optional constructor to CompletionAnalayzer that uses this
strategy.
Copy link

@john-wagster john-wagster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API is a little concerning. I do not immediately know a better way to handle both wrappedComponents and also the inner storedValue :(

Comment on lines 124 to 136
public CompletionAnalyzer(
Analyzer analyzer,
boolean preserveSep,
boolean preservePositionIncrements,
ReuseStrategy fallbackStrategy) {
super(new WrappingReuseStrategy(fallbackStrategy));
// häckidy-hick-hack, because we cannot call super() with a reference to "this":
((WrappingReuseStrategy) getReuseStrategy()).setUp(this, analyzer, wrappedComponents);
this.analyzer = analyzer;
this.preserveSep = preserveSep;
this.preservePositionIncrements = preservePositionIncrements;
this.maxGraphExpansions = ConcatenateGraphFilter.DEFAULT_MAX_GRAPH_EXPANSIONS;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

   /**
   * JAVA DOCS about when you should use this and what it does
   */
   public static CompletionAnalyzer withWrappingReuseStrategy(
      Analyzer analyzer,
      boolean preserveSep,
      boolean preservePositionIncrements,
      ReuseStrategy fallbackStrategy) {
    CompletionAnalyzer completionAnalyzer = new CompletionAnalyzer(analyzer, preserveSep, preservePositionIncrements, fallbackStrategy);
    ((WrappingReuseStrategy)completionAnalyzer.getReuseStrategy()).setUp(completionAnalyzer, analyzer, completionAnalyzer.wrappedComponents);
    return completionAnalyzer;
  }

What if we made the ctor private, removed the setup hack and did it instead within a public static method?

Comment on lines 172 to 179
public void setUp(
AnalyzerWrapper wrapper,
Analyzer wrappedAnalyzer,
CloseableThreadLocal<TokenStreamComponents> wrappedComponents) {
this.wrapper = wrapper;
this.wrappedAnalyzer = wrappedAnalyzer;
this.wrappedComponents = wrappedComponents;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bugs me. We allow a reuse strategy to be mutated after construction. It seems ready to cause a bug.


@Override
public TokenStreamComponents getReusableComponents(Analyzer analyzer, String fieldName) {
if (analyzer == wrapper) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to know the instance, or can we do Analyzer analyzer instanceof WrapperAnalyzer wrapperAnalyzer then get the strategy from wrapperAnalyzer.getWrappedAnalyzer(fieldName).getReuseStategy().getReusableComponents...?

Comment on lines 196 to 205
public void setReusableComponents(
Analyzer analyzer, String fieldName, TokenStreamComponents components) {
if (analyzer == wrapper) {
setStoredValue(analyzer, components);
wrappedAnalyzer
.getReuseStrategy()
.setReusableComponents(wrappedAnalyzer, fieldName, wrappedComponents.get());
} else {
fallbackStrategy.setReusableComponents(analyzer, fieldName, components);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to know the instance, or can we do Analyzer analyzer instanceof WrapperAnalyzer wrapperAnalyzer then get the wrapped analyzer via getWrappedAnalyzer(fieldName)?

* be re-created. During components creation, this analyzer must store the wrapped analyzer's
* components in {@code wrappedComponents} local thread variable.
*/
public static final class WrappingReuseStrategy extends ReuseStrategy {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the better name is UnWrapping as if we find that the analyzer is a wrapped analyzer, we unwrap it and use the underlying analyzer?

Comment on lines 172 to 179
public void setUp(
AnalyzerWrapper wrapper,
Analyzer wrappedAnalyzer,
CloseableThreadLocal<TokenStreamComponents> wrappedComponents) {
this.wrapper = wrapper;
this.wrappedAnalyzer = wrappedAnalyzer;
this.wrappedComponents = wrappedComponents;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see that the "wrappedComponents" is key here. its tricky, as the analyzer provides access directly to storedValue and there is no way to override :(.

That seems like a bad API already, but we don't want to change all these long lived interfaces unnecessarily.

@jpountz
Copy link
Contributor

jpountz commented Jan 22, 2025

I don't like that CompletionAnalyzer needs to track a thread-local, the point of reuse strategy is to avoid this kind of thing. Also I'm not sure I understand why CompletionAnalyzer is different from other analyzer wrappers?

@mayya-sharipova
Copy link
Contributor Author

@benwtrent Thanks for the review, I am not happy with the design either, will see how I can incorporate your feedback.

I don't like that CompletionAnalyzer needs to track a thread-local, the point of reuse strategy is to avoid this kind of thing. Also I'm not sure I understand why CompletionAnalyzer is different from other analyzer wrappers?

@jpountz Thanks for looking. The idea of having an extra thread-local is for AnalyzerWrapper to keep track of wrappedComponents, but may be it doesn't need to be thread-local and can be just a local variable.
You are right, CompletionAnalyzer is not different from other analyzer wrappers, and I can extend this policy to all AnalyzerWrapper. CompletionAnalyzer is where we experienced a problem in Elasticsearch, where it did not respect a reuse policy of the wrapped analyzer.

@mayya-sharipova mayya-sharipova changed the title Add WrappingReuseStrategy for AnalyzerWrapper Add UnwrappingReuseStrategy for AnalyzerWrapper Jan 23, 2025
@mayya-sharipova
Copy link
Contributor Author

@jpountz @benwtrent I've addressed your comments in the last commit, please continue to review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants