-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SpanBoostQuery.rewrite was incomplete for boost==1 factor [LUCENE-10477] #11513
Comments
Adrien Grand (@jpountz) (migrated from JIRA) It should be ok given that callers are expected to call |
Christine Poerschke (@cpoerschke) (migrated from JIRA) For 8.0 this should be the equivalent code area i.e. (one) rewrite call at line 309 and then line 311 weight creation: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.0.0/lucene/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTermExtractor.java#L309-L311 |
Adrien Grand (@jpountz) (migrated from JIRA) Thanks Christine, this is where the bug lies in my opinion. It should rewrite multiple times if necessary like IndexSearcher does: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.0.0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L665-L671 |
Christine Poerschke (@cpoerschke) (migrated from JIRA)
Interesting, didn't know that. Would it be worth documenting e.g. in the javadocs? https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/Query.java#L73-L79 |
Christine Poerschke (@cpoerschke) (migrated from JIRA) #737 is the alternative pull request. |
ASF subversion and git services (migrated from JIRA) Commit ca252d6 in lucene's branch refs/heads/main from Christine Poerschke #11500, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (#737) |
ASF subversion and git services (migrated from JIRA) Commit e7367f3 in lucene's branch refs/heads/branch_9x from Christine Poerschke #11500, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (#737) (cherry picked from commit ca252d6) |
Christine Poerschke (@cpoerschke) (migrated from JIRA)
#758 opened for that. |
ASF subversion and git services (migrated from JIRA) Commit 779c332 in lucene's branch refs/heads/main from Christine Poerschke LUCENE-10477: mention 'call multiple times' in Query.rewrite javadoc (#758) |
ASF subversion and git services (migrated from JIRA) Commit ffb3168 in lucene's branch refs/heads/branch_9x from Christine Poerschke LUCENE-10477: mention 'call multiple times' in Query.rewrite javadoc (#758) (cherry picked from commit 779c332) |
Christine Poerschke (@cpoerschke) (migrated from JIRA) Thanks @jpountz for the collaboration here! |
Adrien Grand (@jpountz) (migrated from JIRA) @cpoerschke Should we backport it to 8.11 now that a 8.11.2 release is being considered? |
Christine Poerschke (@cpoerschke) (migrated from JIRA) re-opening for potential backport: apache/lucene-solr#2656 |
ASF subversion and git services (migrated from JIRA) Commit ece0f43b591d28cc7d41ff57b1db6ddcf4df6f8d in lucene-solr's branch refs/heads/branch_8_11 from Christine Poerschke #11500, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (#2656) Also mention 'call multiple times' in Query.rewrite javadoc. |
Alan Woodward (@romseygeek) (migrated from JIRA) Bulk close for 9.2.0 release |
(This bug report concerns pre-9.0 code only but it's so subtle that it warrants sharing I think and maybe fixing if there was to be a 8.11.2 release in future.)
Some existing code e.g. https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/queryparser/src/java/org/apache/lucene/queryparser/xml/builders/SpanNearBuilder.java#L54 adds a
SpanBoostQuery
even if there is no boost or the boost factor is1.0
i.e. technically wrapping is unnecessary.Query rewriting should counteract this somewhat except it might not e.g. note at https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/core/src/java/org/apache/lucene/search/spans/SpanBoostQuery.java#L81-L83 how the rewrite is a no-op i.e.
this.query.rewrite
is not called!This can then manifest in strange ways e.g. during highlighting:
This stacktrace is not from 8.11.1 code but the general logic is that at line 293 rewrite was called (except it didn't a full rewrite because of
SpanBoostQuery
wrapping around theSpanNearQuery
}) and so then at line 295 theIllegalArgumentException("Rewrite first!")
arises: https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.1/lucene/core/src/java/org/apache/lucene/search/spans/SpanMultiTermQueryWrapper.java#L101Migrated from LUCENE-10477 by Christine Poerschke (@cpoerschke), resolved May 18 2022
Linked issues:
Pull requests: apache/lucene-solr#2650, #758
The text was updated successfully, but these errors were encountered: