SOLR-16675: dense vector function queries #1750
solr/core/src/test/org/apache/solr/search/function/TestDenseVectorValueSourceParser.java
I am just wondering whether the vector parsing can be done differently, by re-using existing code.
I'll investigate and discuss with @eliaporciani, but in the meantime, any other review is welcome!
Also, basic documentation and the changes entry are missing.
This: Not that I want to modify the old one. I believe the proposed syntax for the new similarity function is better, but I wanted to list the other existing function query for coherence and completeness. I don't think it's a big deal to have the "distance" function with one syntax and the "similarity" function with a different one for vectors.
When updating Lucene, please also fix the startup scripts to pass the vector incubator module on the command line if the Java version is exactly Java 20 or 21. Otherwise Lucene prints warnings and people will not get optimal performance. So I would really make the Lucene upgrade a separate issue/PR. All documentation should be updated, all startup scripts fixed, and so on.
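A rough sketch of the version check described above, written in Java only for illustration (the real change would live in the shell and Windows startup scripts). The class name, the method name, and the exact flag string `--add-modules jdk.incubator.vector` are assumptions, not taken from this PR.

```java
/** Hypothetical illustration only; the actual check belongs in the startup scripts. */
final class VectorModuleFlagSketch {

  /**
   * Returns the extra JVM option for the given Java feature version.
   * Assumption: the incubating vector module is enabled with
   * "--add-modules jdk.incubator.vector", and only for Java 20 or 21.
   */
  static String extraJvmFlag(int javaFeatureVersion) {
    if (javaFeatureVersion == 20 || javaFeatureVersion == 21) {
      return "--add-modules jdk.incubator.vector";
    }
    // Any other version: pass nothing extra.
    return "";
  }

  public static void main(String[] args) {
    // Runtime.version().feature() is the major version of the running JVM (Java 10+).
    int feature = Runtime.version().feature();
    System.out.println("Java " + feature + " -> '" + extraJvmFlag(feature) + "'");
  }
}
```

The check is an exact match on the feature version rather than a "20 or newer" comparison, following the wording "exactly Java 20 or 21" in the comment.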
Yes, it's a separate one; as soon as it's merged we'll proceed with this one.
On Thu, 6 Jul 2023, 17:44, Uwe Schindler wrote:
When updating Lucene, please also fix the startup scripts to pass vector
incubator module in command line if Java version is exactly Java 20 or 21.
Otherwise Lucene prints warnings and people will not get optimal
performance.
So I would really make the Lucene upgrade a separate issue/PR. All
documentation should be updated, all startup scripts fixed,....
@uschindler the Lucene upgrade is tracked under #1749. I am looking at what needs to be added to the startup script now.
Force-pushed from c94893d to 04b31e4.
@alessandrobenedetti there are some issues with the string formatting: you provide args but don't use "%s". You can probably just replace it with a string concatenation.
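For readers who have not hit this before, here is a minimal sketch of the problem and the two usual fixes. The class, the method names, and the message text are hypothetical and are not the code from this PR.

```java
/** Hypothetical illustration of a format call whose argument is never used. */
final class FormatMessageSketch {

  /** Problematic: the pattern has no "%s", so the extra argument is silently ignored. */
  static String withUnusedArg(String value) {
    return String.format("Invalid vector value: ", value);
  }

  /** Fix 1: add the missing placeholder. */
  static String withPlaceholder(String value) {
    return String.format("Invalid vector value: %s", value);
  }

  /** Fix 2: plain concatenation, which is enough for a single parameter. */
  static String withConcatenation(String value) {
    return "Invalid vector value: " + value;
  }

  public static void main(String[] args) {
    System.out.println(withUnusedArg("[1,2"));
    System.out.println(withPlaceholder("[1,2"));
    System.out.println(withConcatenation("[1,2"));
  }
}
```

`String.format` does not throw when it receives more arguments than format specifiers, which is why a mistake like this compiles and runs but produces a message without the value.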
I agree, not sure why the precommit succeeded locally.
Anyway, it's just one param to print, so I'll change Elia's PR and make it simple.
On Fri, 7 Jul 2023, 19:21, Houston Putman wrote:
@alessandrobenedetti <https://github.com/alessandrobenedetti> there are
some issues with the string formatting, namely you provide args but don't
use "%s". You can probably just replace with a string concatenation.
Force-pushed from 8745fb4 to 9a40aed.
Added unit tests for parsing vectors
Added unit tests
Handled whitespace in vector parsing
Force-pushed from 9a40aed to 4c8616a.
Co-authored-by: Alessandro Benedetti <a.benedetti@sease.io>
https://issues.apache.org/jira/browse/SOLR-16675
Description
Add function queries for dense vectors that can be used at rerank time.
Solution
Building on the latest Lucene changes for apache/lucene#12253, I added a parser in ValueSourceParser that takes:
vectorEncoding: FLOAT32 or BYTE
similarityFunction: COSINE, DOT_PRODUCT, EUCLIDEAN
valueSources: either a constant vector (e.g. [1,2,3...]) or the field name of a DenseVectorField.
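To make the inputs above concrete, here is a standalone sketch of parsing a constant vector (tolerating whitespace) and comparing two float vectors under each of the three similarity names. It is not the code in this PR: the class and method names are hypothetical, it only covers float vectors, and it returns the plain mathematical values (dot product, cosine, Euclidean distance), which are not necessarily the scores Lucene produces.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Solr or Lucene. */
final class VectorSimilaritySketch {

  /** Mirrors the three similarity names listed in the PR description. */
  enum Similarity { COSINE, DOT_PRODUCT, EUCLIDEAN }

  /** Parses a constant vector such as "[1, 2, 3]", tolerating surrounding whitespace. */
  static float[] parseVector(String text) {
    String trimmed = text.trim();
    if (trimmed.length() < 2 || !trimmed.startsWith("[") || !trimmed.endsWith("]")) {
      throw new IllegalArgumentException("Expected a vector like [1,2,3] but got: " + text);
    }
    String body = trimmed.substring(1, trimmed.length() - 1).trim();
    if (body.isEmpty()) {
      throw new IllegalArgumentException("Empty vector: " + text);
    }
    // Limit -1 keeps trailing empty elements, so "[1,2,]" fails below instead of passing.
    String[] parts = body.split(",", -1);
    float[] vector = new float[parts.length];
    for (int i = 0; i < parts.length; i++) {
      // Throws NumberFormatException (an IllegalArgumentException) on bad elements.
      vector[i] = Float.parseFloat(parts[i].trim());
    }
    return vector;
  }

  /** Raw mathematical value for each function; zero-length vectors give NaN for COSINE. */
  static double compute(Similarity similarity, float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "Dimension mismatch: " + a.length + " vs " + b.length);
    }
    double dot = 0, normA = 0, normB = 0, squaredDiff = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
      double diff = a[i] - b[i];
      squaredDiff += diff * diff;
    }
    switch (similarity) {
      case DOT_PRODUCT:
        return dot;
      case COSINE:
        return dot / Math.sqrt(normA * normB);
      case EUCLIDEAN:
        return Math.sqrt(squaredDiff); // a distance: smaller means closer
      default:
        throw new AssertionError(similarity);
    }
  }

  public static void main(String[] args) {
    float[] query = parseVector(" [ 1 , 2 ,3 ] ");
    float[] doc = parseVector("[4,5,6]");
    System.out.println(Arrays.toString(query) + " vs " + Arrays.toString(doc));
    for (Similarity s : Similarity.values()) {
      System.out.println(s + " = " + compute(s, query, doc));
    }
  }
}
```

In the real function query the second argument could also be a field name resolved per document; this sketch only handles the constant-vector case.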
Tests
Unit tests for the vector parsing in FunctionQParser:
Integration tests: