[FEATURE] Support higher vector dimension limit for lucene #925
Comments
Hi @ankitvij7, for licensing reasons we cannot refer to Elastic PRs, so we cannot look at the one above. I'd prefer to just maintain consistency with Lucene. We are adding support for pre-filtering in faiss, by the way: #903
@jmazanec15 Thanks for getting back. Is there a plan to support pre-filtering in nmslib?
@ankitvij7 we don't have a plan for that, because nmslib currently doesn't have that feature.
@ankitvij7 we have pre-filter support in faiss as of version 2.9. Did you get a chance to try that?
Closing issue - no activity
Hey @jmazanec15, I am also interested in this one. Do you mind if I assign it to myself and take a stab at it?
Per our discussion on Slack, this change is probably no longer needed because Lucene recently merged a change that moves the max dimension limit to the codec. This allows users to customize the config without overriding it ad hoc via OpenSearch.
We can keep this thread open for now. Once Lucene is updated, we can incorporate the changes in the k-NN repo to support higher dimensions.
Is your feature request related to a problem?
Lucene is the only engine that supports pre-filtering; however, it has a vector dimension limit of 1024. This limits the use of Lucene with bigger models like the OpenAI text embedding models. I do see that support for pre-filtering was recently added to faiss, but not to nmslib.
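To make the problem concrete, here is a minimal sketch of how the limit blocks a larger embedding. It assumes the 1024 limit stated above and a 1536-dimensional vector (the size of one OpenAI text embedding model). The helper `build_knn_mapping`, the constant `LUCENE_MAX_DIMENSION`, and the field name `embedding` are hypothetical names used only for illustration; the plugin enforces the limit itself, and this only mirrors that check on the client side.

```python
# Limit for the Lucene engine as stated in this issue (an assumption taken
# from the text above, not read from the plugin).
LUCENE_MAX_DIMENSION = 1024


def build_knn_mapping(field, dimension, engine="lucene"):
    """Build an index body with one knn_vector field.

    Raises ValueError when the Lucene engine is requested with a dimension
    above the limit described in this issue.
    """
    if engine == "lucene" and dimension > LUCENE_MAX_DIMENSION:
        raise ValueError(
            f"dimension {dimension} exceeds the Lucene limit of "
            f"{LUCENE_MAX_DIMENSION}"
        )
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": engine,
                        "space_type": "l2",
                    },
                }
            }
        },
    }


# A 768-dimensional field fits within the limit.
small = build_knn_mapping("embedding", 768)

# A 1536-dimensional field is rejected for the Lucene engine.
try:
    build_knn_mapping("embedding", 1536)
except ValueError as err:
    print(err)
```

With `engine="faiss"` the same 1536-dimensional mapping is built without error in this sketch, which is why the engine's pre-filtering support matters here.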
What solution would you like?
Making the Lucene dimension limit configurable is being actively discussed in this Draft PR. However, Elastic overrides this limit to 2048 with this PR. Can we do something similar for OpenSearch?