[FEATURE] Support higher vector dimension limit for lucene #925

ankitvij7 · 2023-06-03T21:49:23Z

Is your feature request related to a problem?
Lucene is the only engine that supports pre-filtering; however, it has a vector dimension limit of 1024. This limits the use of Lucene with bigger models such as the OpenAI text embedding models. I do see that support for pre-filtering was recently added to faiss, but not to nmslib.
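
To make the problem concrete, here is a minimal sketch of an index body that would hit the limit. The field name `embedding` and both helper functions are hypothetical and only for illustration; 1536 is the output size of OpenAI's `text-embedding-ada-002`, and 1024 is the Lucene limit mentioned above. The mapping layout follows the k-NN plugin's `knn_vector` field type.

```python
# Lucene engine limit cited in this issue (assumed constant for illustration).
LUCENE_MAX_DIMENSION = 1024


def build_knn_mapping(field, dimension, engine="lucene"):
    """Build an index body with a single knn_vector field (hypothetical helper)."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "space_type": "l2",
                        "engine": engine,
                    },
                }
            }
        },
    }


def exceeds_lucene_limit(dimension, limit=LUCENE_MAX_DIMENSION):
    """Return True when the requested dimension is above the engine limit."""
    return dimension > limit


# 1536-dimensional vectors, as produced by text-embedding-ada-002.
body = build_knn_mapping("embedding", 1536)
print(exceeds_lucene_limit(1536))
```

With the 1024 limit in place, an index created from a body like this with the `lucene` engine would be expected to be rejected, which is why the same field currently has to use another engine.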

What solution would you like?
Making the Lucene dimension limit configurable is being actively discussed in this Draft PR. In the meantime, Elastic overrides this limit to 2048 with this PR. Can we do something similar for OpenSearch?

jmazanec15 · 2023-06-12T22:14:44Z

Hi @ankitvij7, for licensing reasons, we cannot refer to Elastic PRs, so we cannot look at the one above.

I'd prefer to just maintain consistency with Lucene.

We are adding support for pre-filtering in faiss, btw: #903

ankitvij7 · 2023-06-22T19:04:16Z

@jmazanec15 Thanks for getting back. Is there a plan to support pre-filtering in nmslib?

navneet1v · 2023-08-16T22:41:25Z

@ankitvij7 we don't have a plan for that. The reason is that nmslib currently doesn't have that feature.

vamshin · 2023-08-17T18:22:53Z

@ankitvij7 we have pre-filter support in faiss as of version 2.9. Did you get a chance to try that?

jmazanec15 · 2023-08-22T17:40:05Z

Closing issue - no activity

sam-herman · 2023-09-20T18:33:11Z

Hey @jmazanec15, I am also interested in this one. Do you mind if I assign it to myself and take a stab at it?

sam-herman · 2023-09-21T15:46:43Z

Per our discussion on Slack, it seems that this change is probably no longer needed because Lucene recently merged a change that moves the max dimension limit to the codec:
apache/lucene#12436

This allows users to customize the config without overriding it ad-hoc via OpenSearch.
I'm OK with closing this one, as it seems to be a no-op until OpenSearch migrates to a Lucene version that includes the change.
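
As a rough illustration of what "moving the limit to the codec" means, here is a toy Python model. It does not mirror Lucene's Java API; the class and function names are hypothetical, and the 1024 and 2048 values are just the numbers mentioned in this thread. The idea is that the format object answers "what is the max dimension for this field?", so a custom format can raise the limit without a global override.

```python
class VectorFormat:
    """Toy stand-in for a codec-level vector format (hypothetical)."""

    def max_dimensions(self, field_name):
        # Default limit, matching the 1024 value discussed in this issue.
        return 1024


class HighDimVectorFormat(VectorFormat):
    """Custom format that reports a higher, caller-chosen limit."""

    def __init__(self, limit):
        self._limit = limit

    def max_dimensions(self, field_name):
        return self._limit


def validate_dimension(fmt, field_name, dimension):
    """Check a field's dimension against the limit reported by its format."""
    limit = fmt.max_dimensions(field_name)
    if dimension > limit:
        raise ValueError(
            f"field {field_name!r}: dimension {dimension} exceeds limit {limit}"
        )
    return dimension


# Accepted by the custom format, rejected by the default one.
validate_dimension(HighDimVectorFormat(2048), "embedding", 1536)
```

In this model the caller only swaps the format object; the validation code itself stays unchanged.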

heemin32 · 2023-09-21T16:24:59Z

We can keep this thread open for now. Once Lucene is updated, we can incorporate the changes in the k-NN repo to support higher dimensions.

ankitvij7 added the enhancement and untriaged labels Jun 3, 2023

jmazanec15 removed the untriaged label Jun 12, 2023

jmazanec15 closed this as completed Aug 22, 2023

vamshin added this to Vector Search RoadMap Aug 22, 2023

github-project-automation bot moved this to Backlog in Vector Search RoadMap Aug 22, 2023

jmazanec15 reopened this Sep 20, 2023

github-project-automation bot moved this from Backlog to 2.10 (September 11th, 2023) in Vector Search RoadMap Sep 20, 2023

github-actions bot added the untriaged label Sep 20, 2023

jmazanec15 removed the untriaged label Sep 20, 2023

jmazanec15 assigned sam-herman Sep 20, 2023

navneet1v moved this from 2.10 (September 11th, 2023) to Backlog (Hot) in Vector Search RoadMap Oct 5, 2023

vamshin moved this from Backlog (Hot) to 2.12.0 in Vector Search RoadMap Oct 5, 2023

vamshin moved this from 2.12.0 to 2.13.0 in Vector Search RoadMap Nov 17, 2023

junqiu-lei assigned junqiu-lei and unassigned sam-herman Dec 8, 2023

junqiu-lei added the Enhancements label (Increases software capabilities beyond original client specifications) and removed the enhancement label Dec 12, 2023

junqiu-lei mentioned this issue Dec 12, 2023

Increase Lucene max dimension limit to 16,000 #1346

Merged

junqiu-lei closed this as completed in #1346 Dec 13, 2023

github-project-automation bot moved this from 2.13.0 to ✅ Done in Vector Search RoadMap Dec 13, 2023
