LUCENE-10318: Introduce HNSW merge from graph prototype #11719

Closed
jmazanec15 wants to merge 8 commits


jmazanec15
Contributor

Description

Prototype for initializing a new graph during merge from an existing graph. The graph selected for initialization must have 0 deleted docs and, among the eligible graphs, the largest number of vectors.
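
The selection rule above can be sketched as follows. This is only an illustration, not the code in this PR: `SegmentGraphInfo` and `selectInitializer` are hypothetical names, and the sketch assumes each segment exposes its vector count and deleted-doc count.

```java
import java.util.List;
import java.util.Optional;

final class InitializerSelection {
  /** Hypothetical per-segment summary; not a Lucene type. */
  static final class SegmentGraphInfo {
    final String name;
    final int vectorCount;
    final int deletedDocCount;

    SegmentGraphInfo(String name, int vectorCount, int deletedDocCount) {
      this.name = name;
      this.vectorCount = vectorCount;
      this.deletedDocCount = deletedDocCount;
    }
  }

  /** Returns the segment with 0 deleted docs and the most vectors, if one exists. */
  static Optional<SegmentGraphInfo> selectInitializer(List<SegmentGraphInfo> segments) {
    SegmentGraphInfo best = null;
    for (SegmentGraphInfo s : segments) {
      if (s.deletedDocCount != 0) {
        continue; // graphs with deletions are not eligible
      }
      if (best == null || s.vectorCount > best.vectorCount) {
        best = s;
      }
    }
    return Optional.ofNullable(best);
  }
}
```

If no segment is eligible, the sketch returns an empty `Optional`, in which case a merge would presumably fall back to building the graph from scratch.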

As next steps, I will run some experiments with this prototype to determine whether it improves indexing/merge speed. I'll post results in #11354.

Introduces a prototype for initializing a new graph during merge from an
existing graph. The graph selected for initialization must have 0 deleted
docs and, among the eligible graphs, the largest number of vectors.

To implement this, the capability to insert nodes into the graph out of
order had to be introduced. Additionally, scores for the initializer
graph had to be recomputed.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Avoids using binary search to find the correct position to insert a node
into the array that tracks the nodes present in a level in the common
case, which is when insertions arrive in order.

Signed-off-by: John Mazanec <jmazane@amazon.com>
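
The in-order fast path described in this commit can be illustrated with a small sorted-array sketch. `SortedNodeArray` is a hypothetical class written for this example, not the structure used in the PR: a node larger than the current last element is appended directly, and binary search is used only for out-of-order insertions.

```java
import java.util.Arrays;

final class SortedNodeArray {
  private int[] nodes = new int[4];
  private int size = 0;

  /** Inserts node in ascending order; returns false if it is already present. */
  boolean add(int node) {
    int pos;
    if (size == 0 || node > nodes[size - 1]) {
      pos = size; // fast path: in-order insertion is a plain append
    } else {
      pos = Arrays.binarySearch(nodes, 0, size, node);
      if (pos >= 0) {
        return false; // duplicate
      }
      pos = -pos - 1; // convert to insertion point
    }
    if (size == nodes.length) {
      nodes = Arrays.copyOf(nodes, size * 2);
    }
    // Shift the tail right by one slot (a no-op when appending).
    System.arraycopy(nodes, pos, nodes, pos + 1, size - pos);
    nodes[pos] = node;
    size++;
    return true;
  }

  int[] toArray() {
    return Arrays.copyOf(nodes, size);
  }
}
```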
Signed-off-by: John Mazanec <jmazane@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
Connects graph to node 0 on merge so that the graph is able to stay
connected during merge. This allows nodes outside of the initializer
graph to connect to the initializer graph.

Comments out the test asserting that graphs are equal when building with
merge versus without.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Refactors initialization from graph. Fixes the mapping from
initializerGraph ordinals to new graph ordinals. Also refactors
HNSWGraphBuilder.

Signed-off-by: John Mazanec <jmazane@amazon.com>
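
As a rough illustration of what an ordinal mapping involves, the sketch below copies adjacency lists from an initializer graph into a new graph, translating both the node and its neighbors. The `OrdinalRemap` class, the `oldToNew` array, and the plain `int[][]` graph representation are all assumptions made for this example; the PR's actual data structures are not shown here.

```java
final class OrdinalRemap {
  /**
   * Translates adjacency lists from initializer-graph ordinals to new-graph ordinals.
   * oldToNew[i] is the (assumed) new ordinal of initializer node i.
   */
  static int[][] remap(int[][] oldNeighbors, int[] oldToNew, int newSize) {
    int[][] result = new int[newSize][];
    for (int i = 0; i < newSize; i++) {
      result[i] = new int[0]; // nodes outside the initializer start unconnected
    }
    for (int oldOrd = 0; oldOrd < oldNeighbors.length; oldOrd++) {
      int[] src = oldNeighbors[oldOrd];
      int[] dst = new int[src.length];
      for (int j = 0; j < src.length; j++) {
        dst[j] = oldToNew[src[j]]; // translate each neighbor
      }
      result[oldToNew[oldOrd]] = dst; // store under the translated node
    }
    return result;
  }
}
```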
Further refactoring on tests and merge implementation.

Signed-off-by: John Mazanec <jmazane@amazon.com>
Uses the vectorsCopy in HnswGraphBuilder when recomputing scores of the
old graphs so that the ordering of neighbors is correct. Changes test to
assert graphs are equal when no skips are provided.

Signed-off-by: John Mazanec <jmazane@amazon.com>
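
To show why recomputed scores matter for neighbor ordering, here is a minimal sketch that rescores one node's neighbors against a vector array and sorts them. It assumes a dot-product similarity and a most-similar-first order purely for illustration; `NeighborRescore` and its methods are hypothetical and do not reflect the builder's real API or ordering convention.

```java
import java.util.Arrays;
import java.util.Comparator;

final class NeighborRescore {
  /** Dot product, used here as a stand-in similarity function. */
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Returns the neighbors of node sorted by descending similarity to node. */
  static int[] sortedNeighbors(int node, int[] neighbors, float[][] vectors) {
    Integer[] boxed = new Integer[neighbors.length];
    for (int i = 0; i < neighbors.length; i++) {
      boxed[i] = neighbors[i];
    }
    // Negate the score so that the most similar neighbor sorts first.
    Arrays.sort(boxed, Comparator.comparingDouble((Integer n) -> -dot(vectors[node], vectors[n])));
    int[] out = new int[boxed.length];
    for (int i = 0; i < boxed.length; i++) {
      out[i] = boxed[i];
    }
    return out;
  }
}
```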