Lazily compute similarity score when reuse the old HNSW graph #12236

zhaih · 2023-04-19T17:07:56Z

Description

In #12050 we added the ability to reuse the old graph as an initializer to speed up merges. However, even when we are re-inserting the old graph's nodes, we still calculate a similarity score for each neighbor so that the neighbor array stays sorted and we can pop the worst non-diverse node.
Since the score is only used for diversity checking, we probably do not need it at all in some cases (for example, when a node never reaches the connection limit for its level). So we could first insert those nodes without calculating the score; then, when we eventually need to pop the worst node, we calculate the scores, sort the neighbor array, and run the normal "find the worst node" procedure.
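
To make the idea concrete, here is a minimal sketch of a neighbor list that defers scoring until an eviction is actually needed. It is not Lucene's code: the class `LazyNeighbors`, its methods, and the `scorer` callback are hypothetical names chosen for illustration. It also simplifies the eviction step to "drop the lowest-scoring neighbor" with a linear scan, rather than sorting and running the diversity check described above.

```java
import java.util.function.IntToDoubleFunction;

/**
 * Hypothetical sketch: a bounded neighbor list that computes similarity
 * scores only when it must evict a neighbor. Not Lucene's actual API.
 */
final class LazyNeighbors {
    private final int maxConn;                 // connection limit for this level
    private final int[] nodes;                 // neighbor node ids
    private final float[] scores;              // valid only where scored[i] is true
    private final boolean[] scored;            // whether scores[i] has been computed
    private final IntToDoubleFunction scorer;  // neighbor id -> similarity to the owner node
    private int size;

    /** Counts scorer invocations so the saving can be observed. */
    int scoreComputations;

    LazyNeighbors(int maxConn, IntToDoubleFunction scorer) {
        this.maxConn = maxConn;
        this.scorer = scorer;
        // One spare slot: a new neighbor is appended before the worst one is evicted.
        this.nodes = new int[maxConn + 1];
        this.scores = new float[maxConn + 1];
        this.scored = new boolean[maxConn + 1];
    }

    /** Adds a neighbor without scoring it; scores are resolved only on overflow. */
    void add(int node) {
        nodes[size] = node;
        scored[size] = false;
        size++;
        if (size > maxConn) {
            evictWorst();
        }
    }

    /** Computes any missing scores, then removes the lowest-scoring neighbor. */
    private void evictWorst() {
        int worst = 0;
        for (int i = 0; i < size; i++) {
            if (!scored[i]) {
                scores[i] = (float) scorer.applyAsDouble(nodes[i]);
                scored[i] = true;
                scoreComputations++;
            }
            if (scores[i] < scores[worst]) {
                worst = i;
            }
        }
        // Fill the hole with the last entry; order is not maintained in this sketch.
        size--;
        nodes[worst] = nodes[size];
        scores[worst] = scores[size];
        scored[worst] = scored[size];
    }

    int size() {
        return size;
    }

    boolean contains(int node) {
        for (int i = 0; i < size; i++) {
            if (nodes[i] == node) {
                return true;
            }
        }
        return false;
    }
}
```

With a limit of 2, adding two neighbors triggers no scoring at all; only the third `add` forces the scores to be computed. A node that never exceeds its connection limit therefore never pays for a similarity computation, which is the saving this issue is after.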

zhaih added the type:enhancement label Apr 19, 2023

zhaih changed the title ~~Lazily compute similarity score when reuse the old graph~~ Lazily compute similarity score when reuse the old HNSW graph Apr 19, 2023

Jackyrie2 added a commit to Jackyrie2/lucene that referenced this issue Jun 14, 2023

[Draft] apache#12236 Lazily compute similarity score in HNSW initiali…

f5fd0c4

…zeFromGraph

Jackyrie2 mentioned this issue Jun 14, 2023

[Draft] #12236 Lazily compute similarity score #12371

Closed

zhaih linked a pull request Aug 1, 2023 that will close this issue

Enhancement 11236 lazy compute similarity score #12480

Merged

zhaih closed this as completed in #12480 Sep 1, 2023

zhaih added this to the 9.8.0 milestone Sep 20, 2023

github-project-automation bot added this to OpenSearch Lucene & Core Performance Tracking Sep 20, 2023

github-project-automation bot moved this to Open in OpenSearch Lucene & Core Performance Tracking Sep 20, 2023

alessandrobenedetti added the vector-based-search label May 20, 2024
