Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: one-pass IVF_PQ accelerated builds #3001

Merged
merged 6 commits into from
Oct 14, 2024
Merged

Conversation

jacketsj
Copy link
Contributor

This feature improves disk IO dependence, but it is quite limited. This only works if the index type is IVF_PQ, and it will not work efficiently for local PQ in the future (unless we store all the PQ models in VRAM).
Importantly, this allows us to bypass local temp storage for storing residuals. However, this still stores PQ codes locally temporarily due to how we've implemented accelerator support, but these are much smaller (exact ratio depends on params).

I tested on my local machine, which is sufficiently fast that the accelerated builds are IO limited (but IO is also fast). I used wikipedia-40M

New feature disabled:
results_static_20241011_224535_plot_dataset_wikipedia-few-queries_k_10
ivf training time: 52s
ivf transform time: 89s
pq training time: 18s
pq assignment time: 143s
create_index rust time: 8.9s

New feature enabled:
results_static_20241011_203303_plot_dataset_wikipedia-few-queries_k_10
combined training time: 63.7s (not actually sure why this is faster, but it's not the big part anyway)
combined transform time: 158.8s
create_index rust time: 8.6s

Improvement should be more noticeable for bigger datasets, as usual.

@github-actions github-actions bot added enhancement New feature or request python labels Oct 12, 2024
@jacketsj jacketsj marked this pull request as ready for review October 12, 2024 06:15
@@ -1448,6 +1448,7 @@ def create_index(
precomputed_partition_dataset: Optional[str] = None,
storage_options: Optional[Dict[str, str]] = None,
filter_nan: bool = True,
one_pass_ivfpq: bool = False,
Copy link
Contributor

@chebbyChefNEQ chebbyChefNEQ Oct 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just make one pass the only choice for IVFPQ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather we don't, since it is incompatible with local pq. I'm not strongly against making it default though.


# Handle timing for various parts of accelerated builds
timers = {}
if one_pass_ivfpq and accelerator is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's extract this routine to a separate function?

Copy link
Contributor Author

@jacketsj jacketsj Oct 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm staring at this right now and imo this is gonna become an even bigger mess than it already is if I do that. Unfortunately, the way this function is set up right now, it uses a large number of variables to hold state. So if we separated it out, we'd be passing some enormous list of params, and returning a giant tuple of 7 things (if we include the timers, which would also have to be combined to be clean). This whole function should probably be refactored at some point once we finish up adding some of the related features.

Copy link
Contributor

@chebbyChefNEQ chebbyChefNEQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, let's add a test before I approve?

Just call with accelerator=torch.device("cpu") and check recall?

Copy link
Contributor

@chebbyChefNEQ chebbyChefNEQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you test in-sample recall too? we have a util function for this https://github.com/lancedb/lance/blob/main/python/python/lance/util.py#L174

@jacketsj jacketsj merged commit d207aa8 into main Oct 14, 2024
14 checks passed
@jacketsj jacketsj deleted the jack/one-pass-ivfpq-accel branch October 14, 2024 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants