perf: use total cmp for ordered float #3059

Merged
merged 1 commit into from
Oct 29, 2024

@chebbyChefNEQ chebbyChefNEQ commented Oct 29, 2024

TL;DR: the PartialOrd impl for f32 requires two cmp instructions, and PartialOrd::lt is not reduced to a single instruction. Using total_cmp avoids this overhead. See https://godbolt.org/z/ez4EaEs3W
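
For readers unfamiliar with the pattern, here is a minimal sketch of an ordered-float wrapper whose `Ord` impl delegates to `f32::total_cmp` (stable since Rust 1.62). The names `OrderedF32` and `k_smallest` are assumptions made for illustration and are not taken from this PR's diff.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical wrapper (name is an assumption, not from the PR).
#[derive(Debug, Clone, Copy)]
pub struct OrderedF32(pub f32);

impl Ord for OrderedF32 {
    fn cmp(&self, other: &Self) -> Ordering {
        // total_cmp is a total order over all f32 bit patterns,
        // so no Option or NaN fallback branch is needed here.
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for OrderedF32 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for OrderedF32 {
    fn eq(&self, other: &Self) -> bool {
        // Kept consistent with Ord: -0.0 and +0.0 are NOT equal here.
        self.cmp(other) == Ordering::Equal
    }
}

impl Eq for OrderedF32 {}

/// Hypothetical helper: returns the k smallest values in ascending order,
/// using a bounded max-heap keyed on OrderedF32.
pub fn k_smallest(distances: &[f32], k: usize) -> Vec<f32> {
    let mut heap = BinaryHeap::with_capacity(k.saturating_add(1));
    for &d in distances {
        heap.push(OrderedF32(d));
        if heap.len() > k {
            // Drop the current largest so only the k smallest remain.
            heap.pop();
        }
    }
    heap.into_sorted_vec().into_iter().map(|x| x.0).collect()
}
```

Note that `total_cmp` is not a drop-in replacement semantically: it orders -0.0 before +0.0 and gives NaN values a defined position, whereas `partial_cmp` treats the two zeros as equal and returns `None` for NaN.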

Note: we might gain more performance by implementing a comparison that assumes no NaN values.
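
One possible shape for such a NaN-less comparison, sketched under the assumption that NaN is rejected once at construction so the comparison itself never has to handle it. `NonNanF32` is a hypothetical type, and whether this actually compiles to fewer instructions than `total_cmp` is untested here and would need to be measured.

```rust
use std::cmp::Ordering;

/// Hypothetical NaN-free wrapper (an assumption, not part of this PR).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NonNanF32(f32);

impl NonNanF32 {
    /// Returns None for NaN so every stored value is comparable.
    pub fn new(value: f32) -> Option<Self> {
        if value.is_nan() {
            None
        } else {
            Some(NonNanF32(value))
        }
    }

    pub fn get(self) -> f32 {
        self.0
    }
}

// Sound because NaN, the only non-reflexive f32 value, is excluded.
impl Eq for NonNanF32 {}

impl Ord for NonNanF32 {
    #[inline]
    fn cmp(&self, other: &Self) -> Ordering {
        // With NaN excluded, exactly one of <, >, == holds.
        if self.0 < other.0 {
            Ordering::Less
        } else if self.0 > other.0 {
            Ordering::Greater
        } else {
            Ordering::Equal
        }
    }
}

impl PartialOrd for NonNanF32 {
    #[inline]
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
```

Unlike `total_cmp`, this sketch treats -0.0 and +0.0 as equal, which matches ordinary float comparison.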

V1

Running query with nprobes: 1, refine_factor: 0
Time: 652.877µs

Running query with nprobes: 10, refine_factor: 0
Time: 818.418µs

Running query with nprobes: 50, refine_factor: 0
Time: 1.406433ms

Running query with nprobes: 100, refine_factor: 0
Time: 1.747243ms

Running query with nprobes: 512, refine_factor: 0
Time: 5.754752ms

V3 before perf fix

Running query with nprobes: 1, refine_factor: 0
Time: 752.659µs

Running query with nprobes: 10, refine_factor: 0
Time: 2.379638ms

Running query with nprobes: 50, refine_factor: 0
Time: 9.518057ms

Running query with nprobes: 100, refine_factor: 0
Time: 17.623498ms

Running query with nprobes: 512, refine_factor: 0
Time: 39.43036ms

V3 after perf fix

Running query with nprobes: 1, refine_factor: 0
Time: 644.933µs

Running query with nprobes: 10, refine_factor: 0
Time: 1.512715ms

Running query with nprobes: 50, refine_factor: 0
Time: 5.974101ms

Running query with nprobes: 100, refine_factor: 0
Time: 9.848213ms

Running query with nprobes: 512, refine_factor: 0
Time: 23.058848ms
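
To make the before/after comparison for V3 easier to read, the ratios can be computed directly from the timings above. This is only an arithmetic sketch; the constant and function names are made up for illustration, and the timings are copied from the output in this description (converted to microseconds).

```rust
/// Query times reported above, in microseconds:
/// (nprobes, V3 before perf fix, V3 after perf fix).
const V3_TIMES_US: [(u32, f64, f64); 5] = [
    (1, 752.659, 644.933),
    (10, 2379.638, 1512.715),
    (50, 9518.057, 5974.101),
    (100, 17623.498, 9848.213),
    (512, 39430.36, 23058.848),
];

/// Hypothetical helper: returns (nprobes, before / after) for each row.
pub fn speedups() -> Vec<(u32, f64)> {
    V3_TIMES_US
        .iter()
        .map(|&(nprobes, before, after)| (nprobes, before / after))
        .collect()
}
```

Iterating over `speedups()` and printing each pair gives ratios from about 1.2x at nprobes 1 to about 1.8x at nprobes 100.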

@chebbyChefNEQ chebbyChefNEQ merged commit 411568f into main Oct 29, 2024
29 checks passed
@chebbyChefNEQ chebbyChefNEQ deleted the rmeng/float-perf branch October 29, 2024 03:13
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.58%. Comparing base (4812ac0) to head (4f96145).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3059   +/-   ##
=======================================
  Coverage   77.57%   77.58%           
=======================================
  Files         240      240           
  Lines       78636    78636           
  Branches    78636    78636           
=======================================
+ Hits        61005    61012    +7     
- Misses      14496    14498    +2     
+ Partials     3135     3126    -9     
Flag Coverage Δ
unittests 77.58% <100.00%> (+<0.01%) ⬆️

