-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement feature sort_unstable #40601
Conversation
/// ``` | ||
#[stable(feature = "rust1", since = "1.0.0")] | ||
#[inline] | ||
pub fn sort_by<F>(&mut self, mut compare: F) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not added or anything. It was just reordered with sort_by_key
.
Previous order: sort
, sort_by_key
, sort_by
.
Current order: sort
, sort_by
, sort_by_key
.
/// ``` | ||
/// | ||
/// [pdqsort]: https://github.com/orlp/pdqsort | ||
// FIXME #40585: Mention `sort_unstable` in the documentation for `sort`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME because we shouldn't mention unstable functions in the documentation for stable functions.... right?
I assume we'll fix this as soon as the feature gets stabilized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems reasonable to me, punting on this for now until the function is stabilized.
// to perform better than merge sort. For bigger types `T`, the threshold is smaller. | ||
// | ||
// Short runs are extended using insertion sort to span at least `min_run` elements, in order | ||
// to improve performance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe it's worthwhile differentiating between big and small T
s - it doesn't have a large effect.
Instead, it's better to focus on some middle ground that works well all around: big/small T
s and expensive/cheap comparison functions as well. I took a look at a std::stable_sort
implementation in C++ and it has similar constants.
☔ The latest upstream changes (presumably #40598) made this pull request unmergeable. Please resolve the merge conflicts. |
I have troubles making Travis tests pass - the problem is in The compiler warns about unused future, and warnings are configured to be denied. Whatever I do, the compiler complains one way or another... Any idea how to fix that? :/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome implementation, thanks for the PR!
I didn't review the sorting itself in too close detail, but everything I looked at seemed quite solid.
src/libcollections/lib.rs
Outdated
@@ -52,6 +52,7 @@ | |||
#![feature(shared)] | |||
#![feature(slice_get_slice)] | |||
#![feature(slice_patterns)] | |||
#![feature(sort_unstable)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Travis is failing on this line during tests: https://travis-ci.org/rust-lang/rust/jobs/212204751
/// ``` | ||
/// | ||
/// [pdqsort]: https://github.com/orlp/pdqsort | ||
// FIXME #40585: Mention `sort_unstable` in the documentation for `sort`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that seems reasonable to me, punting on this for now until the function is stabilized.
src/libcollections/slice.rs
Outdated
/// # Examples | ||
/// | ||
/// ``` | ||
/// let mut v = [-5, 4, 1, -3, 2]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll need to add #![feature]
to this doctest to get past bors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That worked, thanks!
src/libcore/slice/sort.rs
Outdated
//! Unstable sorting is compatible with libcore because it doesn't allocate memory, unlike our | ||
//! stable sorting implementation. | ||
|
||
#![unstable(feature = "sort_unstable", issue = "40585")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be ok to omit this annotation I believe as it's a private module anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Oh your fix for |
Not long ago we discussed what guarantees |
I managed to close the gap between stable and unstable sorting on ascending & descending inputs (see the new benchmarks). :) One more thing: pdqsort falls back to heapsort after several imbalanced partitions. I'm pretty sure we never hit this case in our tests, so it'd be wise to write tests for heapsort ( |
☔ The latest upstream changes (presumably #40538) made this pull request unmergeable. Please resolve the merge conflicts. |
@stjepang Perhaps you can trigger the worst case using a modification of the technique described in the 1999 paper "A Killer Adversary for Quicksort" by M. D. McIlroy. I used it a couple years ago to prove that libc++'s sort routine is quadratic. |
Also, for those interested in a deeper explanation in the how/why of pattern-defeating quicksort, a (nearly finished) draft of my paper is available here: https://drive.google.com/file/d/0B1-vl-dPgKm_T0Fxeno1a0lGT0E/view There are some subtle differences between my C++ implementation and @stjepang's version due to design goals, so some parts might not apply, but most of it does. |
@stjepang I guess the new pub(...) syntax doesn't help here since code/test is in different crates? |
@arthurprs No, doesn't help unfortunately. They are totally separate crates. However, I did a workaround @alexcrichton suggested to me over IRC - expose the heapsort function publicly, but gate it through a dummy feature. Can't say I feel good about this hack, but apparently there's no better option at the moment. |
@bors: r+ Yeah perhaps one day we'll be able to move tests back into core (maybe selectively this time) but for now this is functionally the same (an internal function to the library) |
📌 Commit 038d605 has been approved by |
🔒 Merge conflict |
@bors: r+ |
📌 Commit a718051 has been approved by |
Implement feature sort_unstable Tracking issue for the feature: #40585 This is essentially integration of [pdqsort](https://github.com/stjepang/pdqsort) into libcore. There's plenty of unsafe blocks to review. The heart of pdqsort is `fn partition_in_blocks` and is probably the most challenging function to understand. It requires some patience, but let me know if you find it too difficult - comments could always be improved. #### Changes * Added `sort_unstable` feature. * Tweaked insertion sort constants for stable sort. Sorting integers is now up to 5% slower, but sorting big elements is much faster (in particular, `sort_large_big_random` is 35% faster). The old constants were highly optimized for sorting integers, so overall the configuration is more balanced now. A minor regression in case of integers is forgivable as we recently had performance improvements (#39538) that completely make up for it. * Removed some uninteresting sort benchmarks. * Added a new sort benchmark for string sorting. #### Benchmarks The following table compares stable and unstable sorting: ``` name stable ns/iter unstable ns/iter diff ns/iter diff % slice::sort_large_ascending 7,240 (11049 MB/s) 7,380 (10840 MB/s) 140 1.93% slice::sort_large_big_random 1,454,138 (880 MB/s) 910,269 (1406 MB/s) -543,869 -37.40% slice::sort_large_descending 13,450 (5947 MB/s) 10,895 (7342 MB/s) -2,555 -19.00% slice::sort_large_mostly_ascending 204,041 (392 MB/s) 88,639 (902 MB/s) -115,402 -56.56% slice::sort_large_mostly_descending 217,109 (368 MB/s) 99,009 (808 MB/s) -118,100 -54.40% slice::sort_large_random 477,257 (167 MB/s) 346,028 (231 MB/s) -131,229 -27.50% slice::sort_large_random_expensive 21,670,537 (3 MB/s) 22,710,238 (3 MB/s) 1,039,701 4.80% slice::sort_large_strings 6,284,499 (38 MB/s) 6,410,896 (37 MB/s) 126,397 2.01% slice::sort_medium_random 3,515 (227 MB/s) 3,327 (240 MB/s) -188 -5.35% slice::sort_small_ascending 42 (1904 MB/s) 41 (1951 MB/s) -1 -2.38% slice::sort_small_big_random 503 (2544 MB/s) 514 (2490 MB/s) 11 2.19% slice::sort_small_descending 72 (1111 MB/s) 69 (1159 MB/s) -3 -4.17% slice::sort_small_random 369 (216 MB/s) 367 (217 MB/s) -2 -0.54% ``` Interesting cases: * Expensive comparison function and string sorting - it's a really close race, but timsort performs a slightly smaller number of comparisons. This is a natural difference of bottom-up merging versus top-down partitioning. * `large_descending` - unstable sort is faster, but both sorts should have equivalent performance. Both just check whether the slice is descending and if so, they reverse it. I blame LLVM for the discrepancy. r? @alexcrichton
☀️ Test successful - status-appveyor, status-travis |
libcore: fix compilation on 16bit target (MSP430). Since PR rust-lang#40601 has been merged, libcore no longer compiles on MSP430. The reason is this code in `break_patterns`: ```rust let mut random = len; random ^= random << 13; random ^= random >> 17; random ^= random << 5; random &= modulus - 1; ``` It assumes that `len` is at least a 32 bit integer. As a workaround replace `break_patterns` with an empty function for 16bit targets. cc @stjepang cc @alexcrichton
libcore: fix compilation on 16bit target (MSP430). Since PR rust-lang#40601 has been merged, libcore no longer compiles on MSP430. The reason is this code in `break_patterns`: ```rust let mut random = len; random ^= random << 13; random ^= random >> 17; random ^= random << 5; random &= modulus - 1; ``` It assumes that `len` is at least a 32 bit integer. As a workaround replace `break_patterns` with an empty function for 16bit targets. cc @stjepang cc @alexcrichton
Prefer sort_unstable*() over sort*() Since `sort_unstable` is considered typically faster than the regular `sort` ([benchmarks](#40601 (comment))), it might be a good idea to check for potential wins in the compiler.
Tracking issue for the feature: #40585
This is essentially integration of pdqsort into libcore.
There's plenty of unsafe blocks to review. The heart of pdqsort is
fn partition_in_blocks
and is probably the most challenging function to understand. It requires some patience, but let me know if you find it too difficult - comments could always be improved.Changes
sort_unstable
feature.sort_large_big_random
is 35% faster). The old constants were highly optimized for sorting integers, so overall the configuration is more balanced now. A minor regression in case of integers is forgivable as we recently had performance improvements (Slightly optimize slice::sort #39538) that completely make up for it.Benchmarks
The following table compares stable and unstable sorting:
Interesting cases:
large_descending
- unstable sort is slightly faster. Both just check whether the slice is descending and if so, they reverse it. I believe the slight difference in performance is because unstable sort traces the slice forwards, while stable sort goes backwards.r? @alexcrichton