Replace coco with crossbeam-deque #480

Closed

@ghost ghost commented Nov 26, 2017

Coco is now deprecated and I don't intend to maintain it anymore. You should use Crossbeam instead.

This PR switches the dependency from coco to crossbeam-deque. We still haven't re-exported the deque into the main crossbeam crate, but you can start using crossbeam-deque right now.
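For readers unfamiliar with the data structure being swapped here, the following is a minimal sketch of the work-stealing deque idea: the owning thread pushes and pops at one end, while other threads steal from the opposite end. `ToyDeque`, `stealer`, and `demo` are names made up for this illustration. The sketch guards a `VecDeque` with a `Mutex` for simplicity, so it is not how coco or crossbeam-deque are implemented, and it does not reproduce their API.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy work-stealing deque (hypothetical; a Mutex-based stand-in,
/// not the coco or crossbeam-deque implementation).
struct ToyDeque<T> {
    inner: Arc<Mutex<VecDeque<T>>>,
}

impl<T> ToyDeque<T> {
    fn new() -> Self {
        ToyDeque { inner: Arc::new(Mutex::new(VecDeque::new())) }
    }

    /// Create another handle to the same queue, to hand to a thief thread.
    fn stealer(&self) -> Self {
        ToyDeque { inner: Arc::clone(&self.inner) }
    }

    /// Owner side: push a task onto the bottom of the deque.
    fn push(&self, task: T) {
        self.inner.lock().unwrap().push_back(task);
    }

    /// Owner side: pop the most recently pushed task (LIFO order).
    fn pop(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_back()
    }

    /// Thief side: take the oldest task from the top (FIFO order).
    fn steal(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_front()
    }
}

/// Push tasks 1, 2, 3, let another thread steal one, then pop locally.
fn demo() -> (Option<u32>, Option<u32>) {
    let deque = ToyDeque::new();
    for task in 1..=3 {
        deque.push(task);
    }
    let thief = deque.stealer();
    // The thief finishes before the owner pops, so the result is deterministic.
    let stolen = thread::spawn(move || thief.steal()).join().unwrap();
    let popped = deque.pop();
    (stolen, popped)
}
```

In this sketch the thief receives the oldest task and the owner the newest one, which is the access pattern a scheduler like Rayon's relies on.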

Benchmarks:

name                                                                before ns/iter         after ns/iter          diff ns/iter   diff %
factorial::factorial_iterator                                       24,802,712             24,427,893                 -374,819   -1.51%
factorial::factorial_join                                           10,636,501             10,459,655                 -176,846   -1.66%
factorial::factorial_par_iter                                       10,243,680             10,196,058                  -47,622   -0.46%
factorial::factorial_recursion                                      12,202,069             11,909,615                 -292,454   -2.40%
fibonacci::fibonacci_iterative                                      21                     21                                0    0.00%
fibonacci::fibonacci_join_1_2                                       60,217,053             58,844,783               -1,372,270   -2.28%
fibonacci::fibonacci_join_2_1                                       58,738,255             58,375,573                 -362,682   -0.62%
fibonacci::fibonacci_recursive                                      13,976,426             13,532,543                 -443,883   -3.18%
fibonacci::fibonacci_split_iterative                                24,218                 24,097                         -121   -0.50%
fibonacci::fibonacci_split_recursive                                5,582,896              6,771,896                 1,189,000   21.30%
find::size1::parallel_find_common                                   8,543                  8,466                           -77   -0.90%
find::size1::parallel_find_first                                    5,095                  4,828                          -267   -5.24%
find::size1::parallel_find_last                                     4,232,507              4,211,524                   -20,983   -0.50%
find::size1::parallel_find_middle                                   2,827,812              2,814,623                   -13,189   -0.47%
find::size1::parallel_find_missing                                  4,267,884              4,261,439                    -6,445   -0.15%
find::size1::serial_find_common                                     3,858                  3,811                           -47   -1.22%
find::size1::serial_find_first                                      1                      1                                 0    0.00%
find::size1::serial_find_last                                       4,258,009              4,251,158                    -6,851   -0.16%
find::size1::serial_find_middle                                     2,840,747              2,829,840                   -10,907   -0.38%
find::size1::serial_find_missing                                    4,259,732              4,256,293                    -3,439   -0.08%
join_microbench::increment_all                                      38,938                 39,263                          325    0.83%
join_microbench::increment_all_atomized                             2,690,370              2,791,023                   100,653    3.74%
join_microbench::increment_all_max                                  84,234                 86,980                        2,746    3.26%
join_microbench::increment_all_min                                  31,055                 31,073                           18    0.06%
join_microbench::increment_all_serialized                           40,463                 39,959                         -504   -1.25%
join_microbench::join_recursively                                   971,703                1,068,777                    97,074    9.99%
life::bench::generations                                            137,481,702            136,128,634              -1,353,068   -0.98%
life::bench::parallel_generations                                   56,883,058             56,711,525                 -171,533   -0.30%
map_collect::i_mod_10_to_i::with_collect                            7,930,087              7,940,144                    10,057    0.13%
map_collect::i_mod_10_to_i::with_fold                               3,827,093              3,720,073                  -107,020   -2.80%
map_collect::i_mod_10_to_i::with_fold_vec                           4,270,208              3,990,564                  -279,644   -6.55%
map_collect::i_mod_10_to_i::with_linked_list_collect                16,418,905             16,171,549                 -247,356   -1.51%
map_collect::i_mod_10_to_i::with_linked_list_collect_vec            6,861,379              6,871,669                    10,290    0.15%
map_collect::i_mod_10_to_i::with_linked_list_collect_vec_sized      8,433,234              8,369,911                   -63,323   -0.75%
map_collect::i_mod_10_to_i::with_linked_list_map_reduce_vec_sized   7,922,071              7,932,609                    10,538    0.13%
map_collect::i_mod_10_to_i::with_mutex                              66,231,890             63,167,656               -3,064,234   -4.63%
map_collect::i_mod_10_to_i::with_mutex_vec                          9,807,928              9,435,930                  -371,998   -3.79%
map_collect::i_mod_10_to_i::with_vec_vec_sized                      8,002,014              7,975,449                   -26,565   -0.33%
map_collect::i_to_i::with_collect                                   23,431,191             23,104,792                 -326,399   -1.39%
map_collect::i_to_i::with_fold                                      70,726,359             70,917,388                  191,029    0.27%
map_collect::i_to_i::with_fold_vec                                  70,012,099             71,522,437                1,510,338    2.16%
map_collect::i_to_i::with_linked_list_collect                       33,158,066             33,444,856                  286,790    0.86%
map_collect::i_to_i::with_linked_list_collect_vec                   34,426,809             34,209,519                 -217,290   -0.63%
map_collect::i_to_i::with_linked_list_collect_vec_sized             23,890,570             23,772,744                 -117,826   -0.49%
map_collect::i_to_i::with_linked_list_map_reduce_vec_sized          23,226,376             23,244,541                   18,165    0.08%
map_collect::i_to_i::with_mutex                                     106,359,671            105,669,074                -690,597   -0.65%
map_collect::i_to_i::with_mutex_vec                                 46,162,219             45,473,755                 -688,464   -1.49%
map_collect::i_to_i::with_vec_vec_sized                             23,376,598             23,320,305                  -56,293   -0.24%
matmul::bench::bench_matmul_strassen                                7,263,017              7,302,339                    39,322    0.54%
mergesort::bench::merge_sort_par_bench                              9,946,218              9,980,256                    34,038    0.34%
mergesort::bench::merge_sort_seq_bench                              29,151,116             28,918,180                 -232,936   -0.80%
nbody::bench::nbody_par                                             13,329,584             13,264,579                  -65,005   -0.49%
nbody::bench::nbody_parreduce                                       17,687,703             17,342,219                 -345,484   -1.95%
nbody::bench::nbody_seq                                             27,689,546             27,238,169                 -451,377   -1.63%
pythagoras::euclid_faux_serial                                      37,292,303             36,833,565                 -458,738   -1.23%
pythagoras::euclid_parallel_full                                    68,520,504             74,284,420                5,763,916    8.41%
pythagoras::euclid_parallel_one                                     12,365,070             12,456,163                   91,093    0.74%
pythagoras::euclid_parallel_outer                                   14,338,422             12,212,634               -2,125,788  -14.83%
pythagoras::euclid_parallel_weightless                              14,391,746             12,486,010               -1,905,736  -13.24%
pythagoras::euclid_serial                                           32,313,978             30,744,513               -1,569,465   -4.86%
quicksort::bench::quick_sort_par_bench                              19,451,813             19,450,258                   -1,555   -0.01%
quicksort::bench::quick_sort_seq_bench                              44,328,073             43,814,982                 -513,091   -1.16%
quicksort::bench::quick_sort_splitter                               20,151,025             20,040,111                 -110,914   -0.55%
sieve::bench::sieve_chunks                                          10,696,847             10,642,678                  -54,169   -0.51%
sieve::bench::sieve_parallel                                        6,584,548              6,535,757                   -48,791   -0.74%
sieve::bench::sieve_serial                                          24,724,305             25,750,463                1,026,158    4.15%
sort::demo_merge_sort_ascending                                     185,286 (2158 MB/s)    187,414 (2134 MB/s)           2,128    1.15%
sort::demo_merge_sort_big                                           11,542,110 (554 MB/s)  11,688,242 (547 MB/s)       146,132    1.27%
sort::demo_merge_sort_descending                                    181,821 (2199 MB/s)    181,970 (2198 MB/s)             149    0.08%
sort::demo_merge_sort_mostly_ascending                              430,602 (928 MB/s)     429,643 (931 MB/s)             -959   -0.22%
sort::demo_merge_sort_mostly_descending                             444,437 (900 MB/s)     453,418 (882 MB/s)            8,981    2.02%
sort::demo_merge_sort_random                                        1,525,834 (262 MB/s)   1,492,009 (268 MB/s)        -33,825   -2.22%
sort::demo_merge_sort_strings                                       5,594,023 (143 MB/s)   5,738,056 (139 MB/s)        144,033    2.57%
sort::demo_quick_sort_big                                           8,686,314 (736 MB/s)   8,880,504 (720 MB/s)        194,190    2.24%
sort::demo_quick_sort_mostly_ascending                              16,092,594 (24 MB/s)   16,272,324 (24 MB/s)        179,730    1.12%
sort::demo_quick_sort_mostly_descending                             13,977,534 (28 MB/s)   14,164,738 (28 MB/s)        187,204    1.34%
sort::demo_quick_sort_random                                        1,735,842 (230 MB/s)   1,737,633 (230 MB/s)          1,791    0.10%
sort::demo_quick_sort_strings                                       7,883,157 (101 MB/s)   7,950,810 (100 MB/s)         67,653    0.86%
sort::par_sort_ascending                                            58,045 (6891 MB/s)     59,315 (6743 MB/s)            1,270    2.19%
sort::par_sort_big                                                  13,643,289 (469 MB/s)  13,735,840 (465 MB/s)        92,551    0.68%
sort::par_sort_descending                                           100,980 (3961 MB/s)    102,940 (3885 MB/s)           1,960    1.94%
sort::par_sort_expensive                                            65,216,966 (6 MB/s)    67,760,014 (5 MB/s)       2,543,048    3.90%
sort::par_sort_mostly_ascending                                     533,641 (749 MB/s)     545,684 (733 MB/s)           12,043    2.26%
sort::par_sort_mostly_descending                                    568,791 (703 MB/s)     576,828 (693 MB/s)            8,037    1.41%
sort::par_sort_random                                               1,321,249 (302 MB/s)   1,327,670 (301 MB/s)          6,421    0.49%
sort::par_sort_strings                                              5,178,110 (154 MB/s)   5,268,932 (151 MB/s)         90,822    1.75%
sort::par_sort_unstable_ascending                                   50,597 (7905 MB/s)     49,305 (8112 MB/s)           -1,292   -2.55%
sort::par_sort_unstable_big                                         7,252,490 (882 MB/s)   7,275,712 (879 MB/s)         23,222    0.32%
sort::par_sort_unstable_descending                                  70,803 (5649 MB/s)     70,910 (5640 MB/s)              107    0.15%
sort::par_sort_unstable_expensive                                   77,851,362 (5 MB/s)    74,893,160 (5 MB/s)      -2,958,202   -3.80%
sort::par_sort_unstable_mostly_ascending                            324,132 (1234 MB/s)    328,995 (1215 MB/s)           4,863    1.50%
sort::par_sort_unstable_mostly_descending                           337,748 (1184 MB/s)    343,293 (1165 MB/s)           5,545    1.64%
sort::par_sort_unstable_random                                      742,906 (538 MB/s)     749,185 (533 MB/s)            6,279    0.85%
sort::par_sort_unstable_strings                                     5,244,322 (152 MB/s)   5,340,895 (149 MB/s)         96,573    1.84%
str_split::parallel_space_char                                      925,504                958,081                      32,577    3.52%
str_split::parallel_space_fn                                        927,486                937,274                       9,788    1.06%
str_split::serial_space_char                                        2,197,436              2,246,500                    49,064    2.23%
str_split::serial_space_fn                                          2,213,163              2,293,431                    80,268    3.63%
str_split::serial_space_str                                         3,243,506              3,076,083                  -167,423   -5.16%
tsp::bench::dj10                                                    14,272,603             14,465,095                  192,492    1.35%
vec_collect::vec_i::with_collect                                    3,886,235              3,857,576                   -28,659   -0.74%
vec_collect::vec_i::with_collect_into                               3,868,265              3,901,340                    33,075    0.86%
vec_collect::vec_i::with_collect_into_reused                        2,971,978              2,763,152                  -208,826   -7.03%
vec_collect::vec_i::with_fold                                       51,536,456             51,691,862                  155,406    0.30%
vec_collect::vec_i::with_linked_list_collect_vec                    38,760,202             38,649,487                 -110,715   -0.29%
vec_collect::vec_i::with_linked_list_collect_vec_sized              34,325,339             34,193,745                 -131,594   -0.38%
vec_collect::vec_i::with_linked_list_map_reduce_vec_sized           24,855,839             25,013,004                  157,165    0.63%
vec_collect::vec_i::with_vec_vec_sized                              24,839,590             25,091,431                  251,841    1.01%
vec_collect::vec_i_filtered::with_collect                           24,828,206             25,088,751                  260,545    1.05%
vec_collect::vec_i_filtered::with_fold                              52,184,645             50,886,762               -1,297,883   -2.49%
vec_collect::vec_i_filtered::with_linked_list_collect_vec           38,090,559             38,158,751                   68,192    0.18%
vec_collect::vec_i_filtered::with_linked_list_collect_vec_sized     33,969,414             33,543,078                 -426,336   -1.26%
vec_collect::vec_i_filtered::with_linked_list_map_reduce_vec_sized  24,860,997             24,867,880                    6,883    0.03%
vec_collect::vec_i_filtered::with_vec_vec_sized                     24,946,638             25,009,585                   62,947    0.25%

There are some wins and some losses, but overall the performance is basically the same. The implementation of crossbeam-deque (and crossbeam-epoch) is largely based on coco, so this isn't surprising.
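As a reading aid for the table, the two diff columns appear to be "after minus before" and that difference relative to "before". The sketch below (with a hypothetical helper name, `bench_diff`) reproduces the figures for the `fibonacci::fibonacci_split_recursive` row under that assumption.

```rust
/// Hypothetical helper: returns (diff in ns/iter, diff as a percentage of `before_ns`).
fn bench_diff(before_ns: u64, after_ns: u64) -> (i64, f64) {
    // A positive diff means the "after" run was slower.
    let diff = after_ns as i64 - before_ns as i64;
    let percent = diff as f64 / before_ns as f64 * 100.0;
    (diff, percent)
}

/// Format the percentage the way the table does (two decimals).
fn format_percent(percent: f64) -> String {
    format!("{:.2}%", percent)
}

/// Recompute the fibonacci_split_recursive row: 5,582,896 -> 6,771,896 ns/iter.
fn split_recursive_row() -> (i64, String) {
    let (diff, percent) = bench_diff(5_582_896, 6_771_896);
    (diff, format_percent(percent))
}
```

The same arithmetic applied to `join_microbench::join_recursively` and `pythagoras::euclid_parallel_outer` matches their rows as well, so a negative percentage means the crossbeam-deque build was faster.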

Next year we'll most probably switch from epoch-based to hazard pointer-based memory reclamation, which should give us stricter guarantees on garbage collection and performance improvements (which will be visible in Rayon's benchmarks, I believe).

r? @cuviper

@@ -114,11 +114,17 @@ impl<'a> Drop for Terminator<'a> {

impl Registry {
    pub fn new(mut configuration: Configuration) -> Result<Arc<Registry>, Box<Error>> {
        const MIN_DEQUE_CAPACITY: usize = 1000;
Author

This number is made up - I just chose one that seemed reasonable.

@cuviper cuviper self-assigned this Nov 27, 2017
@cuviper
Member

cuviper commented Nov 27, 2017

What's your Rust version policy for crossbeam-deque? Your CI is only testing stable/beta/nightly.

I try really hard not to bump rustc, as this implicit dependency is not handled well (or at all) by Cargo. For rayon we could still bump semver to reflect a new requirement, but for rayon-core we don't want to bump semver at all.

We currently support Rust 1.12, and this change requires 1.21 to resolve the errors below. While I'm probably going to have to swallow a rayon-core update at some point, this seems way too recent to force on all rayon users.

With 1.12 and 1.13:

error: expected `{`, found `*`
   --> /home/jistone/.cargo/registry/src/github.com-1ecc6299db9ec823/memoffset-0.1.0/src/span_of.rs:138:11
    |
138 |     use ::*;
    |           ^

error: aborting due to previous error

error: Could not compile `memoffset`.

With ..= 1.20:

error[E0277]: the trait bound `[u8; 64]: core::clone::Clone` is not satisfied
  --> /home/jistone/.cargo/registry/src/github.com-1ecc6299db9ec823/crossbeam-utils-0.2.1/src/cache_padded.rs:32:13
   |
32 |             bytes: [u8; 64],
   |             ^^^^^^^^^^^^^^^ the trait `core::clone::Clone` is not implemented for `[u8; 64]`
   |
   = help: the following implementations were found:
             <[T; 24] as core::clone::Clone>
             <[T; 6] as core::clone::Clone>
             <[T; 16] as core::clone::Clone>
             <[T; 0] as core::clone::Clone>
           and 29 others
   = note: required by `core::clone::Clone::clone`

error[E0277]: the trait bound `T: core::marker::Copy` is not satisfied
  --> /home/jistone/.cargo/registry/src/github.com-1ecc6299db9ec823/crossbeam-utils-0.2.1/src/cache_padded.rs:36:13
   |
36 |             _marker: ([T; 0], PhantomData<T>),
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `core::marker::Copy` is not implemented for `T`
   |
   = help: consider adding a `where T: core::marker::Copy` bound
   = note: required because of the requirements on the impl of `core::clone::Clone` for `[T; 0]`
   = note: required because of the requirements on the impl of `core::clone::Clone` for `([T; 0], core::marker::PhantomData<T>)`
   = note: required by `core::clone::Clone::clone`

error: aborting due to 2 previous errors

error: Could not compile `crossbeam-utils`.
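To illustrate the first crossbeam-utils error: the help text suggests those compilers only had `Clone` implementations for a fixed set of array lengths, so `[u8; 64]` could not be cloned through a derived impl. One possible workaround, shown here as a sketch and not as what crossbeam-utils actually did, is to implement `Clone` by hand and copy the array by value. `Padded` is a hypothetical stand-in type, and the sketch assumes `[u8; 64]` is `Copy` on the compilers in question.

```rust
/// Hypothetical stand-in for a cache-line-sized buffer; not crossbeam-utils code.
struct Padded {
    bytes: [u8; 64],
}

impl Clone for Padded {
    fn clone(&self) -> Padded {
        // Copy the array by value instead of calling `.clone()` on it,
        // so no `Clone` impl for `[u8; 64]` is needed.
        Padded { bytes: self.bytes }
    }
}

/// Build a buffer, mark the last byte, and clone it.
fn clone_marked() -> Padded {
    let mut original = Padded { bytes: [0u8; 64] };
    original.bytes[63] = 7;
    original.clone()
}
```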

@ghost
Author

ghost commented Nov 27, 2017

We don't have any particular version policy at the moment. But since you're supporting Rust 1.12, I guess we'll have to support it as well. :)

@jeehoonkang

jeehoonkang commented Jan 14, 2018

@cuviper Thank you for considering the new Crossbeam!

> While I'm probably going to have to swallow a rayon-core update at some point, this seems way too recent to force on all rayon users.

May I ask when you think the semver of rayon-core will be bumped? I ask because some libraries on which Crossbeam depends use the ? operator, which was introduced in Rust 1.13. If the semver of rayon-core will be bumped soon, and you agree to require Rust 1.13 or later, it may be best to wait for that. If it's far away, maybe we can use a local fork of each library.
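For context on why the ? operator matters for the minimum compiler version, here is a small sketch of what it does. The function names are hypothetical. The first function uses `?`, which needs Rust 1.13 or later; the second spells out roughly the same early-return logic with `match`, which older compilers also accept (`?` additionally converts the error with `From`, which this sketch does not need).

```rust
use std::num::ParseIntError;

/// Uses `?`: on `Err`, return early from the function with that error.
fn sum_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

/// The same logic with explicit matches, without the `?` operator.
fn sum_with_match(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.trim().parse::<i32>() {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    let y = match b.trim().parse::<i32>() {
        Ok(value) => value,
        Err(err) => return Err(err),
    };
    Ok(x + y)
}
```

A dependency that uses the first form anywhere cannot be compiled with Rust 1.12, which is why the requirement propagates to every crate that depends on it.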

@cuviper
Member

cuviper commented Jan 14, 2018

@jeehoonkang

> If the semver of rayon-core will be bumped soon

We hope to never bump the major semver of rayon-core, as the goal is to only ever have one implementation of rayon threadpools in a process. So at most, we'll treat raising the minimum rustc as a minor semver bump. I don't love that, but we can't do much else until cargo understands rustc as a dependency.

That said, bumping our requirement from 1.12 to 1.13 doesn't seem too bad. For instance, the oldest rustc I know in an active distro is rustc-1.14 in Debian stretch. The other angle comes from dependent crates that also do CI for old versions -- e.g. cc checks 1.13. I don't know if any are checking 1.12.

@nikomatsakis
Member

nikomatsakis commented Jan 23, 2018

I think we should consider bumping to get the ? operator in any case, but I agree it'd be nice if the minimal dependencies on the rust toolchain were handled more elegantly by cargo (though precisely what that means is an interesting question -- but this is not the forum for it).

@nikomatsakis
Member

I think we should bump to 1.13.0 (or 1.14.0) as an interim measure.

However, longer term, we can't support some fixed release of Rust indefinitely. I think ultimately our policy should be that we support nightly (modulo bugs), stable, and LTS builds of Rust. Anything older than that we do not support. Of course, Rust doesn't have LTS builds, so at the moment we can't say that. We could define for ourselves what LTS means -- e.g., stable - N releases -- but I'd rather push for an "ecosystem-wide" standard that we can latch on to.

@torkleyy
Contributor

Might be worth looking at how clap handles rustc version bumps: https://github.com/kbknapp/clap-rs#compatibility-policy

@jeehoonkang

FYI, a Crossbeam PR for supporting Rust 1.13 is waiting to be reviewed: crossbeam-rs/crossbeam-epoch#61. After it is merged, we can release new versions of crossbeam-epoch and crossbeam-deque for rayon-core to depend on. @stjepang do you have any thoughts?

@cuviper
Member

cuviper commented Jan 23, 2018

> We could define for ourselves what LTS means -- e.g., stable - N releases -- but I'd rather push for an "ecosystem-wide" standard that we can latch on to.

Some crates are using stable-2, but IMO that's way too short. I'd rather it be more like a year's worth.

Also, we can separate this concern between rayon-core and rayon. If you want to bump the latter to something more recent, let's definitely do that before rayon 1.0.
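To make the "stable - N releases" idea concrete, here is a rough sketch of the arithmetic. Both helper names are made up for illustration, and the sketch assumes Rust's six-week release cadence and looks only at the minor version number.

```rust
/// Hypothetical helper: oldest supported minor version under a "stable - N" policy.
fn oldest_supported_minor(stable_minor: u32, n_releases: u32) -> u32 {
    // Saturate at zero rather than underflowing for very large N.
    stable_minor.saturating_sub(n_releases)
}

/// Approximate number of releases in a window, assuming one release every six weeks
/// (rounded up).
fn releases_in_weeks(weeks: u32) -> u32 {
    (weeks + 5) / 6
}

/// "A year's worth" of releases applied to a given stable minor version.
fn oldest_minor_for_one_year(stable_minor: u32) -> u32 {
    oldest_supported_minor(stable_minor, releases_in_weeks(52))
}
```

Under these assumptions a year corresponds to roughly nine releases, so with a hypothetical stable of 1.23 a stable-2 policy would reach back to 1.21, while a one-year policy would reach back to about 1.14.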

@nikomatsakis
Member

@cuviper interesting point re: rayon vs rayon-core.

@jeehoonkang

jeehoonkang commented Feb 10, 2018

We Crossbeam developers decided to dedicate crossbeam-deque 0.2.0 and crossbeam-epoch 0.3.0 to Rust 1.13. Here is my branch of Rayon, on which cargo +1.13.0 build succeeds: https://github.com/jeehoonkang/rayon/tree/crossbeam-deque-switch . Please review and merge this branch if you plan to bump the minimum Rust version to 1.13.

I'd like to thank @stjepang and @Vtec234 for their contributions to this effort!

@cuviper
Member

cuviper commented Feb 12, 2018

> We Crossbeam developers decided to dedicate crossbeam-deque 0.3.0 and crossbeam-epoch 0.4.0 for Rust 1.13.

I assume you mean the prior versions, crossbeam-deque 0.2.0 and crossbeam-epoch 0.3.0? At least, that's what it appears from the crossbeam-epoch 0.4.0 changelog, "Remove support for Rust 1.13."

@jeehoonkang

@cuviper Thanks for the correction! I updated my comment.

bors bot added a commit that referenced this pull request Feb 14, 2018
528: Replace coco with crossbeam-deque r=nikomatsakis a=cuviper

These are the changes from @stjepang and @jeehoonkang, replacing and closing #480.  The minimum rustc is *slightly* increased from 1.12 to 1.13 for the transitive requirements.

530: Add examples to par_split_mut and par_chunks_mut r=nikomatsakis a=cuviper

Also add an odd tail to the `par_chunks` example.

cc #420
@cuviper
Member

cuviper commented Feb 15, 2018

Merged in #528, thanks!

@cuviper cuviper closed this Feb 15, 2018
@ghost ghost deleted the crossbeam-deque-switch branch February 15, 2018 18:12
@cuviper cuviper mentioned this pull request Mar 12, 2018
cuviper added a commit that referenced this pull request Mar 16, 2018