refactor WorkerLocal for parallel compiler #109478

Closed
wants to merge 1 commit into from
4 changes: 2 additions & 2 deletions compiler/rustc_data_structures/Cargo.toml
@@ -14,7 +14,7 @@ indexmap = { version = "1.9.1" }
jobserver_crate = { version = "0.1.13", package = "jobserver" }
libc = "0.2"
measureme = "10.0.0"
-rayon-core = { version = "0.4.0", package = "rustc-rayon-core", optional = true }
+rayon-core = { version = "0.4.0", package = "rustc-rayon-core" }
rayon = { version = "0.4.0", package = "rustc-rayon", optional = true }
rustc_graphviz = { path = "../rustc_graphviz" }
rustc-hash = "1.1.0"
@@ -43,4 +43,4 @@ winapi = { version = "0.3", features = ["fileapi", "psapi", "winerror"] }
memmap2 = "0.2.1"

[features]
-rustc_use_parallel_compiler = ["indexmap/rustc-rayon", "rayon", "rayon-core"]
+rustc_use_parallel_compiler = ["indexmap/rustc-rayon", "rayon"]
75 changes: 46 additions & 29 deletions compiler/rustc_data_structures/src/sync.rs
@@ -20,6 +20,7 @@
use crate::owning_ref::{Erased, OwningRef};
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash};
+use std::mem::MaybeUninit;
use std::ops::{Deref, DerefMut};
use std::panic::{catch_unwind, resume_unwind, AssertUnwindSafe};

@@ -30,6 +31,8 @@ pub use vec::AppendOnlyVec;

mod vec;

+static PARALLEL: std::sync::atomic::AtomicBool = std::sync::atomic::AtomicBool::new(false);

cfg_if! {
if #[cfg(not(parallel_compiler))] {
pub auto trait Send {}
@@ -182,33 +185,6 @@ cfg_if! {

use std::cell::Cell;

-#[derive(Debug)]
-pub struct WorkerLocal<T>(OneThread<T>);
-
-impl<T> WorkerLocal<T> {
-    /// Creates a new worker local where the `initial` closure computes the
-    /// value this worker local should take for each thread in the thread pool.
-    #[inline]
-    pub fn new<F: FnMut(usize) -> T>(mut f: F) -> WorkerLocal<T> {
-        WorkerLocal(OneThread::new(f(0)))
-    }
-
-    /// Returns the worker-local value for each thread
-    #[inline]
-    pub fn into_inner(self) -> Vec<T> {
-        vec![OneThread::into_inner(self.0)]
-    }
-}
-
-impl<T> Deref for WorkerLocal<T> {
-    type Target = T;
-
-    #[inline(always)]
-    fn deref(&self) -> &T {
-        &self.0
-    }
-}
-
pub type MTRef<'a, T> = &'a mut T;

#[derive(Debug, Default)]
@@ -328,8 +304,6 @@ cfg_if! {
};
}

-pub use rayon_core::WorkerLocal;

pub use rayon::iter::ParallelIterator;
use rayon::iter::IntoParallelIterator;

@@ -364,6 +338,49 @@ cfg_if! {
}
}

+#[derive(Debug)]
+pub struct WorkerLocal<T> {
+    single_thread: bool,
+    inner: T,
+    mt_inner: Option<rayon_core::WorkerLocal<T>>,
+}
+
+impl<T> WorkerLocal<T> {
+    /// Creates a new worker local where the `initial` closure computes the
+    /// value this worker local should take for each thread in the thread pool.
+    #[inline]
+    pub fn new<F: FnMut(usize) -> T>(mut f: F) -> WorkerLocal<T> {
+        if !PARALLEL.load(Ordering::Relaxed) {
+            WorkerLocal { single_thread: true, inner: f(0), mt_inner: None }
+        } else {
+            // Safety: `inner` would never be accessed when multiple threads
+            WorkerLocal {
+                single_thread: false,
+                inner: unsafe { MaybeUninit::uninit().assume_init() },
Contributor

This seems dangerous...
I am not entirely sure whether the semantics of this have been settled yet, but reading https://github.com/rust-lang/unsafe-code-guidelines/blob/master/active_discussion/validity.md, it seems like this would be UB under those rules (since the assignment does a typed copy at type `T`, which does not allow uninitialized values).
Either way, this is certainly a pattern that is discouraged, and it seems like the compiler should set an example here...
It seems much better to use a union between `inner` and `mt_inner` here, since it is guaranteed to only access the right field. (Or even better, an enum, since `single_thread` then functions as a discriminant... which basically makes it a homegrown enum anyways)
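
To make the enum suggestion concrete, here is a minimal sketch of what it could look like. It is an illustration under assumptions, not the code from this PR or from #109528: `PoolLocal` is a hypothetical, dependency-free stand-in for `rayon_core::WorkerLocal`, the `parallel` and `workers` parameters replace the `PARALLEL` static, and the multi-threaded `deref` always reads slot 0 instead of looking up the current worker.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for `rayon_core::WorkerLocal<T>`: one value per worker.
#[derive(Debug)]
pub struct PoolLocal<T>(Vec<T>);

impl<T> PoolLocal<T> {
    pub fn new<F: FnMut(usize) -> T>(workers: usize, f: F) -> Self {
        // Call `f` once per worker index and keep every result.
        PoolLocal((0..workers).map(f).collect())
    }
}

/// The `single_thread` flag becomes the enum discriminant, so no field is
/// ever left uninitialized.
#[derive(Debug)]
pub enum WorkerLocal<T> {
    Single(T),
    Multi(PoolLocal<T>),
}

impl<T> WorkerLocal<T> {
    /// `parallel` and `workers` are assumptions of this sketch; the PR reads a
    /// global flag instead.
    pub fn new<F: FnMut(usize) -> T>(parallel: bool, workers: usize, mut f: F) -> Self {
        if parallel {
            WorkerLocal::Multi(PoolLocal::new(workers, f))
        } else {
            WorkerLocal::Single(f(0))
        }
    }

    /// Returns the worker-local value for each thread.
    pub fn into_inner(self) -> Vec<T> {
        match self {
            WorkerLocal::Single(value) => vec![value],
            WorkerLocal::Multi(pool) => pool.0,
        }
    }
}

impl<T> Deref for WorkerLocal<T> {
    type Target = T;

    fn deref(&self) -> &T {
        match self {
            WorkerLocal::Single(value) => value,
            // Placeholder: a real pool would index by the current worker.
            // Panics if the pool was built with zero workers.
            WorkerLocal::Multi(pool) => &pool.0[0],
        }
    }
}
```

Every access goes through a `match`, so only the active variant can be read and no `unsafe` is needed. Whether this costs anything measurable compared with the `bool` check is the open question in this thread.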

Contributor

@RossSmyth RossSmyth Mar 22, 2023

Yeah, this is very unsound because rustc emits `noundef` for almost all types. So this is immediately UB from LLVM's point of view, and also under the current thinking on Rust's rules (with no real expectation of it not being UB in the future).
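
For contrast, a small sketch of the pattern that stays sound: keep the field typed as `MaybeUninit<T>` and only assert initialization where an invariant guarantees it. The `Slot` type and its methods are hypothetical names for illustration and are not part of this PR.

```rust
use std::mem::MaybeUninit;

/// Hypothetical holder whose slot may legitimately be left unset.
pub struct Slot<T> {
    initialized: bool,
    value: MaybeUninit<T>,
}

impl<T> Slot<T> {
    pub fn empty() -> Self {
        // Sound: a `MaybeUninit<T>` may hold uninitialized bytes. Calling
        // `.assume_init()` here would instead produce an uninitialized `T`.
        Slot { initialized: false, value: MaybeUninit::uninit() }
    }

    pub fn filled(value: T) -> Self {
        Slot { initialized: true, value: MaybeUninit::new(value) }
    }

    pub fn get(&self) -> Option<&T> {
        if self.initialized {
            // SAFETY: `initialized` is only true when `filled` wrote a value.
            Some(unsafe { self.value.assume_init_ref() })
        } else {
            None
        }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        if self.initialized {
            // SAFETY: same invariant as in `get`; `drop` runs at most once.
            unsafe { self.value.assume_init_drop() }
        }
    }
}
```

The difference from the line under review is that no value of type `T` ever exists while the bytes are uninitialized; only the `MaybeUninit<T>` wrapper does.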

Member Author

@SparrowLii SparrowLii Mar 23, 2023

That makes sense. I am a little worried about the efficiency of using an enum or a union; maybe it is better to use `inner: Option<T>` here.

Contributor

I don't quite understand what the efficiency problem of an enum or union would be - checking the discriminant of an enum should be basically the same as checking `if self.single_thread {`, right?
(Additionally, using an `Option` would increase the size by adding the `Option`'s discriminant in addition to `single_thread`.)
I have opened #109528 to test the performance of my suggestion.
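
The size point can be checked with `std::mem::size_of`. The sketch below uses hypothetical stand-in types (`u64` for `T`, `Box<[u64]>` for the rayon worker-local), and since the default Rust layout is unspecified, the comparison describes what current compilers are expected to produce rather than a guarantee.

```rust
use std::mem::size_of;

/// Layout discussed in the thread: a flag plus an `Option` around each payload.
#[allow(dead_code)]
pub struct FlagPlusOption {
    single_thread: bool,
    inner: Option<u64>,
    mt_inner: Option<Box<[u64]>>,
}

/// Enum alternative: the discriminant replaces both the flag and the `Option` tags.
#[allow(dead_code)]
pub enum TwoVariant {
    Single(u64),
    Multi(Box<[u64]>),
}

/// Returns `(size of FlagPlusOption, size of TwoVariant)` in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<FlagPlusOption>(), size_of::<TwoVariant>())
}
```

Printing the result of `sizes()` on the target of interest shows how much the extra discriminants cost for a given `T`.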

Member Author

@SparrowLii SparrowLii Mar 23, 2023

Using an enum will prevent LLVM from making the best optimizations in many cases. For example, see the perf result of this commit: #101566 (comment)

And using a union will cause the compiler to add a lot of code that triggers unwinding on union access errors, which will also reduce the effect of optimizations.

Contributor

But Option is also an enum, so it should have the same effect?

Member Author

We can use `unwrap`, which is a const function, when `single_thread` is set, so I guess it is relatively more efficient.

+                mt_inner: Some(rayon_core::WorkerLocal::new(f)),
+            }
+        }
+    }
+
+    /// Returns the worker-local value for each thread
+    #[inline]
+    pub fn into_inner(self) -> Vec<T> {
+        if self.single_thread { vec![self.inner] } else { self.mt_inner.unwrap().into_inner() }
+    }
+}
+
+impl<T> Deref for WorkerLocal<T> {
+    type Target = T;
+
+    #[inline(always)]
+    fn deref(&self) -> &T {
+        if self.single_thread { &self.inner } else { self.mt_inner.as_ref().unwrap().deref() }
+    }
+}
+
+// Just for speed test
+unsafe impl<T: Send> std::marker::Sync for WorkerLocal<T> {}
+
pub fn assert_sync<T: ?Sized + Sync>() {}
pub fn assert_send<T: ?Sized + Send>() {}
pub fn assert_send_val<T: ?Sized + Send>(_t: &T) {}