-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more vec![... ; n] optimizations #49496
Conversation
src/liballoc/vec.rs
Outdated
// a zeroed Option<T> is, indeed, None. | ||
// The compiler should be able to figure all this out statically at | ||
// compile time. | ||
let a: Option<T> = unsafe { mem::zeroed() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like undefined behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not entirely clear to me from the std::mem::zeroed doc that the way it's (ab)used here is actually UB. The value is explicitly not drop to avoid any definite UB, and is_none() is going to look at the discriminant, which will be zero, and return whether that value corresponds to the discriminant for None or not. And if None is not backed by a zero discriminant, it's because the compiler decided that zero was an interesting valid value for Some(something).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a value that can't exist is inherently UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it not exist? None is either zero, or some other value because the compiler decided to allocate zero to Some(something). Conversely, that means zero is a valid value, but not necessarily None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, yes, arguably, zero for places other than the discriminant /might/ be invalid. But we're only ever looking at the discriminant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have an Option<&u32>
, the compiler is totally allowed to choose 1
as the special None
value rather than 0
. In that case, there is no value of type Option<&u32>
that has a bit-pattern that is entirely zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a realistic example, since Option<&u32>
's None
is 0, and fits in the same size as a pointer because &u32
itself is never 0, so the discriminant for Some()
is non-zero. AFAIK, the only reason the compiler would choose a non 0 discriminant for None
is if, like for Option<char>
, there's room for invalid T values in the bit representation of T, and 0 is a valid value of T.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
&u32
is never 1 either. The compiler could choose any value that isn't a multiple of 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it did, that would mean statics initialized to None would not end up in .bss
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's completely irrelevant to the soundness of this code.
vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset. This adds the same optimization for vec![ptr::null(); n] and vec![ptr::null_mut(); n], assuming their bit value is 0 (which is true on all currently supported platforms). This does so by adding an intermediate trait IsZero, which looks very much like nonzero::Zeroable, but that one is on the way out, and doesn't apply to pointers anyways. Adding such a trait allows to avoid repeating the logic using with_capacity_zeroed or with_capacity, or making the macro more complex to support generics.
Similarly to vec![ptr::null{,_mut}(); n] in previous change, this adds the optimization for vec!['\0'; n].
Updated the PR with less controversial stuff. |
@bors r+ |
📌 Commit 0df837f has been approved by |
Add more vec![... ; n] optimizations vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset. This PR adds the same optimization for ptr::null, ptr::null_mut, and None, when their in-memory representation is zeroes.
Add more vec![... ; n] optimizations vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset. This PR adds the same optimization for ptr::null, ptr::null_mut, and None, when their in-memory representation is zeroes.
Add more vec![... ; n] optimizations vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset. This PR adds the same optimization for ptr::null, ptr::null_mut, and None, when their in-memory representation is zeroes.
Rollup of 9 pull requests Successful merges: - #48658 (Add a generic CAS loop to std::sync::Atomic*) - #49253 (Take the original extra-filename passed to a crate into account when resolving it as a dependency) - #49345 (RFC 2008: Finishing Touches) - #49432 (Flush executables to disk after linkage) - #49496 (Add more vec![... ; n] optimizations) - #49563 (add a dist builder to build rust-std components for the THUMB targets) - #49654 (Host compiler documentation: Include private items) - #49667 (Add more features to rust_2018_preview) - #49674 (ci: Remove x86_64-gnu-incremental builder) Failed merges:
rust-lang#49496 introduced specialization based on: ``` unsafe impl<T: ?Sized> IsZero for *mut T { fn is_zero(&self) -> bool { (*self).is_null() } } ``` … to call `RawVec::with_capacity_zeroed` for creating `Vec<*mut T>`, which is incorrect for fat pointers since `<*mut T>::is_null` only looks at the data component. That is, a fat pointer can be “null” without being made entirely of zero bits. This commit fixes it by removing the `?Sized` bound on this impl (and the corresponding `*const T` one). This regresses `vec![x; n]` with `x` a null raw slice of length zero, but that seems exceptionally uncommon. (Vtable pointers are never null, so raw trait objects would not take the fast path anyway. An alternative to keep the `?Sized` bound (or even generalize to `impl<U: Copy> IsZero for U`) would be to cast to `&[u8]` of length `size_of::<U>()`, but the optimizer seems not to be able to propagate alignment information and sticks with comparing one byte at a time: https://rust.godbolt.org/z/xQFkwL ---- Without the library change, the new test fails as follows: ``` ---- vec::vec_macro_repeating_null_raw_fat_pointer stdout ---- [src/liballoc/tests/vec.rs:1301] ptr_metadata(raw_dyn) = 0x00005596ef95f9a8 [src/liballoc/tests/vec.rs:1306] ptr_metadata(vec[0]) = 0x0000000000000000 thread 'vec::vec_macro_repeating_null_raw_fat_pointer' panicked at 'assertion failed: vec[0] == null_raw_dyn', src/liballoc/tests/vec.rs:1307:5 ```
…ackler Fix `vec![x; n]` with null raw fat pointer zeroing the pointer metadata rust-lang#49496 introduced specialization based on: ```rust unsafe impl<T: ?Sized> IsZero for *mut T { fn is_zero(&self) -> bool { (*self).is_null() } } ``` … to call `RawVec::with_capacity_zeroed` for creating `Vec<*mut T>`, which is incorrect for fat pointers since `<*mut T>::is_null` only looks at the data component. That is, a fat pointer can be “null” without being made entirely of zero bits. This commit fixes it by removing the `?Sized` bound on this impl (and the corresponding `*const T` one). This regresses `vec![x; n]` with `x` a null raw slice of length zero, but that seems exceptionally uncommon. (Vtable pointers are never null, so raw trait objects would not take the fast path anyway.) An alternative to keep the `?Sized` bound (or even generalize to `impl<U: Copy> IsZero for U`) would be to cast to `&[u8]` of length `size_of::<U>()`, but the optimizer seems not to be able to propagate alignment information and sticks with comparing one byte at a time: https://rust.godbolt.org/z/xQFkwL ---- Without the library change, the new test fails as follows: ```rust ---- vec::vec_macro_repeating_null_raw_fat_pointer stdout ---- [src/liballoc/tests/vec.rs:1301] ptr_metadata(raw_dyn) = 0x00005596ef95f9a8 [src/liballoc/tests/vec.rs:1306] ptr_metadata(vec[0]) = 0x0000000000000000 thread 'vec::vec_macro_repeating_null_raw_fat_pointer' panicked at 'assertion failed: vec[0] == null_raw_dyn', src/liballoc/tests/vec.rs:1307:5 ```
…ackler Fix `vec![x; n]` with null raw fat pointer zeroing the pointer metadata rust-lang#49496 introduced specialization based on: ```rust unsafe impl<T: ?Sized> IsZero for *mut T { fn is_zero(&self) -> bool { (*self).is_null() } } ``` … to call `RawVec::with_capacity_zeroed` for creating `Vec<*mut T>`, which is incorrect for fat pointers since `<*mut T>::is_null` only looks at the data component. That is, a fat pointer can be “null” without being made entirely of zero bits. This commit fixes it by removing the `?Sized` bound on this impl (and the corresponding `*const T` one). This regresses `vec![x; n]` with `x` a null raw slice of length zero, but that seems exceptionally uncommon. (Vtable pointers are never null, so raw trait objects would not take the fast path anyway.) An alternative to keep the `?Sized` bound (or even generalize to `impl<U: Copy> IsZero for U`) would be to cast to `&[u8]` of length `size_of::<U>()`, but the optimizer seems not to be able to propagate alignment information and sticks with comparing one byte at a time: https://rust.godbolt.org/z/xQFkwL ---- Without the library change, the new test fails as follows: ```rust ---- vec::vec_macro_repeating_null_raw_fat_pointer stdout ---- [src/liballoc/tests/vec.rs:1301] ptr_metadata(raw_dyn) = 0x00005596ef95f9a8 [src/liballoc/tests/vec.rs:1306] ptr_metadata(vec[0]) = 0x0000000000000000 thread 'vec::vec_macro_repeating_null_raw_fat_pointer' panicked at 'assertion failed: vec[0] == null_raw_dyn', src/liballoc/tests/vec.rs:1307:5 ```
vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset.
This PR adds the same optimization for ptr::null and ptr::null_mut, when their in-memory representation is zeroes.