Commit

Rollup merge of #127275 - RalfJung:offset-from-isize-min, r=Amanieu
offset_from, offset: clearly separate safety requirements the user needs to prove from corollaries that automatically follow

By landing #116675 we decided that objects larger than `isize::MAX` cannot exist in the address space of a Rust program, which lets us simplify these rules.

For `offset_from`, we can even state that the *absolute* distance fits into an `isize`, and therefore exclude `isize::MIN`. This PR also changes Miri to treat an `isize::MIN` difference like the other isize-overflowing cases.
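
To illustrate the `offset_from` corollary, here is a minimal sketch. The helper name `element_distance` and the choice of `u32` elements are assumptions made for the example and are not part of this PR. Both pointers are derived from the same slice, so the in-bounds requirement is the only thing the caller has to establish; that the result fits in an `isize` and is never `isize::MIN` then follows automatically.

```rust
/// Hypothetical helper (not from the PR): signed distance, in elements,
/// from index `j` to index `i` of the same slice.
fn element_distance(slice: &[u32], i: usize, j: usize) -> isize {
    // Both indices may be at most `len` (one past the end is allowed).
    assert!(i <= slice.len() && j <= slice.len());
    let base = slice.as_ptr();
    // SAFETY: `i` and `j` are at most `slice.len()`, so both pointers are in
    // bounds of (or one past the end of) the same allocated object, and their
    // byte distance is a multiple of `size_of::<u32>()`.
    unsafe { base.add(i).offset_from(base.add(j)) }
}
```

Because the absolute distance is bounded by `isize::MAX`, negating the result of `element_distance` cannot overflow.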
matthiaskrgr authored Jul 6, 2024
2 parents 28cc0b6 + 9ba492f commit 2137d19
Showing 6 changed files with 139 additions and 272 deletions.
4 changes: 2 additions & 2 deletions compiler/rustc_const_eval/src/interpret/intrinsics.rs
@@ -301,9 +301,9 @@ impl<'tcx, M: Machine<'tcx>> InterpCx<'tcx, M> {
 }
 // The signed form of the intrinsic allows this. If we interpret the
 // difference as isize, we'll get the proper signed difference. If that
-// seems *positive*, they were more than isize::MAX apart.
+// seems *positive* or equal to isize::MIN, they were more than isize::MAX apart.
 let dist = val.to_target_isize(self)?;
-if dist >= 0 {
+if dist >= 0 || i128::from(dist) == self.pointer_size().signed_int_min() {
 throw_ub_custom!(
 fluent::const_eval_offset_from_underflow,
 name = intrinsic_name,
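
The check changed in this hunk can be modelled outside the interpreter. The following is only a sketch under stated assumptions, not the interpreter's code: `signed_distance` is a hypothetical helper, plain `usize` addresses stand in for interpreter values, `None` stands in for the UB error, and the branch for `a >= b` is not shown in the hunk and is assumed here.

```rust
/// Hypothetical model (not the interpreter's code): signed distance `a - b`
/// between two addresses, or `None` where the real check would report UB.
fn signed_distance(a: usize, b: usize) -> Option<isize> {
    // Addresses are unsigned, so subtract as `usize` and record the wrap.
    let (val, overflowed) = a.overflowing_sub(b);
    // Reinterpret the wrapped difference as a signed value.
    let dist = val as isize;
    if overflowed {
        // Here `a < b`, so a valid distance is negative. A non-negative value,
        // or exactly `isize::MIN`, means the addresses are more than
        // `isize::MAX` apart.
        if dist >= 0 || dist == isize::MIN { None } else { Some(dist) }
    } else {
        // Assumed branch: here `a >= b`, so a negative value means the
        // addresses are more than `isize::MAX` apart.
        if dist < 0 { None } else { Some(dist) }
    }
}
```

In this model `signed_distance(4, 10)` yields `Some(-6)`, while a gap of exactly `isize::MAX as usize + 1` bytes below `a` yields `None`, which is the case this PR adds.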
126 changes: 38 additions & 88 deletions library/core/src/ptr/const_ptr.rs
@@ -390,37 +390,26 @@ impl<T: ?Sized> *const T {
 if self.is_null() { None } else { Some(unsafe { &*(self as *const MaybeUninit<T>) }) }
 }

-/// Calculates the offset from a pointer.
+/// Adds an offset to a pointer.
 ///
 /// `count` is in units of T; e.g., a `count` of 3 represents a pointer
 /// offset of `3 * size_of::<T>()` bytes.
 ///
 /// # Safety
 ///
-/// If any of the following conditions are violated, the result is Undefined
-/// Behavior:
+/// If any of the following conditions are violated, the result is Undefined Behavior:
 ///
-/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
-/// pointer must be either in bounds or at the end of the same [allocated object].
-/// (If it is zero, then the function is always well-defined.)
+/// * The computed offset, `count * size_of::<T>()` bytes, must not overflow `isize`.
 ///
-/// * The computed offset, **in bytes**, cannot overflow an `isize`.
+/// * If the computed offset is non-zero, then `self` must be derived from a pointer to some
+/// [allocated object], and the entire memory range between `self` and the result must be in
+/// bounds of that allocated object. In particular, this range must not "wrap around" the edge
+/// of the address space.
 ///
-/// * The offset being in bounds cannot rely on "wrapping around" the address
-/// space. That is, the infinite-precision sum, **in bytes** must fit in a usize.
-///
-/// The compiler and standard library generally tries to ensure allocations
-/// never reach a size where an offset is a concern. For instance, `Vec`
-/// and `Box` ensure they never allocate more than `isize::MAX` bytes, so
-/// `vec.as_ptr().add(vec.len())` is always safe.
-///
-/// Most platforms fundamentally can't even construct such an allocation.
-/// For instance, no known 64-bit platform can ever serve a request
-/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
-/// However, some 32-bit and 16-bit platforms may successfully serve a request for
-/// more than `isize::MAX` bytes with things like Physical Address
-/// Extension. As such, memory acquired directly from allocators or memory
-/// mapped files *may* be too large to handle with this function.
+/// Allocated objects can never be larger than `isize::MAX` bytes, so if the computed offset
+/// stays in bounds of the allocated object, it is guaranteed to satisfy the first requirement.
+/// This implies, for instance, that `vec.as_ptr().add(vec.len())` (for `vec: Vec<T>`) is always
+/// safe.
 ///
 /// Consider using [`wrapping_offset`] instead if these constraints are
 /// difficult to satisfy. The only advantage of this method is that it
@@ -611,8 +600,7 @@ impl<T: ?Sized> *const T {
 ///
 /// # Safety
 ///
-/// If any of the following conditions are violated, the result is Undefined
-/// Behavior:
+/// If any of the following conditions are violated, the result is Undefined Behavior:
 ///
 /// * `self` and `origin` must either
 ///
@@ -623,26 +611,10 @@ impl<T: ?Sized> *const T {
 /// * The distance between the pointers, in bytes, must be an exact multiple
 /// of the size of `T`.
 ///
-/// * The distance between the pointers, **in bytes**, cannot overflow an `isize`.
-///
-/// * The distance being in bounds cannot rely on "wrapping around" the address space.
-///
-/// Rust types are never larger than `isize::MAX` and Rust allocations never wrap around the
-/// address space, so two pointers within some value of any Rust type `T` will always satisfy
-/// the last two conditions. The standard library also generally ensures that allocations
-/// never reach a size where an offset is a concern. For instance, `Vec` and `Box` ensure they
-/// never allocate more than `isize::MAX` bytes, so `ptr_into_vec.offset_from(vec.as_ptr())`
-/// always satisfies the last two conditions.
-///
-/// Most platforms fundamentally can't even construct such a large allocation.
-/// For instance, no known 64-bit platform can ever serve a request
-/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
-/// However, some 32-bit and 16-bit platforms may successfully serve a request for
-/// more than `isize::MAX` bytes with things like Physical Address
-/// Extension. As such, memory acquired directly from allocators or memory
-/// mapped files *may* be too large to handle with this function.
-/// (Note that [`offset`] and [`add`] also have a similar limitation and hence cannot be used on
-/// such large allocations either.)
+/// As a consequence, the absolute distance between the pointers, in bytes, computed on
+/// mathematical integers (without "wrapping around"), cannot overflow an `isize`. This is
+/// implied by the in-bounds requirement, and the fact that no allocated object can be larger
+/// than `isize::MAX` bytes.
 ///
 /// The requirement for pointers to be derived from the same allocated object is primarily
 /// needed for `const`-compatibility: the distance between pointers into *different* allocated
@@ -879,37 +851,26 @@ impl<T: ?Sized> *const T {
 }
 }

-/// Calculates the offset from a pointer (convenience for `.offset(count as isize)`).
+/// Adds an offset to a pointer (convenience for `.offset(count as isize)`).
 ///
 /// `count` is in units of T; e.g., a `count` of 3 represents a pointer
 /// offset of `3 * size_of::<T>()` bytes.
 ///
 /// # Safety
 ///
-/// If any of the following conditions are violated, the result is Undefined
-/// Behavior:
+/// If any of the following conditions are violated, the result is Undefined Behavior:
 ///
-/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
-/// pointer must be either in bounds or at the end of the same [allocated object].
-/// (If it is zero, then the function is always well-defined.)
+/// * The computed offset, `count * size_of::<T>()` bytes, must not overflow `isize`.
 ///
-/// * The computed offset, **in bytes**, cannot overflow an `isize`.
+/// * If the computed offset is non-zero, then `self` must be derived from a pointer to some
+/// [allocated object], and the entire memory range between `self` and the result must be in
+/// bounds of that allocated object. In particular, this range must not "wrap around" the edge
+/// of the address space.
 ///
-/// * The offset being in bounds cannot rely on "wrapping around" the address
-/// space. That is, the infinite-precision sum must fit in a `usize`.
-///
-/// The compiler and standard library generally tries to ensure allocations
-/// never reach a size where an offset is a concern. For instance, `Vec`
-/// and `Box` ensure they never allocate more than `isize::MAX` bytes, so
-/// `vec.as_ptr().add(vec.len())` is always safe.
-///
-/// Most platforms fundamentally can't even construct such an allocation.
-/// For instance, no known 64-bit platform can ever serve a request
-/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
-/// However, some 32-bit and 16-bit platforms may successfully serve a request for
-/// more than `isize::MAX` bytes with things like Physical Address
-/// Extension. As such, memory acquired directly from allocators or memory
-/// mapped files *may* be too large to handle with this function.
+/// Allocated objects can never be larger than `isize::MAX` bytes, so if the computed offset
+/// stays in bounds of the allocated object, it is guaranteed to satisfy the first requirement.
+/// This implies, for instance, that `vec.as_ptr().add(vec.len())` (for `vec: Vec<T>`) is always
+/// safe.
 ///
 /// Consider using [`wrapping_add`] instead if these constraints are
 /// difficult to satisfy. The only advantage of this method is that it
@@ -963,38 +924,27 @@ impl<T: ?Sized> *const T {
 unsafe { self.cast::<u8>().add(count).with_metadata_of(self) }
 }

-/// Calculates the offset from a pointer (convenience for
+/// Subtracts an offset from a pointer (convenience for
 /// `.offset((count as isize).wrapping_neg())`).
 ///
 /// `count` is in units of T; e.g., a `count` of 3 represents a pointer
 /// offset of `3 * size_of::<T>()` bytes.
 ///
 /// # Safety
 ///
-/// If any of the following conditions are violated, the result is Undefined
-/// Behavior:
-///
-/// * If the computed offset, **in bytes**, is non-zero, then both the starting and resulting
-/// pointer must be either in bounds or at the end of the same [allocated object].
-/// (If it is zero, then the function is always well-defined.)
-///
-/// * The computed offset cannot exceed `isize::MAX` **bytes**.
+/// If any of the following conditions are violated, the result is Undefined Behavior:
 ///
-/// * The offset being in bounds cannot rely on "wrapping around" the address
-/// space. That is, the infinite-precision sum must fit in a usize.
+/// * The computed offset, `count * size_of::<T>()` bytes, must not overflow `isize`.
 ///
-/// The compiler and standard library generally tries to ensure allocations
-/// never reach a size where an offset is a concern. For instance, `Vec`
-/// and `Box` ensure they never allocate more than `isize::MAX` bytes, so
-/// `vec.as_ptr().add(vec.len()).sub(vec.len())` is always safe.
+/// * If the computed offset is non-zero, then `self` must be derived from a pointer to some
+/// [allocated object], and the entire memory range between `self` and the result must be in
+/// bounds of that allocated object. In particular, this range must not "wrap around" the edge
+/// of the address space.
 ///
-/// Most platforms fundamentally can't even construct such an allocation.
-/// For instance, no known 64-bit platform can ever serve a request
-/// for 2<sup>63</sup> bytes due to page-table limitations or splitting the address space.
-/// However, some 32-bit and 16-bit platforms may successfully serve a request for
-/// more than `isize::MAX` bytes with things like Physical Address
-/// Extension. As such, memory acquired directly from allocators or memory
-/// mapped files *may* be too large to handle with this function.
+/// Allocated objects can never be larger than `isize::MAX` bytes, so if the computed offset
+/// stays in bounds of the allocated object, it is guaranteed to satisfy the first requirement.
+/// This implies, for instance, that `vec.as_ptr().add(vec.len())` (for `vec: Vec<T>`) is always
+/// safe.
 ///
 /// Consider using [`wrapping_sub`] instead if these constraints are
 /// difficult to satisfy. The only advantage of this method is that it