Add `as_ptr` and `as_mut_ptr` inherent method to `String` #97483
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override)
r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs
Before, they went through `&str` and `&mut str`, which created intermediary references, shrinking provenance to only the initialized parts. `Vec<T>` already has such inherent methods added in rust-lang#61114.
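To make the provenance point concrete, here is a minimal sketch of a write into the spare capacity of a `String` that avoids the `&mut str` detour by going through the existing `String::as_mut_vec` and `Vec::as_mut_ptr`. The helper name `push_ascii_via_raw_ptr` is made up for illustration and is not part of this PR.

```rust
/// Hypothetical helper: appends one ASCII byte into the spare capacity of `s`
/// through a raw pointer. Returns `false` and leaves `s` untouched if the byte
/// is not ASCII or there is no spare capacity.
fn push_ascii_via_raw_ptr(s: &mut String, byte: u8) -> bool {
    if !byte.is_ascii() || s.len() == s.capacity() {
        return false;
    }
    let len = s.len();
    // SAFETY: `len < capacity`, so `ptr.add(len)` stays inside the allocation,
    // and an ASCII byte keeps the contents valid UTF-8 once `set_len` runs.
    unsafe {
        let v = s.as_mut_vec();
        // `Vec::as_mut_ptr` is an inherent method, so no `&mut [u8]` covering
        // only the initialized prefix is created on the way to the pointer.
        let ptr = v.as_mut_ptr();
        ptr.add(len).write(byte);
        v.set_len(len + 1);
    }
    true
}
```

Obtaining the pointer with `s.as_mut_ptr()` instead would go through `&mut str`, which only covers the first `len` bytes; that is the case the commit message describes.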
11d852c to 0f3d1b8
And I just found another case where this would have been helpful, compact_str#100
@RalfJung you might be interested as well, since you added the method to
Makes sense to me. However, since this changes the libs API surface, I cannot approve it -- let's wait for the @rust-lang/libs-api team.
/// Modifying the string may cause its buffer to be reallocated,
/// which would also make any pointers to it invalid.
///
/// The caller must also ensure that the memory the pointer (non-transitively) points to
Suggested change:
- /// The caller must also ensure that the memory the pointer (non-transitively) points to
+ /// The caller must also ensure that the memory the pointer points to
We know it points to `u8`, so we don't need the transitivity comment.
/// which would also make any pointers to it invalid.
///
/// The caller must also ensure that the memory the pointer (non-transitively) points to
/// is never written to (except inside an `UnsafeCell`) using this pointer or any pointer
Suggested change:
- /// is never written to (except inside an `UnsafeCell`) using this pointer or any pointer
+ /// is never written to using this pointer or any pointer
There is no `UnsafeCell` here.
I'd definitely prefer if we could merge this unstably, though I'm not sure exactly what that would entail. If it's not much work, though, I think I'd prefer to gate this change on that functionality, since it will help a lot in similar situations going forward. @rust-lang/compiler do y'all know how hard it would be to make it so we could merge this unstably without causing breakage? My guess is we'd need to change method resolution to try all non-nightly methods for all deref targets first, before going back and considering resolving the method call with the nightly method, which doesn't seem tooooo hard (she says naively). This also feels vaguely related to rust-lang/rfcs#3240
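For readers unfamiliar with why a new inherent method can cause breakage at all, the following is a rough sketch of how method resolution prefers an inherent method over one reached through `Deref`. The types `Inner`, `Before` and `After` are hypothetical stand-ins (think `str`, `String` today, and `String` with the new methods); nothing here is taken from the compiler.

```rust
use std::ops::Deref;

/// Hypothetical deref target, playing the role of `str`.
struct Inner;

impl Inner {
    fn which(&self) -> &'static str {
        "deref target"
    }
}

/// Wrapper without an inherent `which`: calls fall through to `Inner`.
struct Before(Inner);

impl Deref for Before {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.0
    }
}

/// Same wrapper, but with an inherent `which` added later.
struct After(Inner);

impl Deref for After {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.0
    }
}

impl After {
    // Found on the receiver type itself, so the deref step is never tried.
    fn which(&self) -> &'static str {
        "inherent"
    }
}
```

With these definitions, `Before(Inner).which()` resolves to the method on `Inner`, while `After(Inner).which()` resolves to the inherent one. Existing callers silently switch methods once the inherent method appears, which is why gating it behind a feature flag is not straightforward.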
We discussed this in today's @rust-lang/libs-api meeting. Without suggesting that we have any consensus on the implied provenance handling here, we don't have objections to merging this unstably for now. However, since it doesn't sound like there's a way to do that yet, marking this as blocked.
(Elaborating on the "implied provenance handling": these and the underlying
I think the PR description understates the impact that this has on the API. I'm hoping that this helps explain all the other implications of implementing
fn main() {
    let mut s: Vec<u8> = Vec::with_capacity(10);
    s.extend(b"hello");
    let ptr = s.as_ptr(); // This pointer stays valid...
    let other = s.as_mut_ptr(); // Across a call to as_mut_ptr()
    let v = s; // Across a move of the original container
    unsafe {
        *other = b'b'; // And across a write to an aliasing pointer
        dbg!(*ptr);
    }

    let mut s = String::with_capacity(10);
    s.push_str("hello");
    let ptr = s.as_ptr(); // This pointer...
    let other = s.as_mut_ptr(); // Is invalidated here (though this case might be fixed, UCG 113)
    let v = s; // But wouldn't be invalidated here (this invalidates other with field retagging)
    unsafe {
        *other = b'b'; // We can't even get a pointer to do this without invalidating `ptr`, and we
                       // can't do this write by casting `ptr` because it doesn't have write
                       // permission.
        dbg!(*ptr);
    }
}
@Nilstrieb what happens when you apply an
error: an associated function with this name may be added to the standard library in the future
--> compiler/rustc_codegen_llvm/src/debuginfo/metadata.rs:1499:25
|
1499 | vtable_name.as_ptr().cast(),
| ^^^^^^
|
= warning: once this associated item is added to the standard library, the ambiguity may cause an error or change in behavior!
= note: for more information, see issue #48919 <https://github.com/rust-lang/rust/issues/48919>
= help: call with fully qualified syntax `bitflags::core::str::<impl str>::as_ptr(...)` to keep using the current method
= help: add `#![feature(string_as_ptr)]` to the crate attributes to enable `std::string::String::as_ptr`
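As a small illustration of the first `help` line, this sketch pins a call to the `str` method with a fully qualified path, so it keeps resolving the same way even if `String` later gains an inherent `as_ptr`. The function name `ptr_via_str` is made up for the example, and the short path `str::as_ptr` is used in place of the longer path the compiler printed.

```rust
/// Hypothetical helper: always calls `str::as_ptr`, regardless of which
/// methods `String` itself provides.
fn ptr_via_str(s: &String) -> *const u8 {
    // `&String` coerces to `&str` here because the parameter type is `&str`.
    str::as_ptr(s)
}
```

At runtime the address is the same as the one `s.as_str().as_ptr()` returns; the fully qualified form only fixes which method is selected.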
#99898 should enable this to be unblocked :)
(Nit: in terms of the Rust specification / abstract machine, they would not return the same values here, but values that differ in provenance. They just erase to the same bit pattern at runtime.)
As a possible argument not to have these methods, Tree Borrows makes this pretty much unnecessary -- the code in #106593 is accepted with Tree Borrows.
It's probably also fair to say that this is something we want to be allowed (at least I would want that :))
Now I worry about whether we can remove these inherent methods from
That is probably a lost cause. I guess we can say that they now exist because they are a tiny bit less code than the deref path, making compilation faster :D
Hm, maybe we should open this again, or at least track this in an issue? There are still examples which are UB according to Tree Borrows that we might want to allow:
fn main() {
    let mut s = String::with_capacity(10);
    s.push('x');
    let ptr = s.as_mut_ptr();
    unsafe { ptr.write(0x20) };
    let _ptr2 = s.as_mut_ptr(); // creates a slice that aliases with `ptr`.
    // That is considered like a read access, and since `ptr` is derived from
    // a mutable reference such a foreign read access invalidates `ptr`.
    unsafe { ptr.write(97) }; // UB
    println!("{s:?}");
}
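Before, they went through `&str` and `&mut str`, which created intermediary references, shrinking provenance to only the initialized parts. `Vec<T>` already has such inherent methods added in #61114.

beef ran into this in beef#47.

The docs are mostly copied from `Vec::{as_ptr, as_mut_ptr}`, adapting the examples to something better fitting. The implementation simply forwards to the vec methods.

I'm not entirely sure what the correct stability attributes are for this, but I think it needs to be insta-stable like the vec methods?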
, adapting the examples to something better fitting. The implementation simply forwards to the vec methods.I'm not entirely sure what the correct stability attributes are for this, but I think it needs to be insta-stable like the vec methods?