Non-zeroing dynamic drops #320
This looks good. It removes the two worst costs (zeroing and injected flags). Having the flags on the stack shouldn't be so bad. Most of the time no flags would even need to be created, right? Question: I might have missed this in the RFC but, will it be possible to query the state of the drop flags, in a standard way, from the function/scope encompassing the droppable values? |
@arcto no, the RFC does not provide a querying primitive. Note that such a primitive is not terribly useful, at least not without more significant language changes, since if any branch moves a path like |
Dynamic drop is probably a bad idea, because it means that some code cannot be moved out to a function without altering behavior (since the hidden flag cannot be returned from the function), which is a sign of a misdesigned language feature. One can just use Option to get the same effect, while still allowing code to be moved into functions (by passing an &mut Option, which the function can leave unchanged or set to None depending on control flow). |
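A minimal sketch of the `&mut Option` pattern described above. The `Resource` type, the `maybe_consume` helper, and the event log are hypothetical names introduced only for illustration; none of them appear in the thread.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log so the order of drops can be observed.
type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical droppable type that records when its destructor runs.
struct Resource {
    log: Log,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop");
    }
}

// Hypothetical helper: takes the value out of the slot (leaving `None`)
// only when `consume` is true; otherwise the slot is left unchanged.
fn maybe_consume(slot: &mut Option<Resource>, consume: bool) {
    if consume {
        if let Some(r) = slot.take() {
            drop(r); // destructor runs here, inside the helper
        }
    }
}

fn run(consume: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let mut slot = Some(Resource { log: log.clone() });
        maybe_consume(&mut slot, consume);
        log.borrow_mut().push("after call");
        // If the slot is still `Some`, the resource is dropped here.
    }
    let events = log.borrow().clone();
    events
}
```

Here the `Option` discriminant plays the role of the drop flag, but it is an ordinary value that the caller can inspect and pass on.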
@bill-myers could you give an example of a case where that problem would come up? |
It's required to provide scope-based RAII like D and C++. The lifetime of variables being tied to scopes is what nearly everyone will expect, since that's also what the lifetime system assumes and is how it works in other languages. Extending variable lifetimes on borrows like the eager drop proposal would change them from a compile-time check to a feature with runtime impact along with making correct low-level code more difficult to write. The need for a dynamic drop flag will only occur when there is a conditional early drop and it can be avoided by adding an explicit |
The description seems slightly over-complicated: we could add drop flags for every (relevant) path, and then do value propagation and dead store elimination to remove the unneeded ones. Maybe we can even get LLVM to do these optimizations for us. |
@arielb1 It is probably better to treat the optimization separately, but it is probably important to avoid generating drop flags for every single move that occurs. That will only bloat the peak memory usage of |
The RFC doesn't need to describe the optimization though, elision of the flag is an implementation detail. |
Feasibility of implementation is important when considering an RFC, and if @pnkfelix already did the work in another context he might as well describe it. |
It is also necessary for the dynamic drop lint to work. While I still prefer eager drop, this does solve all the practical problems with the drop flag. Good work, pnkfelix. |
It's true that the RFC may not have needed to spec out so much detail. But the crucial thing is that the size of the set of drop obligations is bounded by the size of the fn definition in the program text. (As opposed to a more naive approach where you would first recursively descend the type structure all the way down until you hit an actual |
@arcto, wouldn't that be a compiler error? The compiler won't let you use an object that might already have been moved. |
@arcto: A destructor isn't the only reason for a type moving ownership on shallow copies. The complexity is unnecessary anyway because types that don't move are still usable after a shallow copy. |
@rkjsns I don't know, is it? It's very dangerous at least! @thestinger I just used the name for the macro from the flag name in the RFC. Never mind. I don't know what I was thinking. It can't be allowed. |
should be equivalent to:
since this is the natural "introduce a function" transformation that should not alter behavior in a well-designed language. However, with dynamic drop the first snippet will not drop x until c() is executed if a() returns false, while the second snippet always drops x before c() is executed, and thus language behavior is not preserved. It's also not particularly intuitive that this problem even exists, and not trivial to check manually for complicated code. The root of this problem is of course the use of "injected boolean flags for the drop obligations for the function" rather than a first class mechanism that can be passed across functions, which already exists (for non-partial moves) in the form of the "Option" type, which provides exactly such a boolean flag, but as a first-class mechanism. |
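The two snippets being compared did not survive in this copy of the thread. The following is a hypothetical reconstruction of the behavior the comment describes, not the original code: `Noisy`, `consume`, and `helper` are assumed names, and the logged `"c"` stands in for the call to `c()`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical type whose destructor records itself in the log.
struct Noisy {
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop x");
    }
}

// Stand-in for a function that takes ownership of `x`.
fn consume(x: Noisy) {
    drop(x);
}

// Inline version: `x` is moved on only one branch, so whether it is
// dropped at the end of the block is decided dynamically.
fn inline_version(a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = Noisy { log: log.clone() };
        if a {
            consume(x);
        }
        log.borrow_mut().push("c");
        // `x` is dropped here only if it was not moved above.
    }
    let events = log.borrow().clone();
    events
}

// "Introduce a function" version: `x` is moved into `helper`, so it is
// always gone by the time `helper` returns.
fn helper(a: bool, x: Noisy) {
    if a {
        consume(x);
    }
    // If not consumed, `x` is dropped at the end of `helper`.
}

fn extracted_version(a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let x = Noisy { log: log.clone() };
        helper(a, x);
        log.borrow_mut().push("c");
    }
    let events = log.borrow().clone();
    events
}
```

When `a` is false, the inline version records `"c"` before `"drop x"`, while the extracted version records `"drop x"` first, which is the ordering difference the comment points out.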
@bill-myers: You're ignoring that functions introduce a new scope. If you do the same inside a function by moving a variable into the new nested scope it will be no different. Having scope-based RAII doesn't make a programming language poorly designed. |
@thestinger Functions of course introduce a new scope, but with static drop, behavior is unchanged as long as you move exactly the variables that are consumed, by passing them by value as parameters, which is the natural thing to do; with dynamic drop you can't do that because some variables are "conditionally consumed" and you can't move them at all. Also it's of course possible to have "scope-based RAII" with static drop: that's what C++ does for instance. |
You haven't demonstrated any difference between moving variables into a function scope or a local scope in a function. There isn't a difference between the same code in a local function, but your example isn't using the equivalent code as you're missing a scope and a move into it for the non-split example.
C++ move semantics are implemented with dynamic drop flags. Using a flag on the stack instead of the space provided by the type isn't significantly different. The drop flags were always an implementation detail and code relying on the zeroing within data structures had undefined behaviour already. This proposal preserves the same semantics that exist today, which are also the same in C++. |
@thestinger There is indeed no difference, but that's unrelated to the issue of whether code can be moved out into a new function without changing the rest of the function (or I guess one could say that the difference is that with a new function everything goes out of scope, including the implicit drop flags which can't be moved into the new scope because they are implicit). Also, as far as I know, C++11 does not have "move" in the Rust sense at all: "moving out" of a variable just leaves it potentially logically empty (if a custom move constructor empties it), and the destructor is still run normally and there are no language-level drop flags; there is also in fact no concept of dropping anything early at all (you can manually call the destructor, but that will in general crash the program unless you undo it with placement new or equivalent, since the compiler will invoke the destructor again itself when the variable goes out of scope). |
It freaks me out a little bit that an object becomes undead after a conditional block in which it might be consumed. You can't interact with it, but still its In @bill-myers example: is it viable that |
Code can be moved into a new function without changing behaviour. It does have to be scoped the same way to be equivalent, but that also applies to scope-based lifetimes.
It does have a move in the same sense as Rust. In both languages, the variable may only be destroyed or re-initialized after it has been moved from, which is also true in Rust. Variables are destroyed in reverse order at the end of a block, but there is no effect if they have been moved from. It's incorrect to deviate from that behaviour in C++ and the compiler is explicitly permitted to make the assumption that move and copy constructors only do what they say on the tin. C++ does use dynamic drop flags to implement that behaviour, although it's an implementation detail for the standard library just like it is in Rust. User code has no choice but to do it that way. |
You're describing the semantics of scope-based resource management / RAII. The destructor runs at the end of the scope, unless ownership of the value has been moved out of the variable.
I don't think you understand the proposal. It doesn't change the semantics from what they are today. The destructor runs at the end of the scope unless the variable has been moved from, in which case it does not run. There is nothing The proposal is a conservative improvement over what exists today. There is really no need for an RFC at all because the currently defined semantics are preserved. It was already undefined to depend on the drop flag within values, so this is just an implementation optimization. There is no language change here. |
@thestinger the argument for why an RFC is necessary is that libraries may be relying on the zero'ing behavior. |
Those libraries have undefined behaviour. Low-level compiler implementation details aren't something that |
@bill-myers Your proposed behaviour would make it very difficult to reason locally about ownership. If I see something passed by-value into a function, I don't want to have to read that function (and any function called by it) to try to reason whether the value has actually been moved out of this function. That would be super impractical and annoying. Functions aren't macros that just paste code blocks for you, they're behaviour boundaries. |
cc me |
Yeah, that's the problem: if you introduce a function, there is no way to do so without changing semantics by making the destructor always run earlier. Let me try to restate this: in most languages, ANY scoped code block (i.e. code between "{" and "}" within a function) can be moved to a newly introduced function with simple changes, replacing the code block with a function call and possibly a match statement that performs any non-local jumps (e.g. "return", break, goto outside the block, etc.) in the parent function. Obviously this also holds for any sequence of statements that can be surrounded by "{" and "}" without changing semantics (i.e. those that either don't introduce variables or other bindings or are immediately followed by a "}"). With dynamic drop, this is only true if, for all variables, either all execution paths through the code block leave the variable in a moved out state or if none of them leaves the variable in a moved out state. Otherwise, you need to change the variable to a "let mut", move it into the function, return it back wrapped into an Option, and add a match statement in the parent that sets back the variable if the Option is Some. I.e. something like this:
If you introduce a function in a place where behavior is changed without making that transformation, you get no compiler warning or error at all, but rather just a silent and likely unexpected change of semantics (and there's no way to produce such a warning or error unless you pass both versions of the source code to the compiler). Of course, it's possible that any potential convenience of dynamic drop outweighs the loss of the ability to easily move any block to a new function, but it seems pretty obvious that the issue exists. [EDIT: I realized it's actually possible to move the code to a function, but it requires returning the variable as an Option and conditionally resetting it] |
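The code that followed "something like this" is missing from this copy. As a rough sketch of the transformation the comment describes (make the variable `let mut`, move it into the function, return it wrapped in an `Option`, and set it back in the parent), under the assumption of a hypothetical `Noisy` type and `helper` function:

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical type whose destructor records itself in the log.
struct Noisy {
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop x");
    }
}

// Hypothetical extracted function: hands `x` back when it was not consumed.
fn helper(a: bool, x: Noisy) -> Option<Noisy> {
    if a {
        drop(x); // consumed on this path
        None
    } else {
        Some(x) // not consumed: return ownership to the caller
    }
}

#[allow(unused_assignments)]
fn caller(a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let mut x = Noisy { log: log.clone() };
        match helper(a, x) {
            // Put the value back so it lives until the end of this block.
            Some(returned) => x = returned,
            None => {}
        }
        log.borrow_mut().push("c");
        // `x` is dropped here only if it was set back above.
    }
    let events = log.borrow().clone();
    events
}
```

With this shape the drop order matches the un-extracted code on both paths, at the cost of the extra `Option` plumbing the comment complains about.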
As long as it's backwards compatible with allowing it in the future (even if we never do), I think it's a great compromise to make for the time being. |
Not sure if it affects this RFC, but regarding @bill-myers point, is it technically possible for a tool to implement an |
It is an open question whether this can actually be added post 1.0 without breaking clients, even with the changes to the language I outlined above. @arielb1 has made an internals thread here: http://internals.rust-lang.org/t/backwards-compatability-issue-with-non-zeroing-drop/1525 |
@pnkfelix I think it's clear that this change will break some unsafe code. Then again, everything breaks some unsafe code. I'm being a bit flippant, but I think if we advertise our intentions here and make it clear that it is wrong to rely on zeroing, then we are within our rights to break unsafe code that relies on it. It'd be great to get it implemented for 1.0 but I don't consider it a strict requirement. |
(Also, this underscores the need for a document laying out what kinds of unsafe code are considered "stable" and what is off limits, but that's a separate problem from this RFC in particular.) |
We've decided to accept this RFC. Dynamic drop has long been recognized as the implementation strategy of choice. The reasoning for this is documented clearly in the RFC itself, but the primary points in favor of this strategy as compared to static drop semantics were:
|
Tracking issue: rust-lang/rust#5016 |
What's the decision for #[unsafe_no_drop_flag] ? |
@arthurprs my suspicion is that the (The other option I have considered is to continue respecting it, but instead of controlling whether the type has an extra flag attached to it directly, it would instead control whether you put a flag onto the stack for that type.) Either approach can be implemented after we have gotten direct experience with non-zeroing drop. Update:
|
What about intrinsics for manipulating these flags? Useful patterns like https://github.com/reem/rust-replace-map/blob/master/src/lib.rs will need to be made unsafe unless there are intrinsics for manipulating this system. |
what replace_map is doing is nonlocal so I don't think it can be ported? By nonlocal I mean that zeroing is used to inhibit the drop in some arbitrary parent frame where |
On Mon, Feb 09, 2015 at 05:44:00PM -0800, Jonathan Reem wrote:
It would certainly be helpful if you could point at a specific line, The obvious replacement to this pattern is use an Of course, ReplaceMap could probably handle this another way. I |
I've a couple questions about the current state of memory zeroing: Rust currently still zeros any non-
Although actually this test fails, even if we replace the How best should we zero memory in a manual
Are there any situations where either the zeroing of non-drop traits or the |
@burdges Rust 1.13 and newer never write to a memory location without user actions (initialization/assignment, etc.). This includes padding between struct fields and complex dataflow. |
I see, so anything that needs zeroing needs a drop method like the one I gave above. Thanks! |
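The drop method referred to here is not preserved in this copy. As a hedged illustration of what a manually zeroing `Drop` can look like, here is a sketch built around a hypothetical `SecretKey` type. Using volatile writes followed by a compiler fence is one common way to discourage the optimizer from removing the stores; it is an assumption of this sketch, not something the thread prescribes.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

// Hypothetical container for secret bytes.
pub struct SecretKey {
    bytes: [u8; 32],
}

impl SecretKey {
    pub fn new(bytes: [u8; 32]) -> Self {
        SecretKey { bytes }
    }

    pub fn as_bytes(&self) -> &[u8; 32] {
        &self.bytes
    }

    // Overwrite the key material with zeros.
    pub fn wipe(&mut self) {
        for b in self.bytes.iter_mut() {
            // Volatile store, so the write is not elided as a dead store.
            unsafe { ptr::write_volatile(b, 0) };
        }
        // Keep the compiler from reordering later accesses before the wipe.
        compiler_fence(Ordering::SeqCst);
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        // Since the language no longer zeroes anything on drop,
        // the type has to do it explicitly.
        self.wipe();
    }
}
```

This only clears the bytes at their final location; copies made by earlier moves of the value are not covered by it.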
As I understand, a If so, is there any interest in making it hard to call |
There was a very long discussion in RFC 1066 about this that effectively led to making |
Rust does not zero non-`Drop` types when it drops them. Avoid leaking these types, as doing so obstructs zeroing them. In particular, if you are working with secret key material then
- do not call `::std::mem::forget`,
- do not unsafely zero types with owning pointers,
- ensure your code cannot panic,
- take care with `Weak`, and
- examine the data structures you use for violations of these rules.
See rust-lang/rfcs#320 (comment) and https://github.com/isislovecruft/curve25519-dalek/issues/11
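To make the first rule concrete, a small sketch showing that `std::mem::forget` skips the destructor, so any zeroing done in `Drop` never happens. The `Tracked` type and the drop counter are hypothetical, introduced only to observe the effect.

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical type that counts how many times its destructor runs.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// The value goes out of scope normally, so the destructor runs once.
fn drops_when_scoped() -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let _t = Tracked { drops: drops.clone() };
    }
    drops.get()
}

// The value is leaked with `mem::forget`, so the destructor never runs.
fn drops_when_forgotten() -> u32 {
    let drops = Rc::new(Cell::new(0));
    mem::forget(Tracked { drops: drops.clone() });
    drops.get()
}
```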
Updated RFC 297 with removal of codemod and additional design issues
Summary
Remove drop flags from values implementing `Drop`, and remove automatic memory zeroing associated with dropping values.
Keep dynamic drop semantics, by having each function maintain a (potentially empty) set of auto-injected boolean flags for the drop obligations for the function that need to be tracked dynamically (which we will call "dynamic drop obligations").
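As a rough mental model of an auto-injected flag, the following hand-written sketch imitates it with an ordinary stack boolean and `ManuallyDrop`. This is an illustration under stated assumptions, not the compiler's actual lowering: `Noisy`, `x_needs_drop`, and `desugared` are hypothetical names.

```rust
use std::cell::RefCell;
use std::mem::ManuallyDrop;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

// Hypothetical type whose destructor records itself in the log.
struct Noisy {
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push("drop x");
    }
}

fn desugared(a: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        // `ManuallyDrop` suppresses the automatic drop so the flag decides.
        let mut x = ManuallyDrop::new(Noisy { log: log.clone() });
        // Stand-in for the injected stack-local flag for `x`.
        let mut x_needs_drop = true;

        if a {
            // SAFETY: the flag is cleared right away, so `x` is never
            // touched again after the value has been taken out.
            let moved = unsafe { ManuallyDrop::take(&mut x) };
            x_needs_drop = false;
            drop(moved); // the "move" consumes the value here
        }

        log.borrow_mut().push("after branch");

        // End of scope: run the destructor only if the flag is still set.
        if x_needs_drop {
            // SAFETY: the flag guarantees the value has not been moved out.
            unsafe { ManuallyDrop::drop(&mut x) };
        }
    }
    let events = log.borrow().clone();
    events
}
```

The flag lives in the function's stack frame rather than inside the value, and nothing is zeroed when the value is moved or dropped.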
text/
(rendered)