Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -Zrandomize-layout flag to better detect code that rely on unspecified behavior related to memory layout #77316

Closed
marmeladema opened this issue Sep 28, 2020 · 8 comments
Labels
-Zrandomize-layout Unstable option: Randomize the layout of types. A-layout Area: Memory layout of types C-feature-accepted Category: A feature request that has been accepted pending implementation. C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@marmeladema
Copy link
Contributor

marmeladema commented Sep 28, 2020

In order to better detect code that rely on unspecified behavior related to memory layout, it could be useful to have a -Zrandomize-layout flag that add some padding bytes at the start of all #[repr(rust)] structs and unions. It should also add different amounts of padding to different fields.

This should help detecting invalid transmutes, or invalid unions.

Zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/122651-general/topic/Transmute.20safety.20question

@jonas-schievink jonas-schievink added A-layout Area: Memory layout of types C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 28, 2020
@marmeladema marmeladema changed the title Add -Zrandomize-layout flag to better detect code that rely on unspecified behavior related memory layout Add -Zrandomize-layout flag to better detect code that rely on unspecified behavior related to memory layout Sep 28, 2020
@RalfJung
Copy link
Member

I think this could be very useful indeed... maybe @eddyb has some ideas for how to best implement this.

@eddyb
Copy link
Member

eddyb commented Sep 29, 2020

You can make this deterministic by using the def_path_hash for a given definition (it's an equivalent of DefId that's stable across incremental recompilation) - you might not even need to seed a PRNG with it if all you need is one value!

We can also do a limited version of this by default, but since it won't change the size, it would only fail at runtime #38550 (comment)

@eddyb
Copy link
Member

eddyb commented Sep 30, 2020

On Zulip @RalfJung pointed out that I didn't explain why you would want to make this deterministic (or why that's the main difficulty), so here's 3 reasons why this has to be deterministic:

  • unsizing coercions mean that generic types need to have the same field order across instantiations, when the field types match (ignoring the field that would be unsized) - so we can reorder based on alignment, but not randomize differently each time
  • layouts have to be the same for the same type across sibling crates (which do not know about eachother)
  • introducing non-determinism into the query system is unsound at least for incremental recompilation (where you will get miscompiles, i.e. potentially UB in safe code, if you have things like order-dependency or outright RNG, as the incremental dependency tracking will not know to invalidate queries that would now produce different results)

There's two ways to get deterministic per-type-definition "seeds":

  • generate a random number when compiling the definition (and use that one value for shuffling the layout of every instance in any crate) - this is still troublesome with incremental, you would have to add it explicitly as an incremental input, in order to be sound, and it will cause a lot of invalidation
  • use the def_path_hash (which I mentioned in the previous comment), which is effectively the TypeId hash corresponding to a DefId, and rely on its hash properties be pseudo-random (it should change with compiler releases and crate versions specified in the Cargo.toml, so at least we should get some variety - though maybe not as much on nightly sadly)

You might probably also want a CLI flag to e.g. XOR in a fixed seed as well, even with the latter method, but it would have to be embedded in the type definition (so that it's correct cross-crate without requiring the same flag to passed in order to avoid miscompilation), so it's as much effort as both.

@petrochenkov
Copy link
Contributor

petrochenkov commented Sep 30, 2020

Randomizing the layout for testing purposes was being suggested starting from the original struct layout RFC (https://github.com/rust-lang/rfcs/blob/master/text/0079-undefined-struct-layout.md#alternatives), it's unfortunate that it's still not implemented.

I've found one similar existing issue for this - #38550.

@elichai
Copy link
Contributor

elichai commented Oct 5, 2020

I think the "easiest" thing that should also give the highest reward is adding uninit bytes at the front of every repr(Rust) struct (obviously under some -Zpad-layout)
because this will:

  1. Break compilation for direct transmutes (different sizes).
  2. Will break anything that assumes specific size.
  3. Will make every direct read from a pointer UB because it will read uninitialized values, which Miri should catch.

we can then extend to:
A. Randomize the fields order.
B. Mess with alignment (can probably only be increased and not decreased).
C. Add uninit bytes at the middle/end? (not sure if this achieves anything)

@marmeladema
Copy link
Contributor Author

I would be glad to help implementing a first iteration of those ideas namely padding bytes at the start of repr(rust) struct. But I am not knowledgeable enough about those parts of the compiler.
Would someone be available to mentor me?

@eddyb
Copy link
Member

eddyb commented Oct 8, 2020

@marmeladema It's pretty much all in rustc_middle::ty::layout. Wherever field offsets are computed (univariant_uninterned), you can, for example, add the final alignment to all of the field offsets - this will increase the size of the whole type but keep all fields correctly aligned (and it's slightly simpler than changing field order, which has to be done similarly to how we sort fields by alignment).
You should be able to PM me if you have more questions.

If you rely on the same flag to be passed across all the crates, or if you record what the choices were when a certain crate was contained (you could look at how -Z symbol-mangling-version is handled), you should be able to avoid problems with cross-crate consistency, which as per #77316 (comment) was my main worry.

fee1-dead added a commit to fee1-dead-contrib/rust that referenced this issue Oct 1, 2021
Added -Z randomize-layout flag

An implementation of rust-lang#77316, it currently randomly shuffles the fields of `repr(rust)` types based on their `DefPathHash`
r? `@eddyb`
Manishearth added a commit to Manishearth/rust that referenced this issue Oct 1, 2021
Added -Z randomize-layout flag

An implementation of rust-lang#77316, it currently randomly shuffles the fields of `repr(rust)` types based on their `DefPathHash`
r? ``@eddyb``
@dtolnay dtolnay added the C-feature-accepted Category: A feature request that has been accepted pending implementation. label Jan 12, 2023
@dtolnay
Copy link
Member

dtolnay commented Jan 12, 2023

Closing feature request, as this has been implemented in #87868. The tracking issue is going to be #106764.

@dtolnay dtolnay closed this as completed Jan 12, 2023
@workingjubilee workingjubilee added the -Zrandomize-layout Unstable option: Randomize the layout of types. label Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-Zrandomize-layout Unstable option: Randomize the layout of types. A-layout Area: Memory layout of types C-feature-accepted Category: A feature request that has been accepted pending implementation. C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants