-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add debug flag -Z print-type-sizes
for instrumention type/variant sizes
#37770
Conversation
r? @arielb1 (rust_highfive has picked a reviewer for you, use r? to override) |
For example, applying this tool to the
So, we could reduce the space impact of non- Likewise, we could reduce the space impact of |
FWIW if someone is looking at this - please don't take this as a suggestion, most statements are |
Is there a way we could print the actual variant names instead of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice approach.
return; | ||
} | ||
|
||
let mut visitor = TypeVisitor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we have a single seen
list for the entire program? (A: this is just a cache for the deduplication in session?).
tcx: TyCtxt<'a, 'tcx, 'tcx>, | ||
mir: &Mir<'tcx>) { | ||
if tcx.sess.err_count() > 0 { | ||
// compiling a broken program can obviously result in a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment copy-pasta
overall_size: overall_size, | ||
variant_sizes: sizes, | ||
}; | ||
if !self.type_sizes.contains(&info) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
risk of O(n^2)
. Because we sort this anyway, why not use a HashSet
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah true; at one time for some reason I thought there was value in preserving the order in which we encountered the types, but that's obviously not reflected in the current output.
} | ||
} | ||
|
||
pub fn sort_by_type_description(&mut self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would prefer this entire pipeline to go here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll combine it all into fn print_type_sizes
as part of the switch to an underlying HashSet
.
@@ -216,6 +216,13 @@ pub fn compile_input(sess: &Session, | |||
})?? | |||
}; | |||
|
|||
if sess.opts.debugging_opts.print_type_sizes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we do this in the "post-trans" closure a few lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The closure we pass into phase_3_run_analysis_passes
?
What is the benefit you see of putting it in there? Is there some internal instrumentation that needs to be aware of this print pass taking place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just looks better there. Also, then you could have the type size list on the global tcx.
impl Pass for GatherTypeSizesMir { | ||
} | ||
|
||
impl<'tcx> MirPassHook<'tcx> for GatherTypeSizesMir { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to implement MirPassHook
- you are not using it anywhere.
@@ -0,0 +1,12 @@ | |||
// All of the types that occur in this function are uninteresting, in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need copyright comments though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, I totally forgot the copyright comments; will fix.
self.seen.insert(ty); | ||
|
||
let reveal = Reveal::All; | ||
// let reveal = Reveal::NotSpecializable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: commented-out code in program.
Vec::new() | ||
} | ||
|
||
// RawNullablePointer/StructWrappedNullablePointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exhaustive match?
variant_sizes); | ||
} | ||
Err(err) => { | ||
self.tcx.sess.warn(&format!("print-type-size t: `{:?}` err: {:?}", ty, err)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span_warn
? At least use the MIR's span.
FWIW I'd prefer this to either work like the existing variant size lint (looking at monomorphic definitions) or by inspecting the type layout cache after trans. Although I don't mind the current implementation, it feels more thorough, in an analysis way rather than a debug way. |
Great to see you working on this, @pnkfelix. My suggestions for this feature are in #37623 (comment). That example has a bit more detail, including variant names and the size of individual fields & the padding between them. Basically, assuming it's possible, don't leave the reader having to infer anything about the layout. |
@nnethercote thanks for the pointer to that issue. I'll see what I can do to try to make this output include more of the information you desire there |
r=me if @arielb1's and @nnethercote's concerns have been addressed. |
@pnkfelix, can you show some sample output for the updated code? Thank you. |
@nnethercote I haven't incorporated your ideas for the output yet. You can see the sample output in the ui tests that are in the PR. |
I'll give this an r+ for now and let's implement field-by-field size output in a separate PR. @bors r+ |
📌 Commit 089fbcf has been approved by |
(odd, I'm not clear why travis is failing in that manner; does it test a merge, and some of my imports have been taken away...?) Update: yeah that must have been what was going on; sorry for the noise. |
e05ae3a
to
ebe9ee1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great! Thank you for adding the extra detail, it really helps.
I have a few nitpicks about the output below that I think will improve readability somewhat.
@@ -0,0 +1,21 @@ | |||
print-type-size type: `E1` overall bytes: 24 align: 8 | |||
print-type-size discrimimant bytes: 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You misspelled "discriminant".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ha ha, whoops...
print-type-size discrimimant bytes: 4 | ||
print-type-size variant A exact bytes: 16 | ||
print-type-size padding bytes: 4 | ||
print-type-size field .0 bytes: 8 align: 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AIUI rustc is going to be able to rearrange field order in the future to improve packing. In that case would the field numbers be listed out of order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code in principle can print the offset assigned to each field (the API includes the offsets being passed back). Or we can sort the variants by offset, and derive the implicit padding accordingly.
However my own experiments indicate that the computed offsets from the layout
module do not always match what we use in the representation, so I decided not to print out the offsets in this version of the code.
Whenever we get around to rearranging the field order, we will have to fix the layout
module to actually compute/maintain correct offset values, and we can revise this accordingly then.
print-type-size field .post bytes: 8 | ||
print-type-size type: `MyOption<IndirectNonZero<u32>>` overall bytes: 40 align: 8 | ||
print-type-size variant Some exact bytes: 40 | ||
print-type-size field .0 bytes: 40 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nitpicks:
- Do the "overall" and "exact" words contribute anything here? Can they be removed without loss of information?
- Inconsistent colon placement: in a different place for "type" compared to "field" and "variant".
- Perhaps the "bytes" should appear after the numbers, on the right-hand side.
- Field names should have backticks around them, like type names.
- A comma before the alignment would help readability.
- "alignment" is probably better than "align".
- First indent is 3 spaces, second indent is 4 spaces. Use 4 (or 2) consistently.
E.g.:
print-type-size type `MyOption<IndirectNonZero<u32>>`: 40 bytes, alignment: 8 bytes
print-type-size variant `Some`: 40 bytes
print-type-size field .0: 40 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The "exact" is there because the layout code makes a distinction between minimum size for a type (which is used for dynamically sized types) vs exact size. However, I never actually made a test that illustrates this, and furthermore, it does not seem to ever arise in my runs on the
rustc
crate, so maybe that is all dead code for the purposes of this analysis. - You're right about the colon placement. I'll think about your suggestion. (My instinct was to instead add colons after "variant" and "field" but your sample looks okay to me...)
- I can move the "bytes" around to occur after the number, sure.
- My reasoning behind the backticks is that we only need them for the types (since those have substructure in the form of their instantiated type parameters); the field and variant names never have any substructure, they are always just identifiers or numbers (I think?) so I figured no backticks needed. Note that in your sample you did not put "
.0
" - A comma before align/alignment sounds fine.
- Thanks, I thought I fixed the inconsistent indents but clearly never bothered to count spaces in the actual output in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'll go ahead and try putting in all of the suggestions and see how the output fares. I think my biggest issue might be with adding backticks to the variants and fields, but I'm willing to give it a shot.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds fine. I thought about suggesting backticks for the field names too, but the '.' seemed like a sufficient marker. But backticks around types and fields and variants is another possibility. I'll let you decide; there's not going to be One True Way to print this data :)
ebe9ee1
to
5b564cf
Compare
if offset != _lying_offset { | ||
// Experiments indicate that the offset extracted from Layout does | ||
// not always correctly account for padding. Thus the name :( | ||
debug!("offset: {} _lying_offset: {}", offset, _lying_offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like trouble. When does this happen?
Could you reproduce this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note: This comment has been revised several times; the most important change was correcting the listed path to the test to actually include the filename.)
It reproduces regularly on the test/ui/print_type_sizes/padding.rs
test. Output is like this when you turn the debug logging on:
print-type-size type: `E1`: 24 bytes, alignment: 8 bytes
print-type-size discriminant: 4 bytes
print-type-size variant `A`: 16 bytes
DEBUG:rustc::session::code_stats: offset: 8 _lying_offset: 0
print-type-size padding bytes: 4
print-type-size field `.0`: 8 bytes, alignment: 8 bytes
DEBUG:rustc::session::code_stats: offset: 16 _lying_offset: 8
print-type-size field `.1`: 4 bytes
print-type-size variant `B`: 8 bytes
DEBUG:rustc::session::code_stats: offset: 4 _lying_offset: 0
print-type-size field `.0`: 8 bytes
print-type-size end padding bytes: 4
print-type-size type: `E2`: 16 bytes, alignment: 8 bytes
print-type-size discriminant: 4 bytes
print-type-size variant `A`: 12 bytes
DEBUG:rustc::session::code_stats: offset: 4 _lying_offset: 0
print-type-size field `.0`: 4 bytes
DEBUG:rustc::session::code_stats: offset: 8 _lying_offset: 4
print-type-size field `.1`: 8 bytes
print-type-size variant `B`: 8 bytes
DEBUG:rustc::session::code_stats: offset: 4 _lying_offset: 0
print-type-size field `.0`: 8 bytes
print-type-size type: `S`: 8 bytes, alignment: 4 bytes
print-type-size field `.a`: 1 bytes
print-type-size field `.b`: 1 bytes
print-type-size padding bytes: 2
print-type-size field `.g`: 4 bytes, alignment: 4 bytes
(My assertion that the reported offset is "lying" is based on manual inspection of a value that I transmuted to a tuple.)
I've been assuming that this is not currently as serious a bug as one might think because I've been assuming that nothing important is using the offset values computed by the layout module. However, as I mentioned to @nnethercote in an (gh-marked "outdated") comment above,
Whenever we get around to rearranging the field order, we will have to fix the layout module to actually compute/maintain correct offset values, and we can revise this accordingly then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note in particular the main inconsistency illustrated here is that the so-called "lying-offset" does not account for (1) the discriminant nor (2) the padding injected (due to the discriminant) before the first field of a variant.
When I first looked at this I thought that the problem was just the first noted issue, which is trivially dealt with by adding the discriminant size to the offset. But the second issue is the killer: the reported offsets really should include any injected padding as well, and do not in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer ended up being that I was misunderstanding the representation being used in the layout
module.
a575d5d
to
559c135
Compare
@bors r+ |
📌 Commit 559c135 has been approved by |
@bors r=ariebl1 |
💡 This pull request was already approved, no need to approve it again.
|
📌 Commit 559c135 has been approved by |
Edit: Nevermind, that test isn't supposed to produce any output, is it? (In which case UI test will treat a missing file == empty output?). Edit2: Shouldn't it still have |
@bors r+ |
💡 This pull request was already approved, no need to approve it again.
|
📌 Commit 559c135 has been approved by |
Add debug flag `-Z print-type-sizes` for instrumention type/variant sizes Add debug flag `-Z print-type-sizes` for instrumention type/variant sizes This is meant to help with things like #36799 in a very local way; namely, once you have a hypothesis as to which types have a large population or are "too large", you can use `-Z print-type-sizes` to learn how large each type is, and how much each variant in an enum contributes to the size of that overall enum.
💔 Test failed - auto-mac-32-opt |
Legitimate failure. Type sizes are (of course) different in 32-bit platforms, and do not match the expected output. |
@sanxiyn hmm, yes of course. Odd that I didn't see this on my own local (32-bit arm) cross compiles; are u64's 8-byte aligned on arm-32bit? |
@TimNN you are right, (In the meantime I need to figure out if there's a way for me to encode target specific test output in the ui tests...) |
1a46118
to
75825fe
Compare
Biggest change: Revised print-type-sizes output to include breakdown of layout. Includes info about field sizes (and alignment + padding when padding is injected; the injected padding is derived from the offsets computed by layout module). Output format is illustrated in commit that has the ui tests. Note: there exists (at least) one case of variant w/o name: empty enums. Namely, empty enums use anonymous univariant repr. So for such cases, print the number of the variant instead of the name. ---- Also, eddyb suggested of reading from `layout_cache` post-trans. (For casual readers: the compiler source often uses the word "cache" for tables that are in fact not periodically purged, and thus are useful as the basis for data like this.) Some types that were previously not printed are now included in the output. (See e.g. the tests `print_type_sizes/generics.rs` and `print_type_sizes/variants.rs`) ---- Other review feedback: switch to an exhaustive match when filtering in just structural types. switch to hashset for layout info and move sort into print method. ---- Driveby change: Factored session::code_stats into its own module ---- incorporate njn feedback re output formatting.
Note that the tests have been updated to initialize the local variables; originally it was enough just to declare them. Back when I started this, the `layout_cache` contained entries even just for types that had been declared but not initialized. Apparently things have changed in the interim so that if I want one of those layouts to be computed, I need to actually initialize the value. (Incidentally, this shows a weakness in the strategy of just walking the `layout_cache`; the original strategy of using a MIR visitor would probably have exhibited more robustness in terms of consistent output, but it had other weaknesses so I chose not to reimplement it. At least, not yet.) ---- Also, I have updated tests to avoid target-specific alignments.
@bors r=arielb1 |
📌 Commit 75825fe has been approved by |
Add debug flag `-Z print-type-sizes` for instrumention type/variant sizes Add debug flag `-Z print-type-sizes` for instrumention type/variant sizes This is meant to help with things like #36799 in a very local way; namely, once you have a hypothesis as to which types have a large population or are "too large", you can use `-Z print-type-sizes` to learn how large each type is, and how much each variant in an enum contributes to the size of that overall enum.
Add debug flag
-Z print-type-sizes
for instrumention type/variant sizesThis is meant to help with things like #36799 in a very local way; namely, once you have a hypothesis as to which types have a large population or are "too large", you can use
-Z print-type-sizes
to learn how large each type is, and how much each variant in an enum contributes to the size of that overall enum.