Use llvm.memset.p0i8.* to initialize all same-bytes arrays #135258
@bors try @rust-timer queue
32099f0 to bed8b95
bed8b95 to 3ca023e
@bors try @rust-timer queue
Use llvm.memset.p0i8.* to initialize all same-bytes arrays

It doesn't affect tests; LLVM seems smart enough for it. But then I wonder why we have the zero case at all (it was introduced in rust-lang#43488; maybe LLVM wasn't smart enough then). So let's run perf to see if there's any build-time effect, and if not, I'll remove the zero special case and run perf again.
☀️ Try build successful - checks-actions
Finished benchmarking commit (7ed3665): comparison URL.

Overall result: ❌ regressions - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never

Instruction count: This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage): Results (primary 5.2%, secondary -2.5%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: Results (primary -1.9%, secondary -3.3%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 763.722s -> 763.396s (-0.04%)
3ca023e to 45070ce
45070ce to 280fbdb
Looks good! r=me if you want to tweak the commit history here or the PR description :)
@bors r=saethlin
280fbdb to 65b01cb
@bors r=saethlin
removed a FIXME that was fixed by this PR
Treat undef bytes as equal to any other byte

Since `undef` can be any byte, it can also be the byte(s) in the non-undef parts of a value. So we can treat the `undef` bytes as not being there, look only at the initialized bytes, and memset over them.

fixes rust-lang#104290
based on rust-lang#135258
☀️ Test successful - checks-actions
Finished benchmarking commit (a2d7c81): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count: This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage): Results (primary -1.7%, secondary 3.0%). This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles: This benchmark run did not return any relevant results for this metric.

Binary size: This benchmark run did not return any relevant results for this metric.

Bootstrap: 764.863s -> 763.776s (-0.14%)
codegen: store ScalarPair via memset when one side is undef and the other side can be memset

Since `undef` can be any byte, it can also be the byte(s) in the non-undef parts of a value. So we can treat the `undef` bytes as not being there, look only at the initialized bytes, and memset over them.

fixes rust-lang#104290
based on rust-lang#135258
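The "undef matches any byte" idea in the commit message above can be modeled in a few lines. This is only a sketch under stated assumptions: the function name `memset_byte` and the `Option<u8>` encoding (`None` for an undef byte, `Some(b)` for an initialized one) are hypothetical and are not taken from the rustc codegen backend.

```rust
/// Hypothetical model, not rustc's actual code.
/// `None` stands for an undef byte, `Some(b)` for an initialized byte.
/// Returns the byte a memset could use if every initialized byte is the same.
/// Returns `None` if the initialized bytes differ, or if there are none at all.
fn memset_byte(bytes: &[Option<u8>]) -> Option<u8> {
    // Skip undef bytes entirely: they are allowed to take any value.
    let mut init = bytes.iter().flatten().copied();
    let first = init.next()?;
    init.all(|b| b == first).then_some(first)
}

fn main() {
    // Two initialized zero bytes followed by two undef bytes: memset(0) works.
    assert_eq!(memset_byte(&[Some(0), Some(0), None, None]), Some(0));
    // A leading undef byte does not matter either.
    assert_eq!(memset_byte(&[None, Some(0xAB), Some(0xAB)]), Some(0xAB));
    // Initialized bytes that differ cannot be covered by one memset.
    assert_eq!(memset_byte(&[Some(1), None, Some(2)]), None);
    // Fully undef: this sketch reports no fill byte.
    assert_eq!(memset_byte(&[None, None]), None);
}
```

How a fully undef value should be handled is a design choice this sketch does not try to settle; it simply returns `None` in that case.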
Similar to #43488

debug builds can now handle `0x0101_u16` and other multi-byte scalars that have all the same bytes (instead of special casing just `0`)
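To make the "all the same bytes" condition concrete, here is a minimal sketch of such a check. The helper name `repeated_byte` and its signature are assumptions made for illustration; this is not the code the PR adds to rustc.

```rust
/// Hypothetical helper, not rustc's actual code.
/// Looks at the low `size` bytes of `value` and returns the repeated byte
/// if they are all equal, which is the case where an array of that scalar
/// could be initialized with a single memset.
fn repeated_byte(value: u128, size: usize) -> Option<u8> {
    assert!((1..=16).contains(&size), "scalar size must be 1..=16 bytes");
    // If all bytes are equal, endianness is irrelevant, so little-endian is fine.
    let bytes = value.to_le_bytes();
    let first = bytes[0];
    bytes[..size].iter().all(|&b| b == first).then_some(first)
}

fn main() {
    // 0x0101_u16 is two 0x01 bytes.
    assert_eq!(repeated_byte(0x0101, 2), Some(0x01));
    // Zero is the case that was already special-cased.
    assert_eq!(repeated_byte(0, 4), Some(0x00));
    // -1_i32 is four 0xFF bytes.
    assert_eq!(repeated_byte(-1_i32 as u32 as u128, 4), Some(0xFF));
    // 0x0102_u16 has two different bytes, so no single fill byte exists.
    assert_eq!(repeated_byte(0x0102, 2), None);
}
```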