-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Only process changed obligations in ObligationForest #69218
Conversation
r? @eddyb (rust_highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 7f32c87ad1fb9dd11733035bf98a8614748ee282 with merge a495eb849d6fa99996c180ca6e3012f402970a96... |
☀️ Try build successful - checks-azure |
Queued a495eb849d6fa99996c180ca6e3012f402970a96 with parent 5e7af46, future comparison URL. |
3970fea
to
98a1b65
Compare
Shrank the snapshotting overhead quite a bit which removed most of the slowdown in |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 4b13fa94c89d243fa1c0148610996eaaf84ba955 with merge e3b5470b757aa1e38c8fb2feecc31966ce79513c... |
☀️ Try build successful - checks-azure |
Queued e3b5470b757aa1e38c8fb2feecc31966ce79513c with parent b0d5813, future comparison URL. |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Pushed a commit which pulls in some modifications to https://github.com/rust-lang-nursery/ena as a git dependency. Disablde the tidy check for git dependencies to get it to build for a perf run. With the modification to ena it was possible to reduce the snapshot overhead almost entirely which at least seems to fix the regression in If the Another perf run would be nice in any case. |
(I have held of on documenting and cleaning this up so far as much is still in flux while I try to fix the regressions). |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 56f916fab9aa2d644f3f11cf50ac282a7fa530e4 with merge 2c67e37fa0d8c84ac85061c83cf8a39ceeb24abd... |
☀️ Try build successful - checks-actions |
Queued 30a3e111a6443337457980ccf94a22d0b8c9d02e with parent 9f6c670, future comparison URL. |
Finished benchmarking try commit (30a3e111a6443337457980ccf94a22d0b8c9d02e): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Interesting, there does appear to be more green, but probably still too much red overall... |
Added a commit to always use the simple, current obligation forest processing. A perf run should show how much the complexity added to the obligation forest slows things down. |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit d3d3fc4 with merge 77eafb1583b505bea6e1537365a7ae77bc3febf4... |
☀️ Try build successful - checks-actions |
Queued 77eafb1583b505bea6e1537365a7ae77bc3febf4 with parent 20b1e05, future comparison URL. |
Finished benchmarking try commit (77eafb1583b505bea6e1537365a7ae77bc3febf4): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Is a comparison of the current version which does not use the precise tracking at all, it uses the same way to work through the obligationforest. All benchmark excepts one are improved by the precise processing so it does not appear as if the obligation forest is the culprit for the slow down. The added (Sorry for the spam, I'd run this locally but I can't seem to get reliable results for such relatively small changes). |
That doesn't look like a proper comparison - the parent commits on master for those two are different. If you want to re-run the perf builds we can do so, give me a ping on Zulip and I should be able to help out. |
Since perf runs comparions on merge commits I don't think there is a way to have have proper parents for perf runs on try builds? It only matters if there are significant performance changes in merged in between though (I am only looking for a rough comparison here). |
There's two strategies:
|
☔ The latest upstream changes (presumably #78767) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
|
Ping from triage - |
I'm not sure why this was closed :/ the perf results look really good, I'd love to have this improvement. |
It only improves things in a few specific instances otherwise it is a a regression due to the increased cost of the added data structures https://perf.rust-lang.org/compare.html?start=9f6c670c4b4273b2c7c0af07a524d4240c926bfc&end=30a3e111a6443337457980ccf94a22d0b8c9d02e . I have fixed some of that locally but I don't believe it makes up enough of the regression so for the moment this isn't an improvement. |
Personally I think -50% on inflate is absolutely worth it, the regressions are much more minor in comparison, and the doc times are pretty variable anyway. |
The wins are on an old version of inflate which heavily abuses macros instead of inline functions, causing an absolutely enormous function to be generated. So I would expect wins like that to be exceedingly rare in the average crate, the regressions on the other hand affects almost every crate so overall it would definitely be a loss. |
This rewrites most of ObligationForest forest so to avoid iterating
through the entire set of obligations on each
select
attempt. Insteadonly the obligations that can actually make progress are processed. This
gives great speedups in benchmarks such as
inflate
which create alarge number of pending obligations which fail to make progress.
To support this the unification tables were extended to keep track
of which type inference variables that has actually changed at each
step. Which then lets
ObligationForest
get a list of only the changedvariables at each step which it can map back to its obligations.
In addition to this primary change, many of the other iterations in
ObligationForest
were refactored to only process lists of the nodesthat they actually are interested in. The extra bookkeeping needed for
this was possible without the primary change but were a performance
regressions there as they slowed down the main loop. As the main loop is
no longer the main issue these optimizations could be re-applied.
This needs some unit tests to the added data structures (I have some in an external crate I experimented in which I need to pull in), the
warn!
debug logs need to go/madedebug!
and I am not happy with the howOffsets
are deregistered.I would interested in getting some feedback on the overall PR and seeing some perf results though (the extra complexity to in unification could see some regressions, there may ways to reduce the added overhead there further).