RFC to remove old Cranelift x86 backend. #12

cfallin · 2021-06-21T18:50:09Z

This RFC describes the process by which we hope to remove the old Cranelift x86 backend, with the appropriate consensus beforehand and the appropriate cleanups afterward.

It has been spurred by @bjorn3's work to draft the technical work of actually snipping it out in bytecodealliance/wasmtime#3009 (thanks!).

accepted/remove-old-cranelift-backend.md

bjorn3 · 2021-06-21T19:03:55Z

accepted/remove-old-cranelift-backend.md

+backend?
+
+Question 2: Is there any functionality in the old backend that we have
+not yet adequately replicated in the new backend?


Maybe couple of specialized instructions?

bjorn3 · 2021-06-21T19:05:07Z

accepted/remove-old-cranelift-backend.md

+   least register definitions and platform-specific legalizations. We
+   will need to replace some of the legalizations that the new backend
+   relies on with handwritten versions in the `simple_legalize`
+   framework.


Only simple_legalize runs for the new backends AFAIK.

Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>

bnjbvr

Thanks for writing this! I'd be actually happy to see the old backend go away :-)

Is there any functionality in the old backend that we have not yet
adequately replicated in the new backend?

Unfortunately, as much as I love the new backend, the old backend still has better performance in terms of compile times. I know that @cfallin has been working on a new register allocator (thanks!), but early results don't suggest a large improvement in terms of compile times compared to the current regalloc.rs implementation that's used in the new backend, so even with those two we might still be slower at compiling code than the old backend. In the future, with the new constraint system, the new register allocator ought to be faster for sure, but there is a tiny risk that even then, the old-backend would compiler faster.

Of course, compilation speed isn't the main worry, and the new backend has only brought improvements in terms of maintainability, lowering complexity and improving the quality of the generated code. I simply slightly worry that if we abandoned the old backend too eagerly, then we would send the message that we don't really support the JIT use-case (viz. fast compile times), and that wasmtime ought to be used only in AOT contexts.

Alternatively, we could make acceptance of this RFC conditional on the new backend compiling not being X% slower at compilation than the old backend is, or just plan and prototype new improvements that ought to lower compile times in the new backend, so we can confidently avoid the risk of just getting worse compile times.

Thoughts?

accepted/remove-old-cranelift-backend.md

Co-authored-by: Benjamin Bouvier <public@benj.me>

cfallin · 2021-06-24T17:46:08Z

Indeed, I just spoke with @bnjbvr, and I agree with his main point: it would be really great to solve the compile-time issue before we remove the old backend, at least with the current set of tradeoffs. The old backend's maintenance and complexity costs impose on us more as time goes on, but for now (certainly for the next few months) this is minimal.

The other point I hadn't considered was that, at some point as I continue to work on compile-time performance, next with instruction-selector work (which should move us over to a constraint-based API for the regalloc as @bnjbvr notes, hopefully with good effects), I'll want to actually compare with the old backend as well to show conclusively that we've improved compile time. For that reason, it would be useful to keep it working until then.

It's still worthwhile to discuss how the removal would actually happen, and what simplifications it would allow, of course, so I think we should keep discussing that as ideas occur to us!

tschneidereit · 2021-06-24T17:48:43Z

Would it make sense to at least move the old backend to some sub-directory that clearly identifies it as legacy? Right now, it's not at all obvious which code one should look at as a newcomer to Cranelift, and fixing that seems quite valuable.

cfallin · 2021-06-24T18:02:11Z

Yes, I agree, moving isa/x86/ to isa/legacy/x86/ or something like that would be a good first-step-to-removal. I'll put together a quick PR for that.

These backends will be removed in the future (see bytecodealliance/rfcs#12 and the pending bytecodealliance#3009 in this repo). In the meantime, to more clearly communicate that they are using "legacy" APIs and will eventually be removed, this PR places them in an `isa/legacy/` subdirectory. No functional changes otherwise.

bjorn3 · 2021-06-24T19:21:24Z

The old backend's maintenance and complexity costs impose on us more as time goes on, but for now (certainly for the next few months) this is minimal.

One thing I noticed with my PR to remove it from cranelift-codegen-meta was that compilation and execution of cranelift-codegen-meta becomes near instantaneously instead of taking like 20s. I think compilation of cranelift-codegen also became a bit faster. Stripping the whole old backend will likely improve compilation time of cranelift-codegen a lot and reduce the compiled size significantly.

bnjbvr · 2021-09-16T16:50:25Z

A clarification on my previous comment: we're not using the old backend anymore, and we would be fine removing it, as we've already taken and accepted the compile-times hit that came with it.

The previous comment rather expressed a touch of worry that despite all its greatness, the new backend is a bit slower than the older one in terms of compilation times, which performance hasn't improved a lot over the last few months (not blaming anyone, of course; anyone, including us, could have contributed patches to that effect).

The Cranelift team does have a plan now for improving this situation, by 1. using the new regalloc in the future, 2. using a better instruction selector to make use of the new regalloc. But even this plan has considerable risks, as we don't have a clear view on what the final compile-time performance will be, when compared to just keeping the old backend.

Overall, my point is that we should make sure that the next framework improvement will not result in another compile-time performance hit, that we wouldn't be ready to take. For one, switching to the new regalloc before using its full capacities would be such a compile-time hit. I am confident that this isn't a controversial point of view, and I do note that the forces at hand are working hard towards lower compile-times in general, which is highly appreciated.

cfallin · 2021-09-16T17:51:31Z

Thanks @bnjbvr!

I agree 100% that getting compile-times down is the most important goal we have in front of us w.r.t. performance work, and things should not become any slower than they are. The ongoing odyssey with regalloc performance has been a frustration to me personally but of course it's just a really fundamentally hard problem. Earlier perf benchmarking with regalloc2 and the SSA-based inputs, together with profiling results when using the regalloc.rs shim, suggest to me that the many reg-to-reg moves inserted when lowering to VCode are a significant source of slowdown; so the hope (educated guess?) is that with the operand-constraints model in the native regalloc2 API, this should get much better. That's still a risk until we get there and actually measure, though, as you note.

My thought w.r.t. regalloc2 and ISLE is also that we will need to carefully measure perf before flipping any more switches, and I personally agree that we need to be really careful to avoid any regressions. Possibly this means that ISLE comes online first and we get to the point that we can then use regalloc2's API natively before switching it on by default, and just use the shim for fuzzing it until then; I'm not sure (and this should be discussed in a separate RFC). But for the topic at hand, I'll say for the record here that I agree, compile-perf should and must improve.

cfallin · 2021-09-16T17:54:44Z

(Though as an addendum to the above, I should note that last time I measured regalloc2 with the shim, it still improved compile-time performance slightly on most benchmarks, and more significantly in worst-cases like clang.wasm, so I guess I only mean that we should measure carefully, and possibly regalloc2 comes in before ISLE and still offers an intermediate improvement. Again, topic for another RFC!)

cfallin · 2021-09-16T18:02:13Z

cfallin · 2021-09-17T07:22:47Z

We have signoffs from multiple groups now, so:

Entering Final Call Period

https://github.com/bytecodealliance/rfcs/blob/main/accepted/rfc-process.md#making-a-decision-merge-or-close

Once any stakeholder from a different group has signed off, the RFC will move into a 10 calendar day final comment period (FCP), long enough to ensure that other stakeholders have at least a full business week to respond.

The FCP will end on Tuesday, September 28.

uweigand · 2021-09-17T09:02:55Z

This is fine with me.

cfallin · 2021-09-28T19:01:01Z

The FCP has now ended with no further discussion, so it's now time to merge this RFC. Thanks all for the many good points and discussion above! As a next step, we can look at moving forward the existing draft PR to remove the old x86 backend, and then further cleanups from there.

RFC to remove old Cranelift x86 backend.

e74a43e

cfallin force-pushed the remove-old-backend branch from 5561603 to e74a43e Compare June 21, 2021 18:51

bjorn3 reviewed Jun 21, 2021

View reviewed changes

accepted/remove-old-cranelift-backend.md Outdated Show resolved Hide resolved

bjorn3 reviewed Jun 21, 2021

View reviewed changes

Typo fix from @bjorn3

e5ca7c2

Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>

abrown approved these changes Jun 22, 2021

View reviewed changes

bnjbvr reviewed Jun 24, 2021

View reviewed changes

accepted/remove-old-cranelift-backend.md Outdated Show resolved Hide resolved

Fix typo (thanks @bnjbvr).

7026b20

Co-authored-by: Benjamin Bouvier <public@benj.me>

cfallin mentioned this pull request Jun 24, 2021

cranelift-codegen: move old x86 and RISC-V backends to isa/legacy/. bytecodealliance/wasmtime#3028

Merged

cfallin mentioned this pull request Aug 30, 2021

Use relative call instructions between wasm functions bytecodealliance/wasmtime#3254

Closed

cfallin added disposition-merge motion-to-finalize labels Sep 16, 2021

bjorn3 approved these changes Sep 16, 2021

View reviewed changes

peterhuene approved these changes Sep 16, 2021

View reviewed changes

sparker-arm approved these changes Sep 17, 2021

View reviewed changes

cfallin added the final-comment-period label Sep 17, 2021

afonso360 mentioned this pull request Sep 19, 2021

Implement VhighBits & Vselect for interpreter bytecodealliance/wasmtime#3361

Merged

cfallin mentioned this pull request Sep 20, 2021

Cranelift: handle carry flags as bools in backends bytecodealliance/wasmtime#2860

Closed

cfallin merged commit 2821d03 into bytecodealliance:main Sep 28, 2021

cfallin deleted the remove-old-backend branch September 28, 2021 19:01

bjorn3 mentioned this pull request Oct 1, 2021

[PSA] The Evergreen Public Service Announcements Thread bytecodealliance/wasmtime#1096

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC to remove old Cranelift x86 backend. #12

RFC to remove old Cranelift x86 backend. #12

cfallin commented Jun 21, 2021

bjorn3 Jun 21, 2021

bjorn3 Jun 21, 2021

bnjbvr left a comment

cfallin commented Jun 24, 2021

tschneidereit commented Jun 24, 2021

cfallin commented Jun 24, 2021

bjorn3 commented Jun 24, 2021

bnjbvr commented Sep 16, 2021

cfallin commented Sep 16, 2021

cfallin commented Sep 16, 2021

cfallin commented Sep 16, 2021 •

edited by mingqiusun

Loading

cfallin commented Sep 17, 2021

uweigand commented Sep 17, 2021

cfallin commented Sep 28, 2021

RFC to remove old Cranelift x86 backend. #12

RFC to remove old Cranelift x86 backend. #12

Conversation

cfallin commented Jun 21, 2021

bjorn3 Jun 21, 2021

Choose a reason for hiding this comment

bjorn3 Jun 21, 2021

Choose a reason for hiding this comment

bnjbvr left a comment

Choose a reason for hiding this comment

cfallin commented Jun 24, 2021

tschneidereit commented Jun 24, 2021

cfallin commented Jun 24, 2021

bjorn3 commented Jun 24, 2021

bnjbvr commented Sep 16, 2021

cfallin commented Sep 16, 2021

cfallin commented Sep 16, 2021

cfallin commented Sep 16, 2021 • edited by mingqiusun Loading

Motion to finalize with a disposition to merge

Stakeholders sign-off

Arm

DFINITY

Embark Studios

Fastly

Google/Envoy

Intel

Microsoft

Mozilla

IBM

wasmCloud

Unaffiliated

cfallin commented Sep 17, 2021

Entering Final Call Period

uweigand commented Sep 17, 2021

cfallin commented Sep 28, 2021

cfallin commented Sep 16, 2021 •

edited by mingqiusun

Loading