feat: shorebird patch ios builds run slower than android builds #674
Comments
We did some benchmarking. Turns out dart code in our ios-alpha builds runs about 112x slower than on the real cpu. That's great news, since we know that this is already fast enough for some apps (especially since Dart code is only a portion of what the app itself does) and we know how to make it much faster. We're working on changes to reduce and possibly completely eliminate that slowdown. |
Further update. Kevin has made progress on a "mixed mode" for Dart (running in our custom AppStore-compliant interpreter only when necessary and otherwise executing on the real CPU). We expect to have early benchmarking numbers from that in the next couple of weeks and are still targeting a "1.0" release for iOS in September (with this, and various other iOS issues addressed). We'll post more here as we learn more. |
Update. Kevin has "mixed mode" working both entering and leaving the simulator / real cpu. His current demo uses a single snapshot and simply runs all even-length functions on the CPU and all odd-length functions in the simulator. He's currently working through some crashes resulting from needing to teach the Dart garbage collector about the two separate stacks and the "stop execution" link-register markers the simulator uses. He's hoping to have the "Richards" benchmark running end-to-end with "mixed mode" this week. 🎉 |
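As a rough illustration of the demo's dispatch rule described in the update above (this is not Shorebird's code), the Python sketch below routes each function to the "cpu" or the "simulator" by the parity of its length. The `Function` record, its `length` field, and the names `choose_backend` and `partition` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    """Hypothetical stand-in for a compiled function in a snapshot."""
    name: str
    length: int  # size of the compiled code, in arbitrary units


def choose_backend(fn: Function) -> str:
    """Demo rule from the update: even-length -> CPU, odd-length -> simulator."""
    return "cpu" if fn.length % 2 == 0 else "simulator"


def partition(functions):
    """Group function names by the backend the demo rule would pick."""
    result = {"cpu": [], "simulator": []}
    for fn in functions:
        result[choose_backend(fn)].append(fn.name)
    return result
```

The rule is arbitrary on purpose: it forces frequent transitions between the two execution modes, which is what the demo is meant to exercise.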
Title changed from "shorebird patch ios builds run about 10x slower than necessary" to "shorebird patch ios builds run about 100x slower than necessary"
Update: Kevin got Richards (the benchmark) running last week. It was still crashing in 1 of 4 runs, but ran. He fixed several more crashers this morning and believes he's now working on "the last" garbage collection crasher. There is still a lot of work ahead to integrate "mixed mode" with Flutter, but we hope to start that as soon as this week once he works out a few more issues (like teaching "mixed mode" how to use two snapshots instead of just one) in the Dart stand-alone VM. 🤞 |
Thank you for the frequent updates @eseidel We are really excited to use this in production |
Any update on this? |
This week's update: Kevin has been working on getting Exception handling to work correctly in "mixed mode". (It's complicated because sometimes when an exception is thrown it may be thrown across multiple jumps in/out of the CPU/simulator either because of mixed-mode or because of other ways that Dart jumps to C++ and back, e.g. ffi.) He has most of the cases covered, but was still fighting at least one remaining case last I checked. I also experimented with wiring up mixed-mode into Flutter (so far development has been done in a stand-alone Dart VM) and found and filed a couple bugs. I still think we're close. I still am planning to get mixed mode shipped this month (the next two weeks) in an iOS "beta". But given that we've still not yet seen it work in Flutter (not uncommon to develop new Dart features in a simpler environment before moving them into Flutter), we won't know how many more days/weeks we need until we get to testing in Flutter. Good news is that our performance testing so far suggests that mixed-mode should be just as fast as normal release mode (so once we move to mixed mode we can close this bug)! |
So far testing with Mixed mode has gone so well we're already talking about moving Android to use it too over time. But again, first we have to get it shipped in iOS. Soon. 🤞 |
@eseidel thanks for keeping the thread updated, really looking forward to use shorebird in production. keep up the good work 💪. |
Update: So I saw mixed mode (the new fast dart runtime we're working on) run on my iPhone last Friday for the first time 🎉. Kevin is still fixing the last of the FFI test failures with mixed mode. After I saw mixed mode run, I then saw it fail to work with a second project, not because mixed mode itself failed on the phone, but rather due to a Dart snapshot mismatch trying to run the font tree-shaker on the host itself. I think that was due to some complexity in how our toolchain is overlayed on top of Flutter (e.g. that the snapshot for the tree-shaker script was built with mixed-mode but then tried to run with non-mixed-mode dart?). Regardless, we're close. I'm planning to work with Kevin this week to get a repeatable setup for running mixed-mode with Flutter and then we'll be able to evaluate how much work is left before we can ship iOS beta (with mixed mode) and Shorebird 1.0 soon after. |
Confirmed we have mixed-mode running on iOS. The current mixed mode builds from a single dart binary, just runs half of the functions on the CPU and half of them in the Simulator. It's way faster than our simulator-only system was before. We still need to teach mixed mode about loading from two binaries (the original one + the changed one) and correctly running the unchanged parts on the CPU and the changed parts on the simulator. Lots of progress, but still several weeks to go before we'll be shipping iOS beta. |
Status: Kevin and I have started work on the next step, the "linker", which is where we teach the new "mixed-mode" runtime how to stitch the new "patch" compiled dart code with the existing/signed dart code (included in the ipa/apk). We made some good progress the last couple days, but are having to add/improve the static analysis tools in Dart for compiled Dart code (the current tools are pretty limited and only work on some platforms). Hopefully we'll be able to create a "link table" to pass off to mixed-mode within the next few days. Once that's done, we'll get to see mixed mode run with a real updated binary for the first time and know what its performance is going to look like (all signs are good so far). My current estimate is that the soonest we'll have iOS-beta out is probably 2 weeks from now. 🤞 After we have the linker working, we then should have a fully working system; it's just a question of how many more bugs we find once we're testing real flutter apps that will determine how long until we ship. |
Update: Sorry we had 3 of our 4 team members out sick for parts of last week. But we did make progress on iOS. We have what we believe to be a fully working "linker" which now knows how to analyze and understand not just function blocks but also direct jumps between those function blocks and produce a lookup table for the VM to use to know which function block sub-graphs it's safe to jump into on the CPU. We also realized that our approach of handling redirections in the "callee" with a function prologue was both insufficient for handling the indirect jump approach (we'll need to interpose indirect jumps with a stub of some form) as well as possibly superfluous once we fix indirect jumps to use a stub (since direct jumps are now covered by our linker). So we've backtracked a bit to add the indirect jump support, which Kevin is working on finishing hopefully today 🤞. Once that's done we just need to connect it to the linking work we already did and hope to see positive results. We're still trying to get iOS beta out this week (even though that's likely aggressive at this point). 🤞 |
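To make the "lookup table of safe sub-graphs" idea above more concrete, here is a minimal Python sketch under strong simplifying assumptions: it only models direct calls (indirect jumps, which the update says need a stub, are ignored), and it treats a function as safe to run on the CPU only if it is unchanged and everything it directly calls is also safe. The function name `cpu_safe_functions` and the dictionary-based call graph are hypothetical and are not taken from Shorebird's linker.

```python
def cpu_safe_functions(direct_calls, changed):
    """Return the functions whose whole direct-call sub-graph is unchanged.

    direct_calls: dict mapping function name -> list of directly called names.
    changed: set of function names modified by the patch.
    """
    # Collect every function mentioned as a caller or a callee.
    all_fns = set(direct_calls) | {c for cs in direct_calls.values() for c in cs}
    # Assume every unchanged function is safe, then repeatedly drop any
    # function that directly calls something unsafe, until nothing changes.
    safe = all_fns - set(changed)
    progress = True
    while progress:
        progress = False
        for fn in sorted(safe):
            if any(callee not in safe for callee in direct_calls.get(fn, ())):
                safe.discard(fn)
                progress = True
    return safe


# Example call graph: "b" calls a changed function, so "b" and its
# caller "main" are excluded, while "a" and "leaf" remain CPU-safe.
graph = {
    "main": ["a", "b"],
    "a": ["leaf"],
    "b": ["changed_fn"],
    "leaf": [],
    "changed_fn": [],
}
table = cpu_safe_functions(graph, {"changed_fn"})
```

Because the sketch starts from "everything unchanged is safe" and only removes entries, mutually recursive unchanged functions stay in the table as long as none of them reaches a changed function.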
The current plan is to ship this new "mixed mode" Dart runtime as part of an "iOS Beta" for Shorebird as soon as possible. The reason to label it Beta is that swapping out Dart's runtime is likely to be a change that will be hard for us to fully validate on our own (even with all the various Flutter tests!), and we may need responses from customers to know if we've covered enough cases to be fully "1.0". The intent is to have "iOS Beta" out for a few weeks while we get any initial feedback on the new runtime and we continue to make improvements to the runtime. In our early testing, "mixed mode" was way faster than our simulator-only current ios-alpha, but we won't have great performance data until we test it against full applications with updates, so there are likely still improvements we'll make to it after shipping. In any case, this is all to say, we're targeting: iOS Beta - ideally this week, likely will slip a week or two. iOS Alpha is out today and numerous companies are shipping with it. The major caveat being that iOS alpha uses a simulator-only runtime, which is slower than necessary (this bug). |
Update: We've started integrating our "mixed" branch into the "main" of our Dart fork and will be rolling it out to customers in stages. (We've landed about 1/3rd of the work from our "mixed" branch, about 2/3rds remaining which is what we're focused on this week.) The first stage is this week we will be shipping the bulk of our "mixed" (new fast mode) code, except without any of the "go fast" bits. This means we will be teaching the Dart VM how to jump back and forth between the CPU and Simulator, just always leaving it running on the Simulator for now. The performance will be un-changed for iOS, but the internals of how it's executing will be changed. So we will still call it "ios-alpha" even though we've converted the internals to work more like we expect the Dart VM to work going forward. Assuming that goes well, we will start turning on our new optimizations next week. As soon as we have enough optimizations turned on to see user-measurable performance change we will do the official "ios-beta" release and assuming that goes well follow soon with 1.0. |
We successfully finished integrating the "mixed" branch into our main Dart fork and we now have the necessary bots and testing set up to ensure our fancy new Dart passes all the same tests on all the same platforms the unmodified Dart VM does. My hope is for us to ship this new Dart VM as iOS-alpha later this week/early next which will not be any faster (we haven't turned on the jumping between CPU and simulator yet), but it will help us know that all our infrastructure is stable. While that's going on Kevin will be working on turning on the cpu/simulator jumping "mixing" and then there will be a phase of tuning. We also set up performance benchmarks today so we'll easily see the improvements as we roll them out. A lot is happening. Hope to have builds to share soon! |
I'm excited to share that we've successfully shipped the bulk of our "mixed mode" code in https://github.com/shorebirdtech/shorebird/releases/tag/v0.17.4. The "mixing" (jumping between the CPU and simulator) is still not turned on in that build, but the infrastructure is there. We added a ton of testing across the dart vm, flutter engine, and flutter framework (as well as some additional end-to-end) testing as part of releasing that. The important part is that now that this is shipped, it's very easy for us to make the changes necessary in a low-risk way to tune the "mixing" (and thus performance) of iOS builds going forward. It was many months to get here, but I'm hopeful that we're only a couple weeks away now from having our new faster "mixing" turned on by default. That's what Kevin and I are focused on right now and I hope to have good news to report soon. 🤞 |
HUGE progress so far this week. We have a working linker and a working mixed-mode landed on the main branch of our Dart fork. We've not yet finished all the work to integrate with Flutter and our Shorebird tooling, but we can start that soon. Flutter 3.16.0 and Dart 3.2.0 shipped today which is slightly unfortunate timing. I plan to upgrade us to both later this week (I expect the Dart upgrade may take a bit due to the changes in our fork). #1501. Once that's done we have some tuning to do to our current mixed mode before we'll call it Beta, but I'm hopeful we can release a build including this code in the next few days/week. |
Update: The entire company is working full time on completing this. So far we've spent this week bringing Felix and Bryan up to speed on the Dart VM internals. We've fixed a few bugs, but not yet fixed the two (known) crashers. I'm still hopeful we can ship the first version of a runtime with linking enabled and in use in all apps (the runtime has already shipped, but with linking disabled) as soon as tomorrow, but I won't know until we get past these blocking bugs. 🤞 |
Update: We got mixed mode fully working and passing the Dart tests tonight! The team is now up-to-speed on Dart VM hacking and has made great progress in the last week. The plan remains as above: we will roll this new "mixed mode" version of Dart into our Flutter tomorrow and assuming things go well will release that. That version will not be faster than current iOS Alpha, but will have all (including the linking and mixing) features of our new engine enabled (just turned down to their lowest settings). Assuming that goes OK we will then take a brief pause from the iOS work to upgrade to Flutter 3.16.x / Dart 3.2.x as a base revision later this week. We're then very excited to finally start turning the speed dials up and release iOS Beta, probably early next week! 🤞 Thank you all for your patience! |
Update: We continue to make progress, and continue to learn that there is more to do. 2 days ago we tried mixed mode in Flutter and found it was never calling main(). We finally fixed that tonight after a long investigation. But we also set up a ton more testing for the Dart VM (both release and debug, both mixed-mode and not) and have found we have a couple GC issues with mixed-mode to solve as well. We also walked through all USING_SIMULATOR blocks in the dart vm tonight and found there are probably a few more of those we need to teach about the possibility of Dart state sometimes being on the CPU instead of in the Simulator. We are definitely getting closer, but as the 80/20 rule says, the last 20% is 80% of the work and we keep finding more work. We may take a pause from mixed-mode/iOS work to update to Dart 3.2 / Flutter 3.16.x and then resume. Will discuss with the team tomorrow. I don't have a shipping estimate at this point. I suspect we're only days away from shipping with --mixed-mode on, but given I've been saying that for several weeks now I can't really say how many more days. Soon. 🤞 |
Update: We accepted a 2-3 day delay to get 3.16.x released in 0.19.0: #1501, as well as to pay down some of our technical debt regarding Dart VM testing. (We set up several more test builders as well as did some work to figure out exactly what "passing" looks like, since the Dart VM tests don't "pass" in a default checkout from GitHub and require google-internal configuration information to know which tests are supposed to pass vs. fail 🤦, an issue we can work with the Dart team to fix at a later time, for now we've run the 3.2.3 test ourselves and generated new baselines which we are using to figure out which tests --mixed-mode has broken and which it hasn't.) Our last several releases have included all of our mixed-mode code, just off by default. I'm still hopeful we will ship mixed-mode (our new iOS runtime) by the end of the year. 🤞 With our new Dart testing baselines, we see about ~20 tests failing still in mixed mode, which is probably only ~5 bugs. Unfortunately these bugs are often quite hard, and may take a day or more each to solve, but at least we know exactly what we have left to fix now. I still have no firm ship date. Thanks for your continued patience. |
Update: We shipped at the end of last year! It's not fast (yet), but we shipped our fancier engine in 0.20.0 last month (sorry, I was very busy after and failed to update since). We've since turned up the speed dials on our local machines (by allowing the linker to link functions which jump from the CPU back to the simulator, in addition to the current shipped version which just links functions which make no calls to other functions) and are working through the resulting issues before releasing. We should have another release out this week which uses much less memory, and hopefully a release which is faster in the next week or two. It's still probably a few weeks before we can declare it "beta" or "stable", but it's hard to know since it's always hard to see past the next crasher. |
Title changed from "shorebird patch ios builds run about 100x slower than necessary" to "shorebird patch ios-alpha builds run slower than android builds"
Update: We're down to one remaining failure across the entire Dart sdk testing suite (due to handling of exceptions during nested calls between cpu and simulator code) when using our improved linker (allows re-use of more code when patching). We've also fixed #675 and expect to release another major update to ios-alpha this coming week 🤞 . We haven't yet done performance testing, but we're still only linking ~80% of the binary in the zero-changes case (should be 100%) and < 20% of the binary in the small-changes case (should be 80%+), so we still have more work to do on the linker. We believe the bulk of the work is done for the new runtime however. We'll know more in the coming week(s). |
Update: We resolved all known crashers from last week and believe we're very close to resolving this now. We have a release (0.23.0) going out probably tomorrow that makes some of our benchmarks 10-50x faster. We're still not consistently "linking" as much code as we'd like and thus running as much code as we'd like on the CPU, so we have quite a bit more we can improve across our benchmarks (and for real-world apps). |
We just released 0.24.0, which runs iOS patches about 2x as fast as 0.23.0. 0.24.0 makes the Shorebird linker (the program that determines what parts of a patch can be run on the CPU vs. interpreter) resilient to functions moving within the Dart snapshot. Our current linker metrics:
We have one more big fix to make in the linker, which is making it resilient to dart objects changing their "index" in the snapshot, which happens any time an object (function, string, constant, type, class, etc.) is added or removed in the patch. We have a plan to fix, and that should make our "patch with small changes" run 90% or more on the CPU (nearly as fast as an unmodified release). We think what we have in 0.24.0 is good to call beta which we hope to release later this week (after checking with early adopters to make sure we've fixed any remaining bugs found in our new iOS engine). We hope to follow up with the above-mentioned heap-index fix for the linker in the following weeks and then we can finally close this bug! 🚀 |
Today we released 0.24.1, which makes |
Title changed from "shorebird patch ios-alpha builds run slower than android builds" to "shorebird patch ios builds run slower than android builds"
Just released beta! 🥳. 0.25.0. We've started the process of fixing our remaining 1.0 blockers, including what we believe to be the last bug in our linker causing us to only run 30% of code on the CPU for most Flutter changes (and thus having them be much slower than needed). |
The week after we released beta we spent chasing several crashers in our new iOS engine. Those were all fixed by last Friday, but we were not able to release due to one of the crashers requiring an upstream Dart change to resolve. Dart 3.3.0 had the change we needed, so we released Shorebird 0.26.0 today with the final iOS crasher fix (that we're aware of). We're now back to finishing this bug for 1.0 which we hope to resolve in the next couple weeks. Felix is actively working on the remaining fix (making pool pointer offsets stable between the base and patch even in the face of objects being added/removed from the object pool). I expect it will take us a week or two to fix that bug and then we'll launch 1.0. 🤞 I don't expect to update this issue until that's resolved. Edit: #1665 is the issue we're using to track the pool-pointer offset stability issue. |
Any update? |
Update: We continue to work on a fix for the last big iOS speed bug (the "object pool index" stability bug). We coded up one proposed fix this week, but it ended up being wrong. Felix and I came up with a new plan on Friday which we'll be attempting tomorrow. I'm pretty optimistic this next approach will work. If it does, we will plan to release the fix this week and hopefully even be able to call it 1.0. |
Looking forward to version 1.0 being released soon. In addition, we have a requirement: for company security reasons, we would like to host patches on our own server. Is this supported? If not, can you suggest a solution? |
@fengnudexiaoniao As I understand it, this unfortunately is not a priority for the Shorebird team at the moment; these iOS issues need to be solved first. However, it is in scope, as it's a common feature request; see #485 for the associated issue. |
As @sbatezat mentions, this bug (about iOS speed) is where we're spending all our time at the moment (and is our only remaining issue blocking our 1.0 release). We don't currently offer self-hosting, #485 is the issue tracking such. |
Update: Resolving this bug continues to be our entire focus. My infrequent updates have not been due to lack of interest, rather just we've been writing code all day. 🤣 In the last two weeks we built a whole new linking pipeline to cope with object pool indexes shifting between a release and its patch. Say you had this code before the patch:
main() {
  print("world"); // object_pool[10]
}
And changed it to this after:
main() {
  print("hello"); // object_pool[10]
  print("world"); // object_pool[11]!
}
In the before case, "world" might have an object pool offset of 10, but in the after case "world"'s offset would change to 11. Meaning both that any code which references offset 10 we can't use (it would get "hello" instead of "world") and any code which is trying to reference "world" would now encode the value "11" somewhere in the compiled code and thus be slightly different from otherwise identical code pre-patch. What's worse is that this doesn't just affect "hello" and "world" in your program, but every constant after the one inserted or removed changes index (including ones in otherwise unrelated files, functions, packages, etc.), which means all code seen by the compiler after the constant change can no longer link. Since the big trick that Shorebird uses to make patches fast is re-using as much code as possible from the release binary, object pool index changes like these mean we can re-use less code, and thus patches run slower. The new-new linker we wrote in the last 2 weeks is smart enough to sort the object pool during patch compile time to make it look as identical as possible to the unpatched object pool, thus making the patch's code maximally similar to the unpatched code and thus allowing us to use as much of the unpatched code as possible (and thus having patches run super fast). Unfortunately working in this part of the Dart compiler is incredibly complicated as the code is not well documented and quite fragile. However after ~2 weeks of work, we have a thing which works now and only fails about 50 more tests (out of ~10k). We expect we should have those few remaining tests fixed early next week and then we will know if this gets us the speed we expect, or if there is more work to do. The early signs we got when starting this object-pool aware linker were that fixing this object pool sorting would move us from linking (and thus re-using) ~30% of a test to linking 99% of a test.
We did find some tests which didn't move to 99%, so I still have some potential concern there may be another bug we have to fix after this one before patches run full-speed all the time, but we'll know very soon. We may also choose to fix #1731 before we call this 1.0 (shouldn't be very hard) as several of our beta users have run into that. Regardless I remain (possibly naively) hopeful that we're only another ~week or two away from this bug being resolved and shipping 1.0. 🤞 |
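The object pool sorting idea described in the update above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: constants are modelled as plain strings, constants the patch no longer needs leave a `None` hole so later indices do not shift, and new constants are appended at the end. The name `stabilize_pool` is hypothetical and the real Dart compiler data structures are certainly more involved.

```python
def stabilize_pool(base_pool, patch_constants):
    """Order the patch's pool so constants shared with the base keep their index.

    base_pool: constants in the release's object pool, in index order.
    patch_constants: constants the patch needs, in the order the compiler saw them.
    Returns a list; None marks a slot whose base constant the patch no longer uses.
    """
    needed = set(patch_constants)
    # Keep every base slot in place so indices baked into release code stay valid.
    pool = [c if c in needed else None for c in base_pool]
    in_base = set(base_pool)
    # Constants that are new in the patch go after all of the base slots.
    for c in patch_constants:
        if c not in in_base and c not in pool:
            pool.append(c)
    return pool


# The "hello"/"world" example: ten unrelated constants, then "world" at index 10.
base = [f"c{i}" for i in range(10)] + ["world"]
# The patch compiler encounters "hello" before "world".
patch = base[:10] + ["hello", "world"]
stable = stabilize_pool(base, patch)
```

With this ordering "world" stays at index 10 and "hello" lands at index 11, so release code that loads `object_pool[10]` still gets "world" and can be re-used unchanged.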
Update: 😮‍💨 We've finally solved the ObjectPool sorting issue (described in my last post). We also did an audit of remaining issues impacting linking and found two more, one of which blocks 1.0 (and should be easy to fix) and one of which will not (and we haven't evaluated how hard it will be to fix yet). So I think we're down to our last few days. We have 4 bugs left on our burn down which we're tracking on our project board and I expect for us to be able to resolve this bug next week! 🤞 |
🎉 Today is the day. We just released Shorebird 0.28.0 which includes fixes to make most iOS patches run as fast as they would on Android. Which means that most apps run just as fast after patching as they did before patching. There are still a few cases where our fancy new iOS "linker" isn't able to figure out how to run as much code on the CPU as it should. But those I think are best tracked as separate remaining bugs rather than keeping this one open. e.g. #1825. We did the huge architectural lift to make this possible, and I think what remains now are "just bugs". Thank you all for your patience these many months! Please give 0.28.0 a try on iOS and let us know if you hit any issues. We're otherwise preparing for 1.0 next week. #1731 is our one remaining iOS blocker. |
In our testing, some applications seem to run about 2-10x slower. ios-alpha only makes the Dart portion slower, and Dart is only one portion of what any given application does (it runs a lot of C++ code etc). Our internal benchmarks show that some portions of Dart run up to 100x slower when run in our interpreter (our original ios-alpha runtime from July 2023).
Update, Jan 2024: We've built a new Dart runtime, "mixed mode", which knows how to execute unchanged functions on the CPU and changed ones in an interpreter. This has been released as ios-alpha as of Dec 2023. We're currently in the process of stabilizing this new runtime as well as turning up the speed. The initial release is limited to "linking" only "leaf" functions (functions which do not call any other functions) and runs at similar speeds to the original ios-alpha runtime; we expect we can make it 10x to 100x faster in the coming weeks.