Emscripten ABI compatibility checks for side modules? #15917
Comments
As of now we don't have any guarantees like that. Would it be possible to treat the entire emscripten version as the ABI version for now? Or perhaps that would be too limiting? Perhaps it would work if you selected just a few specific emscripten releases to support? Can you explain the use case a little more? Is the idea that folks who use the emscripten version of python would be able to download pre-built side modules? Is there some way to disable pre-built binaries and have the side module always compile from source instead? Or might these users not even have emscripten itself installed? |
This is still in an exploratory phase, so we don't have a completely clear set of requirements (and other people involved might have slightly different ideas about what our requirements are).
Yeah exactly. The way Pyodide currently works is that we build a bunch of side modules with Python packages in tree with the Python interpreter. The way the broader Python system works is that different people build wheels for different targets -- packages build and distribute their own wheels, wheels for Raspberry Pi are built by people other than the original author and distributed on https://www.piwheels.org/, etc. In order to try to make this work, wheels are tagged with some platform information which tries to indicate which systems they can run on. The Python people suggest that in the long term we could add Emscripten as a supported wheel platform, but we would need to define the compatibility more precisely. Ideally we should discuss our plans with the Emscripten team before doing that.
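To make the platform-tag idea concrete, here is a minimal Python sketch of how an installer might decide whether a wheel can be loaded. The tag layout `emscripten_<major>_<minor>_<patch>_<arch>` and both function names are assumptions made for illustration, not an agreed standard. The check requires an exact version match, which mirrors the "treat the entire emscripten version as the ABI version" suggestion above.

```python
import re
from typing import NamedTuple, Optional, Tuple


class EmscriptenTag(NamedTuple):
    version: Tuple[int, int, int]
    arch: str


# Hypothetical tag layout: emscripten_<major>_<minor>_<patch>_<arch>
_TAG_RE = re.compile(r"^emscripten_(\d+)_(\d+)_(\d+)_([A-Za-z0-9]+)$")


def parse_platform_tag(tag: str) -> Optional[EmscriptenTag]:
    """Return the parsed tag, or None if it is not an Emscripten tag."""
    match = _TAG_RE.match(tag)
    if match is None:
        return None
    major, minor, patch, arch = match.groups()
    return EmscriptenTag((int(major), int(minor), int(patch)), arch)


def wheel_is_compatible(wheel_tag: str, runtime_tag: str) -> bool:
    """Accept a wheel only if version and architecture match exactly."""
    wheel = parse_platform_tag(wheel_tag)
    runtime = parse_platform_tag(runtime_tag)
    if wheel is None or runtime is None:
        return False
    return wheel == runtime
```

A looser policy would only need to change the final comparison, which is why the parsing is kept separate from the decision.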
End users probably don't have Emscripten and won't rebuild packages. Many people making applications based on Pyodide have a webdev background and struggle with a complicated and unfamiliar toolchain like Emscripten, though if we can make the tools easy to install and use then more people will manage.
Yeah, these are certainly options. I guess we have a pretty different use case from typical Emscripten users who are porting fixed applications like games and don't have any need for side module dependency management. |
Don't get me wrong, a stable ABI for dynamic linking is certainly something we would like to have in the future. It could be that this use case is compelling enough to try to make progress on it. |
Hi, The lack of clear ABI compatibility between Emscripten versions is very problematic for us. Our product is game middleware that is pluggable in various game engines like Unity, which supports the Web platform via Emscripten. I'll direct you to Unity's documentation page on external native plug-ins, which is what our product is: https://docs.unity.cn/2023.3/Documentation/Manual/webgl-native-plugins-with-emscripten.html In particular, this line in their documentation explains the problem clearly:
Since we distribute pre-compiled binaries for our plug-in, we must provide binaries that are supported across all maintained Unity versions. Each Unity version uses a different version of Emscripten. Because ABI compatibility is not clearly defined between Emscripten versions, it is difficult for us to determine which version of Emscripten we should use to support Unity. We got bitten by this recently when we upgraded to a new minor version of Emscripten, 3.1.52, and then ran into #20233 when integrating into Unity because they use 3.1.38. I humbly suggest the Emscripten project revise its approach to versioning to more clearly advertise when ABI compatibility is broken. |
@akpmilot, in a world where we were able to detect ABI breakages and document them, presumably you would still need a process for updating your libraries to match Unity's versions, right? With that process in place, can you not work with the current status quo, which is basically that each emscripten release should be considered an ABI-breaking release? i.e. is this just an issue of degree? You can publish new versions of your library for emscripten releases but you would rather not do it for all releases? |
Honestly it's hard for me to imagine very many emscripten releases not breaking some kind of ABI, especially when you consider that native object files and libraries can contain JS code which can refer to the entire JS library code in emscripten. Any change to emscripten's JS library code could conceivably be an ABI breakage in that case. |
Would it help if we somehow embedded the emscripten version in each object file or shared library so that linking objects from different versions would cause an error? At least that might help the problem show up quicker perhaps? |
I think that would be helpful. Currently if you load a wrong shared library sometimes the only sign is a "memory access out of bounds error". |
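As a rough illustration of the version-stamp idea, the Python sketch below reads custom sections from two WebAssembly binaries and fails loudly if their stamps differ, instead of leaving the mismatch to surface later as an out-of-bounds memory access. The section name `emscripten_version` is hypothetical (Emscripten does not emit such a section today), and `make_stamped_module` exists only to build a toy binary for demonstration.

```python
from typing import Dict, Tuple

WASM_MAGIC = b"\x00asm"
# Hypothetical custom section name; an assumption made for this sketch.
VERSION_SECTION = "emscripten_version"


def read_uleb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def custom_sections(module: bytes) -> Dict[str, bytes]:
    """Map each custom section name (section id 0) to its payload."""
    if module[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary")
    sections: Dict[str, bytes] = {}
    pos = 8  # skip the 4-byte magic and the 4-byte format version
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        end = pos + size
        if section_id == 0:
            name_len, name_start = read_uleb128(module, pos)
            name_end = name_start + name_len
            name = module[name_start:name_end].decode("utf-8")
            sections[name] = module[name_end:end]
        pos = end
    return sections


def check_same_toolchain(main_module: bytes, side_module: bytes) -> None:
    """Raise RuntimeError unless both modules carry the same stamp."""
    main_v = custom_sections(main_module).get(VERSION_SECTION)
    side_v = custom_sections(side_module).get(VERSION_SECTION)
    if main_v is None or side_v is None:
        raise RuntimeError("module has no version stamp")
    if main_v != side_v:
        raise RuntimeError(
            f"version mismatch: main={main_v.decode()} side={side_v.decode()}"
        )


def make_stamped_module(version: str) -> bytes:
    """Build a toy binary holding only the stamp (payload under 128 bytes)."""
    name = VERSION_SECTION.encode("utf-8")
    payload = bytes([len(name)]) + name + version.encode("utf-8")
    return WASM_MAGIC + b"\x01\x00\x00\x00" + bytes([0, len(payload)]) + payload
```

For example, `check_same_toolchain(make_stamped_module("3.1.52"), make_stamped_module("3.1.38"))` raises a `RuntimeError` naming both versions.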
@sbc100 Publishing different binaries targeting different versions of Emscripten is a possibility. In fact, it is something we do for other platforms. For example, in the Apple ecosystem ABI breakage is allowed between major versions of Xcode. Therefore, we currently publish separate binaries for the Xcode 14.x and Xcode 15.x series. This works for platforms where ABI compatibility is clearly defined. However, if the definition of ABI breakage for Emscripten is "every release", we would not be able to do this. The reason is that we do not only support Unity games, but also games based on other engines, including custom ones for which we cannot predict the version of Emscripten being used. It's simply not feasible for us to publish binaries for every Emscripten release. On the subject of the actual definition of "ABI compatibility", you bring up a good point that Emscripten is quite special due to the ability to embed Javascript code in the object files. I think it would be reasonable to exclude the Javascript environment from the definition of ABI. After all, Javascript code is interpreted at runtime, not precompiled; one can always write embedded Javascript in a way that tests for the existence of functions and global objects to cover multiple variants of the environment. What if Emscripten continued to follow an X.Y.Z numbering scheme, but the native code ABI is maintained for a given X.Y, while the Javascript environment is allowed to change at every Z release? Would that make it more feasible? |
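The proposed X.Y rule can be written down in a few lines of Python. This sketches the suggestion only, not how Emscripten behaves today: under it, 3.1.38 and 3.1.52 would count as compatible, which is exactly the pair that broke in practice. Both function names are made up for illustration.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split an 'X.Y.Z' string into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def native_abi_compatible(a: str, b: str) -> bool:
    """Proposed rule: the native ABI is fixed for a given X.Y; Z may differ."""
    return parse_version(a)[:2] == parse_version(b)[:2]
```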
Can you elaborate a little on what makes this not feasible? Why is it feasible to do this once in a while but not for every release? Are there a lot of manual steps involved? If so I wonder if this could somehow be automated? Perhaps via github actions or some other mechanism? |
In such a scheme the "native code ABI" would also include all of libc and libc++ and all the other native libraries that get included in the main module (libwasmfs, libmalloc, libembind, etc, etc). So I think it might also be rare that a release doesn't include some change to those libraries. Remember that even an internal bug fix that doesn't change the interface can break users of the library who were somehow dependent on that bug. |
I've been thinking recently we should drop the X.Y.Z versioning completely and move to a simple XX version number like chrome or firefox. An interesting argument against compound versions: The last time we considered bumping the minor version we decided not to, since some folks like to be able to effectively measure time/distance between two versions using the X version alone. i.e. one can look at two versions and see roughly how far apart they are in time. e.g. Chrome 60 and Chrome 100 are clearly several years apart. |
Building our product takes about 4-5 minutes for Emscripten. We already produce two builds: one with pthreads, and one without. We also have three build configs: Debug, Profile, and Release, although our builds are parallelized so they don't strictly add up; total build time for all Emscripten artifacts in our nightly is about 12 minutes. The resulting archive is about 117 MB. If we were to support every Emscripten version between 3.1.38 (what Unity 2023 uses) and 3.1.8 (what Unity 2022 uses), we're already at 6 hours and 3.5 GB of artifacts. That's an unrealistic ask, considering 90% of these binaries will never be used by anybody.
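The totals quoted above follow from simple multiplication, sketched here in Python. It assumes that every patch number from 3.1.8 through 3.1.38 corresponds to a release and that builds for different versions run one after another. Both are assumptions for the estimate, not facts stated in the thread.

```python
# Figures taken from the comment above.
FIRST_PATCH = 8            # 3.1.8, quoted for Unity 2022
LAST_PATCH = 38            # 3.1.38, quoted for Unity 2023
MINUTES_PER_VERSION = 12   # nightly build time for all Emscripten artifacts
MB_PER_VERSION = 117       # archive size for one Emscripten version

# Assumption: one release per patch number, endpoints included.
versions = LAST_PATCH - FIRST_PATCH + 1
total_hours = versions * MINUTES_PER_VERSION / 60
total_gib = versions * MB_PER_VERSION / 1024

print(f"{versions} versions, {total_hours:.1f} hours, {total_gib:.1f} GiB")
```

Each additional supported release adds another 12 minutes and 117 MB, so the cost grows linearly with the number of Emscripten versions covered.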
I am not a toolchain developer, but as I understand it, these are the same concerns that other LLVM-based platforms like Xcode and the Android NDK deal with. Unreal Engine also offers some guarantees about ABI compatibility using their LLVM cross-platform toolchain: https://docs.unrealengine.com/5.3/en-US/linux-development-requirements-for-unreal-engine/
Regardless of the numbering scheme, have you considered producing less frequent "LTS" releases? Perhaps this could be the solution for projects like ours and Unity's that need to agree on a version to use. If we were to standardize on "LTS" releases that happen only at specific intervals (once a year, for example), it would simplify things. As middleware developers, it would be reasonable for us to simply state "we will only provide binaries for LTS releases of Emscripten" and ask downstream projects to use LTS as well. |
@akpmilot's situation is quite similar to what the Pyodide downstream faces. I would say that Emscripten ABI instability is the largest problem for the Python-Emscripten packaging ecosystem. Python itself has decided that having package authors compile against many different ABIs is untenable. manylinux and abi3 make it possible for packages to upload just one version for glibc linux and have it work for a long time. There is a perfectly viable way forward for us by only updating the Emscripten version when we also update the Python version. That way people only have one version per year. Unity could also consider yearly updates. But the combination of:
is difficult to deal with. If Emscripten either had better ABI compatibility between versions or used stable llvm, I think things would be much better for us. As it is, updating the Emscripten version once a year is a bit painful to consider. |
I'll take a look at that, thanks.
We don't currently have any concept of LTS releases. We support all our releases equally, but we also don't do any patching of existing releases. This sounds like something that could be worked out between you and your users. For example, you could say "we only support one in every 10 releases" (e.g. 3.1.10 and 3.1.20)? |
They probably will want to ask unity to do this since telling their users that their plugin will only work with certain versions of unity isn't ideal. |
The landscape in gaming is quite complex, it's not strictly a question of upstream/downstream users. You have game studios, game engine providers, middleware providers, and various platform toolchains. Each game project might have to juggle with a game engine AND 2-3 middleware plug-ins, all of which must agree on the same ABI for a deployment platform. |
Surely the game engine picks the ABI and the plugin developers go along with it? |
Most game middleware products support several game engines, as well as basic native libraries for use by custom game engines. For example, in addition to the 4 Emscripten versions required by the 4 different supported Unity versions, you also have Godot wanting 1.39.9 and Unreal Engine 4.26 wanting 1.38.31. This is why a clear stable ABI becomes necessary. Coming back to Apple's case, every party can agree that "Xcode 15" is the target, and libraries built with Xcode 15 can be linked together. |
There has been discussion with the CPython folks about trying to add wasm32-emscripten wheels to PyPI. For this, we would like to have some way to decide which wheels are compatible with which main modules. Looking around, I see there used to be EMSCRIPTEN_ABI_MAJOR and EMSCRIPTEN_ABI_MINOR but they were removed.
Suppose we compile a main module and a side module at separate times using possibly different versions of Emscripten. Is there any way to check whether they are compatible? Could such a feature be added? What will happen at load time if they are not compatible?