-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a release branch ABI stability regression test #87891
Comments
In order to automate prevention of ABI regressions in stable releases, we could create an automated ABI stability test generator and check the specific ABI test it generates into each specific release branch. I'm envisioning the main branch only having a code generator that creates such a test, and the release branches only having the output of that as Lib/tests/release_X_Y_ABI_stability_test.py and a policy of never updating that within a release branch without *extreme* attention to detail. Such updates wouldn't happen by default in our current workflow as they're unique and versioned in every release branch so automated backport PRs wouldn't touch them - leaving CI to run them and highlight failures on attempted backports to do inadvertently cause an ABI shift. |
This is probably complementary or in the avenue of https://www.python.org/dev/peps/pep-0652/ |
Indeed. In particular given the 3.9.3 issue I was assuming such a test should include asserting both the sizeof() ABI structs and offsetof() public members of all ABI structs. On each specific first class supported platform. This goes beyond what https://www.python.org/dev/peps/pep-0652/#testing-the-stable-abi currently states to check size and layout rather than just symbol presence. But seems to match the intent. |
We could have a buildbot using https://github.com/lvc/abi-compliance-checker |
For example, the tool generates this report for the two 3.9 versions (attached to the issue). |
Also, we can use libabigail. For instance: root@7a3947dec3d8:/pytho# abidiff Python-3.9.2/python Python-3.9.3/python 3 functions with some indirect sub-type change: [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:381:1 has some indirect sub-type changes: [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1684:1 has some indirect sub-type changes: [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:426:1 has some indirect sub-type changes: |
Ok, so seems that PR25188 works if the abi dump file for the "correct" version is generated with the same compiler that is used to check the ABI. I think this is acceptable if the workflow is:
We make the changes for 3.8 and 3.9 for the time being. |
Adding the rest of RM to evaluate the proposed solution |
Anything is better than nothing, from my POV. Let's get it running, tweak it, and if it doesn't work for us then take it down. |
Ok, will create PRs for the release branches that are receiving fixes |
RHEL uses that to provide ABI guarantees on the kernel and the glibc. |
Do we need separate jobs and ABI dumps for each platform and arch? I guess we need at least separate dumps for 32 and 64bit. |
Not really, check what happened in my 64 build system when I did the changes that broke the ABI in the latest 3.9 release: 7a3947dec3d8">root@7a3947dec3d8:/pytho# abidiff Python-3.9.2/python Python-3.9.3/python 3 functions with some indirect sub-type change: [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:381:1 has some indirect sub-type changes: [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1684:1 has some indirect sub-type changes: [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:426:1 has some indirect sub-type changes: |
Not sure what platforms libabigail works on, but the set of stable ABI symbols is platform-specific. Currently it's affected by the MS_WINDOWS and HAVE_FORK defines. |
I assume it's only doing source analysis, so we can probably force those flags on for the check? Or run multiple checks with the options, but without having to switch platform. Even under MS_WINDOWS, I hope we're not using declarations from the Windows header files in our own public API. If so, those are worth replacing (safely, over time). |
If I understand correctly, this is analyzing the DWARF information on the binaries, so is quite coupled to the platform you compile to, but you can detect violations that affect other platforms if they share the same code. |
Ideally we'd analyze for a representative set of major platforms we ship binaries on. 32-bit and 64-bit Linux are a start, but we should assume that at least windows and linux toolchains may be different and toss in an additional CPU architecture as default alignment constraints could differ. But getting to that level can remain future work... I'm happy libabigail exists! |
Yeah, I subscribe what Greg said. Let's start with *something* and let's improve upon. |
I started with two PRs against the 64 bits versions for 3.9 and 3.8. This *should* cover 32 bits as well (see previous messages), but if we want specific changes for that we need a 32 bit machine or cross-compilation, but I decided to start with something (that something being 64 bits) |
I got a false positive on my PR at https://github.com/python/cpython/pull/25318/checks?check_run_id=2308871807 1 Changed variable: [C]'const unsigned char[45154] const _Py_M__importlib_bootstrap_external' was changed to 'const unsigned char[43681] const _Py_M__importlib_bootstrap_external' at importlib_external.h:2:1: Is there an option to exclude array lengths? Or to treat the API as just a pointer rather than an array? |
Technically is not a false positive: But in this case I think it is. We can have a ignore file: https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppr-spec-label We could ignore all functions that start with _Py. |
Oh wow, that's terrible... yet another good reason not to export data values. But yeah, filtering on the name prefix should be fine. These aren't meant to be publicly accessible anyway. |
_Py_M__importlib_bootstrap_external symbol doesn't seem to be exported. Why is it seen as a public symbol? $ objdump -T /lib64/libpython3.10.so.1.0|grep _Py_M__importlib_bootstrap_external
# empty output |
Some symbols starting with _Py are indirectly part of the ABI. Example of Include/cpython/pyctype.h: #define Py_ISLOWER(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_LOWER)
PyAPI_DATA(const unsigned int) _Py_ctype_table[256]; Even if "_Py_ctype_table" is not directly part of the C API, it's technically part of the ABI. If tomorrow, _Py_ctype_table is truncated to 128 items, it would be an incompatible ABI change. |
In a Python stable version, I would suggest to only ignore an ABI change after a manual validation .Otherwise, we can miss real issues. Well, I expect that at the beginning, we will discover many issues like _Py_M__importlib_bootstrap_external ;-) |
Ok, then we need to revert by latest 2 PRs and add a new label or something to skip |
To skip ABI tests? I proposed to ignore ABI changes if they can be ignored safely. But I don't know right now which changes can be ignored or not. |
As the situation stands we need to choose on keeping my changes to ignore _Py function changes or revert it. Further than that is unexplored land |
This was happening on 3.8 Victor: root@d9c5942e274b:/src# objdump -T libpython3.8.so | grep _Py_M__importlib_bootstrap_external |
I am going to revert the 3.9 PR as 3.9+ is hiding symbols by default. |
Oh, this CI failure was on Python 3.8. Since Python 3.9, Python is now built with -fvisibility=hidden to avoid exporting symbols which are not explicitly exported. So in 3.9, we no longer have to ignore all symbols prefixed by "_Py". |
I cannot merge my PR 25685 because the mandatory ABI CI job fails: 1 function with some indirect sub-type change: [C]'function int _PyInterpreterState_IDIncref(PyInterpreterState*)' at pystate.c:497:1 has some indirect sub-type changes: It is correct that my PR changes an internal C API on purpose. Pablo suggests me to regenerate the ABI file but I don't know how to do that. In Python 3.9, the GitHub Action uses: I'm using Fedora 33 with "gcc (GCC) 11.0.1 20210324 (Red Hat 11.0.1-0)". On Fedora, I used: $ sudo dnf install -y libabigail
$ ./configure --enable-shared CFLAGS="-g3 -O0" && make -j10
$ make regen-abidump
$ git diff --stat
Doc/data/python3.9.abi | 28478 +++++++++++++++++++++++++++++++++ 1 file changed, 18306 insertions(+), 10172 deletions(-) There are tons of changes! Also, "make check-abidump" lists many changes: 13 functions with some indirect sub-type change: [C] 'function int _Py_DecodeLocaleEx(const char*, wchar_t**, size_t*, const char**, int, _Py_error_handler)' at fileutils.c:574:1 has some indirect sub-type changes: [C] 'const unsigned char _Py_ctype_tolower[256]' was changed to 'const unsigned char[256] const _Py_ctype_tolower' at pyctype.h:26:1: |
As I mentioned here: https://bugs.python.org/msg390213 the dump needs to be generated in a docker container using the same compiler version that is used in the CI |
You are using Fedora, which is not the same docker container and likely the same compiler version that is used to check the dump |
I'm not used to docker and I don't know how to get a docker similar than the one used by GitHub Action. Is there a documentation somewhere giving commands to get the docker image and how to run it? |
I will add something to the devguide explaining how to do it. In any case, the RM should be ideally the one doing this because every potential ABI breakage need to be supervised by the RM (even if is a false positive). |
Can this be closed now? Seems like this was implemented (came across this seeing that it apparently needs to be added for 3.12 just now). |
Backport the workflow change and fix-ups: - GH-92442 (e89c01e) - GH-94129 (0dadb22) - GH-98556 (194588d) Co-Authored-By: sterliakov <50529348+sterliakov@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com> Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: