Create a release branch ABI stability regression test #87891

Closed
gpshead opened this issue Apr 4, 2021 · 37 comments
Labels
3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir topic-C-API type-feature A feature request or enhancement

Comments

@gpshead
Member

gpshead commented Apr 4, 2021

BPO 43725
Nosy @gpshead, @vstinner, @tiran, @ned-deily, @encukou, @ambv, @zooba, @pablogsal
PRs
  • bpo-43725: Add CI step to check changes in the exported ABI #25188
  • [3.9] bpo-43725: Add CI step to check changes in the exported ABI #25230
  • [3.8] bpo-43725: Add CI step to check changes in the exported ABI #25232
  • [3.8] bpo-43725: Add ignore file for the abidump check #25322
  • [3.9] bpo-43725: Add ignore file for the abidump check #25323
  • [3.9] Revert "[3.9] bpo-43725: Add ignore file for the abidump check" #25394
  • Files
  • compat_report.html
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-04-04.04:35:13.558>
    labels = ['build', '3.8', '3.9', 'expert-C-API', 'type-feature', 'tests', 'interpreter-core', '3.10']
    title = 'Create a release branch ABI stability regression test'
    updated_at = <Date 2021-10-15.14:16:28.469>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2021-10-15.14:16:28.469>
    actor = 'pablogsal'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Build', 'Interpreter Core', 'Tests', 'C API']
    creation = <Date 2021-04-04.04:35:13.558>
    creator = 'gregory.p.smith'
    dependencies = []
    files = ['49935']
    hgrepos = []
    issue_num = 43725
    keywords = ['patch']
    message_count = 36.0
    messages = ['390173', '390184', '390203', '390205', '390207', '390208', '390213', '390214', '390310', '390311', '390312', '390313', '390315', '390316', '390319', '390325', '390372', '390374', '390382', '390660', '390668', '390670', '390985', '390986', '390987', '390994', '391007', '391008', '391009', '391010', '391011', '392205', '392207', '392208', '404019', '404022']
    nosy_count = 8.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'christian.heimes', 'ned.deily', 'petr.viktorin', 'lukasz.langa', 'steve.dower', 'pablogsal']
    pr_nums = ['25188', '25230', '25232', '25322', '25323', '25394']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue43725'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @gpshead
    Member Author

    gpshead commented Apr 4, 2021

    In order to automate prevention of ABI regressions in stable releases, we could create an automated ABI stability test generator and check the specific ABI test it generates into each specific release branch.

    I'm envisioning the main branch only having a code generator that creates such a test, and the release branches only having the output of that as Lib/test/release_X_Y_ABI_stability_test.py and a policy of never updating that within a release branch without *extreme* attention to detail.

    Such updates wouldn't happen by default in our current workflow, as the files are unique and versioned in every release branch, so automated backport PRs wouldn't touch them. That leaves CI to run them and highlight failures on attempted backports that inadvertently cause an ABI shift.

    @gpshead gpshead added 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes build The build process and cross-build interpreter-core (Objects, Python, Grammar, and Parser dirs) tests Tests in the Lib/test dir topic-C-API type-feature A feature request or enhancement labels Apr 4, 2021
    @pablogsal
    Member

    This is probably complementary to, or along the lines of, https://www.python.org/dev/peps/pep-0652/

    @gpshead
    Member Author

    gpshead commented Apr 4, 2021

    Indeed. In particular, given the 3.9.3 issue, I was assuming such a test should assert both the sizeof() of all ABI structs and the offsetof() of their public members, on each first-class supported platform.

    This goes beyond what https://www.python.org/dev/peps/pep-0652/#testing-the-stable-abi currently states, checking size and layout rather than just symbol presence, but it seems to match the intent.
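
A minimal sketch of the kind of size-and-offset check described above, using Python's ctypes. The two structures are hypothetical, heavily simplified stand-ins (a 36-byte opaque prefix followed by a few fields), not the real thread-state struct, and the helper names `layout` and `moved_fields` are invented for illustration.

```python
import ctypes


class Before(ctypes.Structure):
    # Hypothetical stand-in: 36 opaque bytes, then two chars and an int.
    _fields_ = [
        ("prefix", ctypes.c_char * 36),
        ("overflowed", ctypes.c_char),
        ("recursion_critical", ctypes.c_char),
        ("stackcheck_counter", ctypes.c_int),
    ]


class After(ctypes.Structure):
    # Same stand-in after the first char is widened to an int and renamed.
    _fields_ = [
        ("prefix", ctypes.c_char * 36),
        ("recursion_headroom", ctypes.c_int),
        ("recursion_critical", ctypes.c_char),
        ("stackcheck_counter", ctypes.c_int),
    ]


def layout(struct_type):
    """Map each field name to its byte offset within struct_type."""
    return {name: getattr(struct_type, name).offset
            for name, *_ in struct_type._fields_}


def moved_fields(old, new):
    """Return {name: (old_offset, new_offset)} for fields present in both
    structs whose offset differs."""
    old_layout, new_layout = layout(old), layout(new)
    return {name: (old_layout[name], new_layout[name])
            for name in old_layout
            if name in new_layout and old_layout[name] != new_layout[name]}


if __name__ == "__main__":
    for name, (was, now) in moved_fields(Before, After).items():
        print(f"{name}: offset {was} -> {now} bytes")
```

On a platform where `int` is 4 bytes with 4-byte alignment, the two trailing fields move from 37 to 40 bytes and from 40 to 44 bytes. A generated test could pin such offsets per platform and fail when a backport shifts them.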

    @pablogsal
    Member

    We could have a buildbot using https://github.com/lvc/abi-compliance-checker

    @pablogsal
    Member

    For example, the tool generates this report for the two 3.9 versions (attached to the issue).

    @pablogsal
    Member

    Also, we can use libabigail. For instance:

    root@7a3947dec3d8:/pytho# abidiff Python-3.9.2/python Python-3.9.3/python
    Functions changes summary: 0 Removed, 3 Changed (53 filtered out), 0 Added functions
    Variables changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added variable

    3 functions with some indirect sub-type change:

    [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:381:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    4 data member changes (2 filtered):
    'char _ts::recursion_critical' offset changed from 296 to 320 (in bits) (by +24 bits)
    'int _ts::stackcheck_counter' offset changed from 320 to 352 (in bits) (by +32 bits)
    'int _ts::tracing' offset changed from 352 to 384 (in bits) (by +32 bits)
    'int _ts::use_tracing' offset changed from 384 to 416 (in bits) (by +32 bits)
    1 data member change:
    type of 'char _ts::overflowed' changed:
    type name changed from 'char' to 'int'
    type size changed from 8 to 32 (in bits)
    and name of '_ts::overflowed' changed to '_ts::recursion_headroom' at pystate.h:61:1

    [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1684:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    no data member changes (6 filtered);
    no data member change (1 filtered);

    [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:426:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    no data member changes (6 filtered);
    no data member change (1 filtered);
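
A report like the one above could be summarized mechanically, for example in a CI step. The following is only a sketch that parses the two summary lines with a regular expression. The function name `parse_summary` is hypothetical, and the pattern assumes the wording shown in this output.

```python
import re

# Matches abidiff's summary lines; the "(N filtered out)" part is optional.
SUMMARY_RE = re.compile(
    r"^\s*(?P<kind>Functions|Variables) changes summary: "
    r"(?P<removed>\d+) Removed, "
    r"(?P<changed>\d+) Changed(?: \((?P<filtered>\d+) filtered out\))?, "
    r"(?P<added>\d+) Added"
)


def parse_summary(report):
    """Extract the per-kind counters from abidiff's summary lines."""
    counts = {}
    for line in report.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            kind = match.group("kind").lower()
            counts[kind] = {
                key: int(match.group(key) or 0)
                for key in ("removed", "changed", "filtered", "added")
            }
    return counts


SAMPLE = """\
Functions changes summary: 0 Removed, 3 Changed (53 filtered out), 0 Added functions
Variables changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added variable
"""

if __name__ == "__main__":
    print(parse_summary(SAMPLE))
```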

    @pablogsal
    Member

    OK, so it seems that PR 25188 works if the ABI dump file for the "correct" version is generated with the same compiler that is used to check the ABI. I think this is acceptable if the workflow is:

    • As soon as a version is released, we generate in the stable release branch the dump using some docker container with the same compiler as the GitHub CI.

    • We enable the ABI check only for release branches.

    • We should make the check mandatory because release managers cannot be checking all backport PRs unfortunately.

    We would make the changes for 3.8 and 3.9 for the time being.
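
As a rough sketch of how a mandatory check could interpret the tool's result: per the libabigail manual, abidiff's exit status is a bit field (1 = error, 2 = usage error, 4 = ABI change, 8 = incompatible ABI change). The function names `classify` and `check_abi`, and the way the two paths are passed, are hypothetical and not taken from the actual CPython workflow.

```python
import subprocess

# Exit-status bits documented for abidiff.
ABIDIFF_ERROR = 1
ABIDIFF_USAGE_ERROR = 2
ABIDIFF_ABI_CHANGE = 4
ABIDIFF_ABI_INCOMPATIBLE_CHANGE = 8


def classify(returncode):
    """Turn an abidiff exit status into a coarse verdict string."""
    if returncode & (ABIDIFF_ERROR | ABIDIFF_USAGE_ERROR):
        return "tool-error"
    if returncode & ABIDIFF_ABI_INCOMPATIBLE_CHANGE:
        return "incompatible-change"
    if returncode & ABIDIFF_ABI_CHANGE:
        return "change"
    return "unchanged"


def check_abi(library, reference_dump):
    """Run abidiff (which must be installed) and classify the outcome.

    Both arguments are paths chosen by the caller; nothing here is
    specific to CPython's build layout.
    """
    proc = subprocess.run(
        ["abidiff", library, reference_dump],
        capture_output=True,
        text=True,
    )
    return classify(proc.returncode), proc.stdout
```

A release-branch job could fail on anything other than `"unchanged"` and leave the decision about harmless changes to the release manager.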

    @pablogsal
    Member

    Adding the rest of the release managers (RMs) to evaluate the proposed solution.

    @zooba
    Member

    zooba commented Apr 6, 2021

    Anything is better than nothing, from my POV. Let's get it running, tweak it, and if it doesn't work for us then take it down.

    @pablogsal
    Member

    Ok, will create PRs for the release branches that are receiving fixes

    @vstinner
    Member

    vstinner commented Apr 6, 2021

    > Also, we can use libabigail.

    RHEL uses that to provide ABI guarantees on the kernel and the glibc.

    @tiran
    Member

    tiran commented Apr 6, 2021

    Do we need separate jobs and ABI dumps for each platform and arch? I guess we need at least separate dumps for 32 and 64bit.

    @pablogsal
    Member

    > Do we need separate jobs and ABI dumps for each platform and arch? I guess we need at least separate dumps for 32 and 64bit.

    Not really. Check what happened on my 64-bit build system when I made the changes that broke the ABI in the latest 3.9 release:

    root@7a3947dec3d8:/pytho# abidiff Python-3.9.2/python Python-3.9.3/python
    Functions changes summary: 0 Removed, 3 Changed (53 filtered out), 0 Added functions
    Variables changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added variable

    3 functions with some indirect sub-type change:

    [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:381:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    4 data member changes (2 filtered):
    'char _ts::recursion_critical' offset changed from 296 to 320 (in bits) (by +24 bits)
    'int _ts::stackcheck_counter' offset changed from 320 to 352 (in bits) (by +32 bits)
    'int _ts::tracing' offset changed from 352 to 384 (in bits) (by +32 bits)
    'int _ts::use_tracing' offset changed from 384 to 416 (in bits) (by +32 bits)
    1 data member change:
    type of 'char _ts::overflowed' changed:
    type name changed from 'char' to 'int'
    type size changed from 8 to 32 (in bits)
    and name of '_ts::overflowed' changed to '_ts::recursion_headroom' at pystate.h:61:1

    [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1684:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    no data member changes (6 filtered);
    no data member change (1 filtered);

    [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:426:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
    in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
    underlying type 'struct _ts' at pystate.h:51:1 changed:
    type size hasn't changed
    no data member changes (6 filtered);
    no data member change (1 filtered);

    @encukou
    Member

    encukou commented Apr 6, 2021

    Not sure what platforms libabigail works on, but the set of stable ABI symbols is platform-specific. Currently it's affected by the MS_WINDOWS and HAVE_FORK defines.

    @zooba
    Member

    zooba commented Apr 6, 2021

    I assume it's only doing source analysis, so we can probably force those flags on for the check? Or run multiple checks with the options, but without having to switch platform.

    Even under MS_WINDOWS, I hope we're not using declarations from the Windows header files in our own public API. If so, those are worth replacing (safely, over time).

    @pablogsal
    Member

    If I understand correctly, this analyzes the DWARF information in the binaries, so it is tightly coupled to the platform you compile for, but you can still detect violations that affect other platforms if they share the same code.

    @gpshead
    Member Author

    gpshead commented Apr 6, 2021

    Ideally we'd analyze a representative set of the major platforms we ship binaries on. 32-bit and 64-bit Linux are a start, but we should assume that at least the Windows and Linux toolchains may differ, and toss in an additional CPU architecture, as default alignment constraints could differ. But getting to that level can remain future work... I'm happy libabigail exists!

    @pablogsal
    Member

    Yeah, I agree with what Greg said.

    Let's start with *something* and improve upon it.

    @pablogsal
    Member

    I started with two PRs against the 64-bit builds for 3.9 and 3.8. This *should* cover 32-bit as well (see previous messages); if we want specific checks for that, we need a 32-bit machine or cross-compilation. I decided to start with something (that something being 64-bit).

    @zooba
    Member

    zooba commented Apr 9, 2021

    I got a false positive on my PR at https://github.com/python/cpython/pull/25318/checks?check_run_id=2308871807

    1 Changed variable:

    [C]'const unsigned char[45154] const _Py_M__importlib_bootstrap_external' was changed to 'const unsigned char[43681] const _Py_M__importlib_bootstrap_external' at importlib_external.h:2:1:
    size of symbol changed from 45154 to 43681
    type of variable changed:
    'const unsigned char[45154] const' changed to 'const unsigned char[43681] const'

    Is there an option to exclude array lengths? Or to treat the API as just a pointer rather than an array?

    @pablogsal
    Member

    > Is there an option to exclude array lengths? Or to treat the API as just a pointer rather than an array?

    Technically it is not a false positive:

    https://developers.redhat.com/blog/2019/05/06/how-c-array-sizes-become-part-of-the-binary-interface-of-a-library/

    But in this case I think it is. We can have an ignore file:

    https://sourceware.org/libabigail/manual/libabigail-concepts.html#suppr-spec-label

    We could ignore all functions that start with _Py.
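
A hedged sketch of what such an ignore file might contain, rendered from Python. The `[suppress_function]` and `[suppress_variable]` sections and the `name_regexp` property come from the libabigail suppression specification linked above; the `^_Py` pattern, the helper names, and the preview with Python's `re` module (libabigail uses its own regex engine) are assumptions made for illustration.

```python
import re

SUPPRESSION_TEMPLATE = """\
[suppress_function]
  name_regexp = {pattern}

[suppress_variable]
  name_regexp = {pattern}
"""


def render_suppressions(pattern):
    """Render a suppression specification hiding matching functions and variables."""
    return SUPPRESSION_TEMPLATE.format(pattern=pattern)


def would_suppress(pattern, names):
    """Preview, using Python's re module, which names the pattern would hide."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


if __name__ == "__main__":
    print(render_suppressions("^_Py"))
    print(would_suppress("^_Py", [
        "_Py_M__importlib_bootstrap_external",
        "PyEval_AcquireThread",
        "_PyErr_Clear",
    ]))
```

The rendered text would be saved to a file and passed to abidiff through its suppression option. Note that a blanket prefix also hides every other `_Py` symbol, not just the one that triggered the report.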

    @zooba
    Member

    zooba commented Apr 9, 2021

    Oh wow, that's terrible... yet another good reason not to export data values.

    But yeah, filtering on the name prefix should be fine. These aren't meant to be publicly accessible anyway.

    @vstinner
    Member

    > Oh wow, that's terrible... yet another good reason not to export data values.

    _Py_M__importlib_bootstrap_external symbol doesn't seem to be exported. Why is it seen as a public symbol?

    $ objdump -T /lib64/libpython3.10.so.1.0|grep _Py_M__importlib_bootstrap_external
    # empty output

    @vstinner
    Member

    > We could ignore all functions that start with _Py.

    Some symbols starting with _Py are indirectly part of the ABI. Example of Include/cpython/pyctype.h:

    #define Py_ISLOWER(c)  (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_LOWER)
    PyAPI_DATA(const unsigned int) _Py_ctype_table[256];

    Even if "_Py_ctype_table" is not directly part of the C API, it's technically part of the ABI.

    If tomorrow, _Py_ctype_table is truncated to 128 items, it would be an incompatible ABI change.
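
To illustrate why truncating the table would be an ABI break, here is a toy Python simulation of the macro expansion. It assumes an extension compiled against the old header has the table lookup baked in. The flag value and the helper names are illustrative, not copied from pyctype.h. In C the bad access would be an out-of-bounds read rather than an exception.

```python
PY_CTF_LOWER = 0x01  # illustrative flag bit, assumed for this sketch


def build_table(size):
    """Build a toy flag table with the 'lower' bit set for ASCII a-z."""
    return [PY_CTF_LOWER if ord("a") <= i <= ord("z") else 0
            for i in range(size)]


def py_islower(table, c):
    """Mimic the expanded macro: mask to 0..255, index the table, test the flag."""
    return bool(table[c & 0xFF] & PY_CTF_LOWER)


if __name__ == "__main__":
    full = build_table(256)
    truncated = build_table(128)
    print(py_islower(full, ord("a")))
    print(py_islower(full, 200))
    try:
        py_islower(truncated, 200)
    except IndexError:
        print("index 200 is past the end of the 128-entry table")
```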

    @vstinner
    Member

    In a stable Python version, I would suggest only ignoring an ABI change after manual validation. Otherwise, we can miss real issues.

    Well, I expect that at the beginning, we will discover many issues like _Py_M__importlib_bootstrap_external ;-)

    @pablogsal
    Member

    OK, then we need to revert my latest 2 PRs and add a new label or something to skip.

    @vstinner
    Member

    > OK, then we need to revert my latest 2 PRs and add a new label or something to skip.

    To skip ABI tests? I proposed to ignore ABI changes if they can be ignored safely. But I don't know right now which changes can be ignored or not.

    @pablogsal
    Member

    > To skip ABI tests? I proposed to ignore ABI changes if they can be ignored safely. But I don't know right now which changes can be ignored or not.

    As the situation stands, we need to choose between keeping my changes that ignore _Py function changes and reverting them. Anything beyond that is unexplored territory.

    @pablogsal
    Member

    > $ objdump -T /lib64/libpython3.10.so.1.0|grep _Py_M__importlib_bootstrap_external
    > # empty output

    This was happening on 3.8, Victor:

    root@d9c5942e274b:/src# objdump -T libpython3.8.so | grep _Py_M__importlib_bootstrap_external
    0000000000335740 g DO .rodata 000000000000b062 Base _Py_M__importlib_bootstrap_external
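
The exported-symbol check above can be scripted. This sketch parses one symbol line of `objdump -T` output by counting fields from the right, so it tolerates a varying number of flag columns. The function name and returned keys are hypothetical, and the column layout is assumed from the sample line.

```python
def parse_dynsym_line(line):
    """Parse one symbol line of `objdump -T` output into a dict.

    Returns None for lines that do not look like symbol entries
    (headers, blank lines).
    """
    parts = line.split()
    if len(parts) < 6:
        return None
    try:
        address = int(parts[0], 16)
        size = int(parts[-3], 16)  # the size column is hexadecimal
    except ValueError:
        return None
    return {
        "name": parts[-1],
        "version": parts[-2],
        "size": size,
        "section": parts[-4],
        "address": address,
    }


SAMPLE = ("0000000000335740 g DO .rodata 000000000000b062 "
          "Base _Py_M__importlib_bootstrap_external")

if __name__ == "__main__":
    info = parse_dynsym_line(SAMPLE)
    print(info["name"], info["section"], info["size"])
```

The size field `0xb062` is 45154 in decimal, which matches the old array length in the abidiff report for `_Py_M__importlib_bootstrap_external`.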

    @pablogsal
    Member

    I am going to revert the 3.9 PR as 3.9+ is hiding symbols by default.

    @vstinner
    Member

    > I got a false positive on my PR at https://github.com/python/cpython/pull/25318/checks?check_run_id=2308871807

    Oh, this CI failure was on Python 3.8.

    Since Python 3.9, Python is now built with -fvisibility=hidden to avoid exporting symbols which are not explicitly exported. So in 3.9, we no longer have to ignore all symbols prefixed by "_Py".

    @vstinner
    Member

    I cannot merge my PR 25685 because the mandatory ABI CI job fails:
    ---------------
    abidiff "libpython3.9.so" ./Doc/data/python3.9.abi --drop-private-types --no-architecture --no-added-syms
    Functions changes summary: 0 Removed, 1 Changed, 0 Added function
    Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

    1 function with some indirect sub-type change:

    [C]'function int _PyInterpreterState_IDIncref(PyInterpreterState*)' at pystate.c:497:1 has some indirect sub-type changes:
    return type changed:
    type name changed from 'int' to 'void'
    type size changed from 32 to 0 (in bits)
    ---------------

    It is correct that my PR changes an internal C API on purpose.

    Pablo suggested that I regenerate the ABI file, but I don't know how to do that.

    In Python 3.9, the GitHub Action uses:
    ---
    check_abi:
      name: 'Check if the ABI has changed'
      runs-on: ubuntu-20.04
      needs: check_source
      if: needs.check_source.outputs.run_tests == 'true'
      steps:
        - uses: actions/checkout@v2
        - uses: actions/setup-python@v2
        - name: Install Dependencies
          run: |
            sudo ./.github/workflows/posix-deps-apt.sh
            sudo apt-get install -yq abigail-tools
        - name: Build CPython
          env:
            CFLAGS: -g3 -O0
          run: |
            # Build Python with the libpython dynamic library
            ./configure --enable-shared
            make -j4
        - name: Check for changes in the ABI
          run: make check-abidump
    ---

    I'm using Fedora 33 with "gcc (GCC) 11.0.1 20210324 (Red Hat 11.0.1-0)".

    On Fedora, I used:
    ---

    $ sudo dnf install -y libabigail
    $ ./configure --enable-shared CFLAGS="-g3 -O0" && make -j10
    $ make regen-abidump
    $ git diff --stat
     Doc/data/python3.9.abi | 28478 +++++++++++++++++++++++++++++++++
     1 file changed, 18306 insertions(+), 10172 deletions(-)
    ---

    There are tons of changes!

    Also, "make check-abidump" lists many changes:
    ---
    abidiff "libpython3.9.so" ./Doc/data/python3.9.abi --drop-private-types --no-architecture --no-added-syms
    Functions changes summary: 0 Removed, 13 Changed (171 filtered out), 0 Added functions
    Variables changes summary: 0 Removed, 6 Changed (2 filtered out), 0 Added variables

    13 functions with some indirect sub-type change:
    (...)
    [C] 'function PyStatus _PyInterpreterState_Enable(_PyRuntimeState*)' at pystate.c:171:1 has some indirect sub-type changes:

    [C] 'function int _Py_DecodeLocaleEx(const char*, wchar_t**, size_t*, const char**, int, _Py_error_handler)' at fileutils.c:574:1 has some indirect sub-type changes:
    parameter 6 of type 'typedef _Py_error_handler' has sub-type changes:

    [C] 'const unsigned char _Py_ctype_tolower[256]' was changed to 'const unsigned char[256] const _Py_ctype_tolower' at pyctype.h:26:1:
    type of variable changed:
    entity changed from 'const unsigned char[256]' to 'const unsigned char[256] const'
    (...)
    ---

    @pablogsal
    Member

    As I mentioned here:

    https://bugs.python.org/msg390213

    the dump needs to be generated in a docker container using the same compiler version that is used in the CI

    @pablogsal
    Member

    You are using Fedora, which is not the same Docker container and likely not the same compiler version that is used to check the dump.

    @vstinner
    Member

    > the dump needs to be generated in a docker container using the same compiler version that is used in the CI

    I'm not used to Docker and I don't know how to get a Docker image similar to the one used by GitHub Actions. Is there documentation somewhere giving the commands to get the Docker image and run it?

    @pablogsal
    Member

    I will add something to the devguide explaining how to do it. In any case, the RM should ideally be the one doing this, because every potential ABI breakage needs to be supervised by the RM (even if it is a false positive).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @CAM-Gerlach
    Member

    Can this be closed now? Seems like this was implemented (came across this seeing that it apparently needs to be added for 3.12 just now).

    Yhg1s pushed a commit that referenced this issue May 23, 2023
    Backport the workflow change and fix-ups:
    - GH-92442 (e89c01e)
    - GH-94129 (0dadb22)
    - GH-98556 (194588d)
    
    Co-Authored-By: sterliakov <50529348+sterliakov@users.noreply.github.com>
    Co-authored-by: Hugo van Kemenade <hugovk@users.noreply.github.com>
    Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>