Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.10] bpo-43760: Ensure that older Cython generated code compiles under 3.10 #28498

Closed
wants to merge 3 commits into from

Conversation

pablogsal
Copy link
Member

@pablogsal pablogsal commented Sep 21, 2021

  • Add tstate.use_tracing as a copy of tstate.cframe->use_tracing to minimize breakage in Cython generated modules.
  • Move new field to end of struct for better binary compatibility.
  • regenerate ABI file

https://bugs.python.org/issue43760

@pablogsal
Copy link
Member Author

@vstinner
Copy link
Member

This change breaks the ABI after Python 3.10.0rc1, whereas it was said that the ABI is not going to change, no?

While PyThreadState is excluded from the limited C API and so the stable ABI, it's part of the public C API and the regular ABI.

See for example the discussion on the numpy issue asking to get wheel packages for Python 3.10rc1:
numpy/numpy#19630 (comment)

cc @hroncok

@@ -148,6 +148,7 @@ struct _ts {
uint64_t id;

CFrame root_cframe;
int use_tracing;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we add a comment explaining this field only exists for backwards compatibility and is only updated just before calling a trace function? (And in a few other odd places, but the key is that it's not the source of truth any more.)

@ambv
Copy link
Contributor

ambv commented Sep 21, 2021

Sorry, we cannot do this. We communicated loud for people to start providing 3.10 wheels with the release of 3.10rc1. They responded, there are already 3.10 wheels on PyPI for the following packages among the 2,500 most downloaded:

asyncpg
Brotli
coverage
dbt_extractor
elastic_apm
gevent
geventhttpclient
google_crc32c
greenlet
httptools
kiwisolver
lxml
MarkupSafe
numpy
orjson
pandas
Pillow
pybase64
pycairo
pyinstrument
PyMuPDF
pyzmq
pyzstd
regex
sentencepiece
simplejson
tgcalls
uvloop
watchdog

The only thing to do now is to communicate the API change in "What's New in Python 3.10".

@ambv ambv closed this Sep 21, 2021
@gvanrossum
Copy link
Member

I hope the 3.10 release manager is in agreement. :-)

@pablogsal
Copy link
Member Author

@vstinner @ambv The ABI is not broken, the only thing that this PR change is the size of the struct. All the offsets to the members are the same and therefore will be valid in any compiled code.

Any compiled wheels will still work. Look at the ABI report:

  [C]'function void PyEval_AcquireThread(PyThreadState*)' at ceval.c:447:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          1 data member changes (2 filtered):
           type of 'PyInterpreterState* _ts::interp' changed:
             in pointed to type 'typedef PyInterpreterState' at pystate.h:22:1:
               underlying type 'struct _is' at pycore_interp.h:220:1 changed:
                 type size hasn't changed
                 1 data member changes (3 filtered):
                  type of '_PyFrameEvalFunction _is::eval_frame' changed:
                    underlying type 'PyObject* (PyThreadState*, PyFrameObject*, int)*' changed:
                      in pointed to type 'function type PyObject* (PyThreadState*, PyFrameObject*, int)':
                        parameter 1 of type 'PyThreadState*' has sub-type changes:
                          in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
                            underlying type 'struct _ts' at pystate.h:62:1 changed:
                              type size changed from 2240 to 2304 (in bits)
                              1 data member insertion:
                                'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
                              1 data member changes (2 filtered):
                               type of '_ts* _ts::next' changed:
                                 in pointed to type 'struct _ts' at pystate.h:62:1:
                                   type size changed from 2240 to 2304 (in bits)
                                   1 data member insertion:
                                     'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
                                   no data member changes (2 filtered);




  [C]'function PyThreadState* PyGILState_GetThisThreadState()' at pystate.c:1455:1 has some indirect sub-type changes:
    return type changed:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (4 filtered);

  [C]'function int _PyErr_CheckSignalsTstate(PyThreadState*)' at signalmodule.c:1767:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (3 filtered);

  [C]'function void _PyErr_Clear(PyThreadState*)' at errors.c:453:1 has some indirect sub-type changes:
    parameter 1 of type 'PyThreadState*' has sub-type changes:
      in pointed to type 'typedef PyThreadState' at pystate.h:20:1:
        underlying type 'struct _ts' at pystate.h:62:1 changed:
          type size changed from 2240 to 2304 (in bits)
          1 data member insertion:
            'int _ts::use_tracing', at offset 2240 (in bits) at pystate.h:151:1
          no data member changes (3 filtered);

As you can see, the leaves of the change is only type size changed from 2240 to 2304. As the member is added at the end of the struct, the ABI of the rest of the members is maintained and therefore nothing using those fields can break

@ambv ambv reopened this Sep 21, 2021
@ambv
Copy link
Contributor

ambv commented Sep 21, 2021

Carry on then, but we still need to communicate this in "What's New".

@pablogsal
Copy link
Member Author

I also opened this PR to discuss the possibility of going ahead, I still don't want to move on without consensus.

@vstinner
Copy link
Member

What happens with C extensions setting tstate->tracing = 0; or tstate->tracing = 1; on purpose? Do they continue to work as expected?

@vstinner
Copy link
Member

I would prefer to add a new C API function to get and set the "tracing" state: https://bugs.python.org/issue43760#msg393410

But I'm not sure how to implement such API properly.

@pablogsal
Copy link
Member Author

What happens with C extensions setting tstate->tracing = 0; or tstate->tracing = 1; on purpose? Do they continue to work as expected?

That field is undocumented, there is no guarantee of anything. The thread state is not part of the limited API or the stable API, so people using the field are out of contract.

I would prefer to add a new C API function to get and set the "tracing" state: https://bugs.python.org/issue43760#msg393410

But I'm not sure how to implement such API properly.

That cannot be done in Python 3.10.

@vstinner
Copy link
Member

That field is undocumented, there is no guarantee of anything. The thread state is not part of the limited API or the stable API, so people using the field are out of contract.

I don't understand why do you want adding again the field if you consider that removing it was an incompatible change.

IMO it's part of the public C API. The contact with users is that they should use things prefixed by an underscore. But PyThreadState.tracing has no underscore, so it can be used.

I'm not sure that it's useful to decide if PyThreadState.tracing is public or not. Projects use it and are broken by this incompatible change (as explained in the issue). The question is how can we help these projects to migrate to Python 3.10?

  • Add again the field: either than change, or revert the change which introduced PyThreadState.root_cframe. There are the ABI question (it seems to not be an issue in practice), and the backward compatibility of C extensions setting the field.

  • Don't add again the field, but help these projects to migrate to Python 3.10.

    • About the C API, it can be added to Python 3.11 and provided to Python 3.10 and older in https://github.com/pythoncapi/pythoncapi_compat (or people can copy/paste code if they prefer). Or explain how Cython fixed it?
    • For me, the strict bare minimum would be to document the removal in the C API section of What's New in Python 3.10, so users can dig into the issue. It would be nice to give some instructions on how to port C extensions to Python 3.10 (without losing support for Python 3.9 and older).

I'm fine with not adding back the field.

@pablogsal
Copy link
Member Author

pablogsal commented Sep 21, 2021

I don't understand why do you want adding again the field if you consider that removing it was an incompatible change.

As I said, I opened the PR to have the discussion based on Mark's other PR. Not because I think this is the best solution.

IMO it's part of the public C API. The contact with users is that they should use things prefixed by an underscore. But PyThreadState.tracing has no underscore, so it can be used.

Where is says that "prefixed by an underscore" means "part of the public API". If that's the case we have violated that rule a lot of times on every release.

@jpe
Copy link

jpe commented Sep 21, 2021

We're talking about use_tracing and not tracing in thread state, correct? The tracing field is still used (I'm pretty sure) but the use_tracing field is not. FWIW, I consider adding a field to be binary incompatible, though it may be an acceptable incompatibility because no final 3.10 version has been released. I don't think the field needs to be re-added, though, even if there was more time.

@pablogsal
Copy link
Member Author

We're talking about use_tracing and not tracing in thread state, correct? The tracing field is still used (I'm pretty sure) but the use_tracing field is not. FWIW, I consider adding a field to be binary incompatible, though it may be an acceptable incompatibility because no final 3.10 version has been released. I don't think the field needs to be re-added, though, even if there was more time.

I am arguing the the whole structure is an implementation detail and therefore it can change at will between versions.

@jpe
Copy link

jpe commented Sep 21, 2021

It can (and did) change between 3.9 and 3.10. It shouldn't change between 3.10.1 and 3.10.2 because code that uses the field and is compiled against 3.10.2 will cause problems when the resulting binary is loaded into a 3.10.1 interpreter. This is why I call it an ABI change, though it may be acceptable when going from a rc to final.

What I'm having trouble figuring out is what is the rationale for adding it back in at this point -- is there something other than there are packages on pypi that won't work? Shouldn't the authors of the packages be updating them for 3.10, probably by using a newer version of cython?

@pablogsal
Copy link
Member Author

pablogsal commented Sep 21, 2021

It shouldn't change between 3.10.1 and 3.10.2 because code that uses the field and is compiled against 3.10.2 will cause problems when the resulting binary is loaded into a 3.10.1 interpreter.

We are on the same page here.

This is why I call it an ABI change, though it may be acceptable when going from a rc to final.

The change itself was done before we froze the ABI. We are discussing the implications of possibily adding it back.

What I'm having trouble figuring out is what is the rationale for adding it back in at this point -

Users that used the field in Python 3.9 and before complained that disappeared in 3.10 and therefore they cannot compile the extensions without changes.

@jpe
Copy link

jpe commented Sep 21, 2021

Users that used the field in Python 3.9 and before complained that disappeared in 3.10 and therefore they cannot compile the extensions without changes.

But the thread state structure does change from version to version, so extensions that use it need to be updated for 3.10. Has anyone uses this intentionally complained? The reasons to re-add it that I've seen seem to be about cython generated code.

@pablogsal
Copy link
Member Author

pablogsal commented Sep 21, 2021

But the thread state structure does change from version to version, so extensions that use it need to be updated for 3.10. Has anyone uses this intentionally complained?

I refer you to the issue to understand specifically what users are worried about and what was the usage of the field. For example, this comment: https://bugs.python.org/msg402150

@pablogsal
Copy link
Member Author

I discussed this particular instance with the Steering Council and the conclusion was that this field (use_tracing) is considered an implementation detail and therefore its removal it's justified so we won't be restoring it. I'm therefore closing this PR.

notice that this decision only affects this particular issue and should not be generalized to other fields or structures. We will try to determine and open a discusion in the future about what is considered public/private in these ambiguous cases and what can users expect regarding stability and backwards compatibility.

@pablogsal pablogsal closed this Sep 22, 2021
@jpe
Copy link

jpe commented Sep 22, 2021

But the thread state structure does change from version to version, so extensions that use it need to be updated for 3.10. Has anyone uses this intentionally complained?

I refer you to the issue to understand specifically what users are worried about and what was the usage of the field. For example, this comment: https://bugs.python.org/msg402150

I see this as a list of 3rd party packages that need to be updated for 3.10; I don't see anything about why the may have difficulty updating. I do think the change should be documented (it may be at this point, I'm not sure).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants