Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion errors/stale cache in typeobject.c when tp_version_tag is not cleared. #99293

Open
Yhg1s opened this issue Nov 9, 2022 · 7 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@Yhg1s
Copy link
Member

Yhg1s commented Nov 9, 2022

The simple method cache in typeobject.c uses the tp_version_tag field and the Py_TPFLAGS_VALID_VERSION_TAG bit in the tp_flags field to track changes on the type that might invalidate the cache. Unfortunately the code currently assumes that any code clearing the tp_flag also clears the tp_version_tag, and checks this assumption with an assert. Unfortunately, many SWIG-generated extension modules violate this assertion.

SWIG prior to 4.1.0 cleared the tp_flags bit without touching the tp_version_tag field (https://github.com/swig/swig/blob/v4.0.2/Lib/python/pyrun.swg#L1230). This was unintentionally fixed in 4.1.0 by using PyType_Modified() instead, but unfortunately there are a great many SWIG-generated extension modules checked in all over the place (e.g. https://github.com/OSGeo/gdal/blob/6bd07b20b3e55c2fc94da611244a615a4fd2991f/swig/python/extensions/osr_wrap.cpp#L2296) so we cannot assume tp_version_tag is changed when the tp_flag bit is cleared.

The end result is an assertion error when assertions are enabled (I would implore everyone to run tests with assertions enabled...), or potentially a stale cache entry when they are not. It's hard to gauge the likelihood of the stale cache entries. In the case of SWIG it's extremely unlikely, as SWIG merely sets the this attribute on the type, which is not likely to be cached or looked up with _PyType_Lookup. For other cases where extension modules clear the tp_flag bit without changing the tp_version_tag it might be more likely.

(I have a PR to fix this already.)

@Yhg1s Yhg1s added the type-crash A hard crash of the interpreter, possibly with a core dump label Nov 9, 2022
@Fidget-Spinner
Copy link
Member

Two problems:

  • 3.11's specialised instructions only use the version as a guard. This can be fixed if we guard the flag.

  • If SWIG just reset the flag without calling PyType_Modified, thats a bug. PyType_Modified also invalidates the type version for all subclasses of a modified type. So the current SWIG code would return wrong cached values even in pre-3.10 for subclasses that inherit a SWIG C type. There is no easy way to fix this on CPython side. In fact I rather not fix it as errors should never pass silently. This can cause really obscure bugs if we allow it through.

TLDR this is a nasty bug. There is no complete fix for CPython that accounts for all the intricacies of the method cache. The only safe thing is for SWIG to update itself.

@carljm
Copy link
Member

carljm commented Nov 9, 2022

I guess it still makes sense to "fix" the specialized instructions in 3.11 so that this always-broken SWIG code isn't more broken than it was pre-3.11.

I think for 3.12 we should consider just removing Py_TPFLAGS_VALID_VERSION_TAG entirely. It is both undocumented and entirely redundant. There's no good reason anyone should write code that uses/sets or checks it.

@Yhg1s
Copy link
Member Author

Yhg1s commented Nov 9, 2022

Fixing SWIG doesn't fix existing extension modules, unfortunately. Many projects check in SWIG-generated code. Some even edit it afterwards. (I think I cried when I found that out.) Also, this isn't just about SWIG, as other extension modules can do the same kind of thing. The problem is they would work right in Python 3.9 but not in 3.10 -- and worse, it wouldn't be apparent that it's not working right, because in most cases it doesn't matter. I imagine the specialisations in 3.11 and later make it more likely for things to go subtly wrong, but again it'll be situational and probably somewhat unlikely to be detected (not to mention a dickens to debug).

Removing Py_TPFLAGS_VALID_VERSION_TAG is, somewhat ironically, a much easier change to deal with, because it is immediately a compile error, rather than a silently lurking bug. (We still have backward compatibility to contend with.)

@markshannon
Copy link
Member

There is a deeper issue here. We should be rejecting extensions that are binary incompatible with the version of CPython that is loading them.

We should reject extensions compiled for a different version, unless they are using the stable ABI.
I'm surprised we don't have a version number for this. It shouldn't be too hard to compile a version number from header files into the DLL. Or am I missing something?

We should definitely remove Py_TPFLAGS_VALID_VERSION_TAG for 3.12.

@Yhg1s
Copy link
Member Author

Yhg1s commented Nov 10, 2022

We used to have version numbers in extension modules (and to some extent still do) but that's not relevant here. This is a source incompatibility. Rebuilding doesn't change anything.

Removing the tp_flag in 3.12 will require a PEP 387 exemption.

@markshannon
Copy link
Member

I was looking into deprecating Py_TPFLAGS_VALID_VERSION_TAG, but there are no references to it in the docs.

IMO, that means that Py_TPFLAGS_VALID_VERSION_TAG is undefined and this is a SWIG bug, not a CPython one.

I don't think we need to remove the flag right now, but we should definitely documented that it shouldn't be used.

@Yhg1s
Copy link
Member Author

Yhg1s commented Feb 9, 2023

I will reiterate that this assert triggers in real world code, and a fair bit of it. (We initially worked around it by fixing the code in our version of SWIG, but we ran into quite a bit of checked-in, and sometimes checked-in-and-hand-edited, SWIG code in third-party projects.) I do not think it is a good idea to invalidate that many SWIG-generated extension modules for the sake of skipping one bitflag check, especially considering SWIG wasn't fixed until October last year (and upgrading SWIG is not always easy).

Yhg1s pushed a commit that referenced this issue Feb 9, 2023
…sed. (#GH-101736)

Document that Py_TPFLAGS_VALID_VERSION_TAG shouldn't be used.
carljm added a commit to carljm/cpython that referenced this issue Feb 9, 2023
* main: (82 commits)
  pythongh-101670: typo fix in PyImport_ExtendInittab() (python#101723)
  pythonGH-99293: Document that `Py_TPFLAGS_VALID_VERSION_TAG` shouldn't be used. (#pythonGH-101736)
  no-issue: Add Dong-hee Na as the cjkcodecs codeowner (pythongh-101731)
  pythongh-101678: Merge math_1_to_whatever() and math_1() (python#101730)
  pythongh-101678: refactor the math module to use special functions from c11 (pythonGH-101679)
  pythongh-85984: Remove legacy Lib/pty.py code. (python#92365)
  pythongh-98831: Use opcode metadata for stack_effect() (python#101704)
  pythongh-101283: Version was just released, so should be changed in 3.11.3 (pythonGH-101719)
  pythongh-101283: Fix use of unbound variable (pythonGH-101712)
  pythongh-101283: Improved fallback logic for subprocess with shell=True on Windows (pythonGH-101286)
  pythongh-101277: Port more itertools static types to heap types (python#101304)
  pythongh-98831: Modernize CALL and family (python#101508)
  pythonGH-101696: invalidate type version tag in `_PyStaticType_Dealloc` (python#101697)
  pythongh-100221: Fix creating dirs in `make sharedinstall` (pythonGH-100329)
  pythongh-101670: typo fix in PyImport_AppendInittab() (pythonGH-101672)
  pythongh-101196: Make isdir/isfile/exists faster on Windows (pythonGH-101324)
  pythongh-101614: Don't treat python3_d.dll as a Python DLL when checking extension modules for incompatibility (pythonGH-101615)
  pythongh-100933: Improve `check_element` helper in `test_xml_etree` (python#100934)
  pythonGH-101578: Normalize the current exception (pythonGH-101607)
  pythongh-47937: Note that Popen attributes are read-only (python#93070)
  ...
@iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
Projects
None yet
Development

No branches or pull requests

5 participants