Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cython cannot use "pycore_frame.h" in Py3.14a4 #130931

Open
scoder opened this issue Mar 6, 2025 · 36 comments
Open

Cython cannot use "pycore_frame.h" in Py3.14a4 #130931

scoder opened this issue Mar 6, 2025 · 36 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@scoder
Copy link
Contributor

scoder commented Mar 6, 2025

Bug report

Bug description:

It's this time of the year again. As discussed before (e.g. #123747), Cython #includes CPython's pycore_frame.h to get access to certain frame features, e.g. their integration into tracebacks.

I noticed that Cython generated code doesn't compile in CPython 3.14 alpha with the following kind of errors:

…\Python\3.14.0-alpha.4\x64\include\internal\pycore_stackref.h(278): error C7555: use of designated initializers requires at least '/std:c++20'
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(196): error C7555: use of designated initializers requires at least '/std:c++20'
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(196): error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(378): error C7555: use of designated initializers requires at least '/std:c++20'
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(378): error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(379): error C7555: use of designated initializers requires at least '/std:c++20'
…\Python\3.14.0-alpha.4\x64\include\internal/pycore_frame.h(379): error C4576: a parenthesized type followed by an initializer list is a non-standard explicit type conversion syntax

This is from one of our Windows CI runs. Py3.14a4 seems to be what Github Actions currently provides. Cython code expects C99 and something close to (but not necessarily as complete as) C++11 as compiler standards. It targets both C and C++, depending on user needs.

The dependency on pycore_stackref.h was apparently added in #118450:

22b0de2755e [2024-06-27 03:10:43 +0800] GitHub | gh-117139: Convert the evaluation stack to stack refs (#118450)

diff --git a/Include/internal/pycore_frame.h b/Include/internal/pycore_frame.h
index bab92c771a7..1e0368faa5b 100644
--- a/Include/internal/pycore_frame.h
+++ b/Include/internal/pycore_frame.h
@@ -11,6 +11,7 @@ extern "C" {
 #include <stdbool.h>
 #include <stddef.h>               // offsetof()
 #include "pycore_code.h"          // STATS
+#include "pycore_stackref.h"      // _PyStackRef
 
 /* See Objects/frame_layout.md for an explanation of the frame stack
  * including explanation of the PyFrameObject and _PyInterpreterFrame
@@ -67,7 +68,7 @@ typedef struct _PyInterpreterFrame {
     uint16_t return_offset;  /* Only relevant during a function call */
     char owner;
     /* Locals and stack */
-    PyObject *localsplus[1];
+    _PyStackRef localsplus[1];
 } _PyInterpreterFrame;
 
 #define _PyInterpreterFrame_LASTI(IF) \

What can we do to resolve this?

CPython versions tested on:

3.14

Operating systems tested on:

Windows

Linked PRs

@scoder scoder added the type-bug An unexpected behavior, bug, or error label Mar 6, 2025
@picnixz picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Mar 7, 2025
@scoder
Copy link
Contributor Author

scoder commented Mar 7, 2025

@encukou @vstinner

@encukou
Copy link
Member

encukou commented Mar 7, 2025

Added by @Fidget-Spinner & @colesbury, back in June.

What kind of public/unstable API would Cython need so it could avoid the internal headers?

@Fidget-Spinner
Copy link
Member

I think we should be able to make it compliant with the C compiler Cython is using

@scoder
Copy link
Contributor Author

scoder commented Mar 7, 2025

What kind of public/unstable API would Cython need so it could avoid the internal headers?

Cython directly accesses frames in three places:

  • profiling/tracing
  • exceptions/tracebacks
  • coroutines

We already got rid of direct frame usage for tracing in Py3.13+ by migrating to the new monitoring C-API. Sadly, that currently excludes coverage reporting since coverage.py cannot use it with plugins yet (such as the Cython coverage plugin). That'll eventually come, I hope, but it's not a CPython issue. It just means that you cannot currently use coverage reporting of Cython code in Python 3.14, as long as coverage.py needs this (old, pre-monitoring) code that requires pycore_frame.h (which doesn't currently compile in C++11):

https://github.com/cython/cython/blob/f193ba860b94a7139cffd8b0d11b7ef977aa10d3/Cython/Utility/Profile.c#L629-L639

For exceptions, all we'd need (AFAICT) is a PyFrame_SetLineNumber(frame, lineno). This has been mentioned before in #118720 but was never implemented.

The coroutine support then needs to inject frames into the current stack. For this, we currently read and write f_back:

https://github.com/cython/cython/blob/f193ba860b94a7139cffd8b0d11b7ef977aa10d3/Cython/Utility/Coroutine.c#L784-L831

https://github.com/cython/cython/blob/f193ba860b94a7139cffd8b0d11b7ef977aa10d3/Cython/Utility/Coroutine.c#L870-L897

There was a Cython ticket about this created by @markshannon a while ago that also never got anywhere.
cython/cython#4484

@Fidget-Spinner
Copy link
Member

@scoder could you please point out the C++ standard violations in your CPython alpha? The line number you posted for the alpha doesn't seem to match up with the current source. I think there's been some changes between then. Thanks!

@scoder
Copy link
Contributor Author

scoder commented Mar 7, 2025

The line number you posted for the alpha doesn't seem to match up with the current source.

Yeah, this output was copied from a Github Actions run of MSVC, which notoriously reports randomly displaced line numbers. I regularly end up looking broadly around the reported lines to guess my way towards potential code candidates for the reported errors and warnings. I have no idea how people can make productive use of such a compiler.

I actually failed to find any designated initializers in pycore_frame.h myself, but they are definitely used in pycore_stackref.h. The issue then is that they long existed in C99 but only came into C++ very late, in C++20. Usually not an issue for internal headers in a C project like CPython these days, unless someone needs to access internal headers from C++ code.

@Fidget-Spinner
Copy link
Member

Ok if it's just removing the designated initializers that should be easy enough to do. I'll take this up.

@Fidget-Spinner Fidget-Spinner self-assigned this Mar 7, 2025
@encukou
Copy link
Member

encukou commented Mar 7, 2025

#define PyStackRef_True ((_PyStackRef){.bits = (uintptr_t)&_Py_TrueStruct })
#define PyStackRef_False ((_PyStackRef){.bits = ((uintptr_t)&_Py_FalseStruct) })
#define PyStackRef_None ((_PyStackRef){.bits = ((uintptr_t)&_Py_NoneStruct) })

I'm afraid the compiler complaints are about two features: C7555 is for designated initializers but C4576 is for compound literals (the cast-like syntax).

@Fidget-Spinner
Copy link
Member

I think I'm going to wait for @markshannon to land his PR on stackrefs first.

@Fidget-Spinner
Copy link
Member

@encukou the cast-like syntax is going to be hard to fix without losing some performance on msvc. I'm not sure what to do for those. Any thoughts?

@encukou
Copy link
Member

encukou commented Mar 7, 2025

Expose the frame struct (and only that) more? Unstable API is probably the best place for it given Cython's usage; unfortunately #31530 predated PEP-689.

So, rename struct _frame to PyUnstable_FrameObject, move it to Include/cpython/frameobject.h and document it?

@encukou
Copy link
Member

encukou commented Mar 7, 2025

(I'm happy to do that, especially the docs part, if that seems right to y'all)

@Fidget-Spinner
Copy link
Member

Seems right to me.

@Fidget-Spinner Fidget-Spinner removed their assignment Mar 7, 2025
@colesbury
Copy link
Contributor

I'm not sure I understand the renaming of struct _frame to PyUnstable_FrameObject. struct _frame is already PyFrameObject and there's already a bunch of public APIs that use PyFrameObject *:

typedef struct _frame PyFrameObject;

/* Return the line of code the frame is currently executing. */
PyAPI_FUNC(int) PyFrame_GetLineNumber(PyFrameObject *);
PyAPI_FUNC(PyCodeObject *) PyFrame_GetCode(PyFrameObject *frame);

@vstinner
Copy link
Member

vstinner commented Mar 7, 2025

We should add getters and setters, but not expose the structure members in the public C API.

@vstinner
Copy link
Member

For exceptions, all we'd need (AFAICT) is a PyFrame_SetLineNumber(frame, lineno). This has been mentioned before in #118720 but was never implemented.

It's possible to call PyObject_SetAttrString(frame, "f_lineno", new_lineno) but it can only be called from a trace function. I suppose that you need a function which only updates frame->f_lineno without validating new_lineno nor updating the instruction pointer.

@Fidget-Spinner
Copy link
Member

@vstinner we cant use PyObject_SetAttrString because frame is not a PyObject since 3.11

@vstinner
Copy link
Member

@vstinner we cant use PyObject_SetAttrString because frame is not a PyObject since 3.11

I'm talking about PyFrameObject ("frame") objects: PyFrameObject structure starts with PyObject_HEAD. So you can use PyObject_SetAttrString() on it, no?

Are you talking about _PyInterpreterFrame?

@Fidget-Spinner
Copy link
Member

Yes I think the current problem is in _PyInterpreterFrame not PyFrameObject

@vstinner
Copy link
Member

I wrote PR gh-131246 to add int PyFrame_SetLineNumber(PyFrameObject *frame, int lineno).

@vstinner
Copy link
Member

Yes I think the current problem is in _PyInterpreterFrame not PyFrameObject

Cython works on PyFrameObject. Cython/Utility/ModuleSetupCode.c defines:

#define __Pyx_PyFrame_SetLineNumber(frame, lineno)  (frame)->f_lineno = (lineno)

vstinner added a commit to vstinner/cpython that referenced this issue Mar 14, 2025
Move _PyInterpreterFrame and associated functions to a new
pycore_interpframe.h header.
@vstinner
Copy link
Member

@scoder @da-woods: I wrote PR gh-131249 to move _PyInterpreterFrame and associated functions to a new pycore_interpframe.h header. Would Cython be able to use <pycore_frame.h> without <pycore_interpframe.h>? Does Cython need access to PyFrameObject members, or also to _PyInterpreterFrame members?

vstinner added a commit to vstinner/cpython that referenced this issue Mar 14, 2025
Add tests on PyFrame_GetBack() and PyFrame_SetBack().
vstinner added a commit to vstinner/cpython that referenced this issue Mar 14, 2025
Add tests on PyFrame_GetBack() and PyFrame_SetBack().
@vstinner
Copy link
Member

@scoder:

The coroutine support then needs to inject frames into the current stack. For this, we currently read and write f_back:
https://github.com/cython/cython/blob/f193ba860b94a7139cffd8b0d11b7ef977aa10d3/Cython/Utility/Coroutine.c#L784-L831
https://github.com/cython/cython/blob/f193ba860b94a7139cffd8b0d11b7ef977aa10d3/Cython/Utility/Coroutine.c#L870-L897

Aha, I created PR gh-131252 to add PyFrame_SetBack() function.

@da-woods
Copy link
Contributor

I wrote PR gh-131249 to move _PyInterpreterFrame and associated functions to a new pycore_interpframe.h header. Would Cython be able to use <pycore_frame.h> without <pycore_interpframe.h>? Does Cython need access to PyFrameObject members, or also to _PyInterpreterFrame members?

I'm fairly sure we just use PyFrameObject and nothing from _PyInterpreterFrame. I gave the Cython tests a run through with your PR and there didn't seem to be any problems that looked related to it.

vstinner added a commit to vstinner/cpython that referenced this issue Mar 18, 2025
Declare _PyInterpreterFrame and _PyRuntimeState types before
declaring their structure members. Break reference cycles between
header files.
vstinner added a commit to vstinner/cpython that referenced this issue Mar 18, 2025
Move _PyInterpreterFrame and associated functions to a new
pycore_interpframe.h header.
vstinner added a commit to vstinner/cpython that referenced this issue Mar 18, 2025
Move _PyInterpreterFrame and associated functions to a new
pycore_interpframe.h header.
vstinner added a commit that referenced this issue Mar 19, 2025
Declare _PyInterpreterFrame and _PyRuntimeState types before
declaring their structure members. Break reference cycles between
header files.
vstinner added a commit to vstinner/cpython that referenced this issue Mar 19, 2025
Move _PyInterpreterFrame and associated functions to a new
pycore_interpframe.h header.
@markshannon
Copy link
Member

I don't think making internal headers usable by anything other than CPython is a good idea.
There are too many subtle invariants, lazily set fields, and fields whose meaning differs from the corresponding attribute.

We also change those headers frequently, and if Cython relies on these headers, then Cython is perpetually broken on main.
Which means that there are lots of packages that don't work on main, so we can use them to check for performance regressions or breakages.

If there is functionality that is needed, then please request a new API.
There is even an old Cython issue for this: cython/cython#4484
We've happy to implement these features, but you need to ask.

cython/cython#4382

vstinner added a commit to vstinner/cpython that referenced this issue Mar 19, 2025
Move _PyInterpreterFrame and associated functions
to a new pycore_interpframe.h header.
@vstinner
Copy link
Member

Cython only uses pycore_frame.h of the internal C API:

$ git grep pycore_
Cython/Utility/Coroutine.c:  #include "internal/pycore_frame.h"
Cython/Utility/Exceptions.c:  #include "internal/pycore_frame.h"
Cython/Utility/Profile.c:  #include "internal/pycore_frame.h"

I agree that it's a bad practice to use the internal C API outside CPython in general, and that's why I proposed two PRs to add functions needed by Cython:

The problem is that it will take time to get these PRs merged, get Cython updated to use them, get a Cython release, and then update code generated by Cython in projects using Cython.

That's why I also proposed #131249 to simplify pycore_frame.h and make it usable again by Cython. It's a short term solution until the the long term solution is implemented and usable in released Cython.

@markshannon
Copy link
Member

Those setter functions are just as bad.

Setting the back pointer of a frame object is not, in general, possible as frames are mostly allocated in a contiguous block of memory.
Setting the line number is a jump. That is how jumps are performed in the debugger. frame.f_lineno = ...

@markshannon
Copy link
Member

That's why I also proposed #131249 to simplify pycore_frame.h and make it usable again by Cython.

I think #131249 is probably the best fix right now. I would prefer not to add those setter functions.

vstinner added a commit that referenced this issue Mar 19, 2025
Move _PyInterpreterFrame and associated functions
to a new pycore_interpframe.h header.
@vstinner
Copy link
Member

@scoder, @da-woods: I merged a change to fix the issue: 5c44d7d. Would you be able to test if it does fix your C++ issue?

@da-woods
Copy link
Contributor

@scoder, @da-woods: I merged a change to fix the issue: 5c44d7d. Would you be able to test if it does fix your C++ issue?

Yes it sees OK thanks.

@encukou
Copy link
Member

encukou commented Mar 20, 2025

For exceptions, should we expose something like the existing _PyTraceback_Add?

/* Insert a frame into the traceback for (funcname, filename, lineno). */
void _PyTraceback_Add(const char *funcname, const char *filename, int lineno)

That could create a synthetic code object, a “finished” frame, and traceback entry.
It seems it would be useful Emscripten/Pyodide as well (cc @hoodmane).

Then, when this use case doesn't need users calling PyFrame_New & _PyFrame_SetLineNumber, we could start further separating traceback entries from interpreter frames.
Could we teach interpreter internals to handle PyFrameObject->f_frame = NULL gracefully, and PyFrame_GetCode to reach for a new field in that case? (PyFrame_GetGlobals & co. can just make an empty dict.)
Then maybe add a subclass of PyInterpreterFrame for “injected” frames, which stores its funcname/filename/lineno, and creates a fake code object on demand to not break PyFrame_GetCode? Teach traceback machinery to bypass the code objects?

@markshannon
Copy link
Member

I'd rather push a normal frame with "function like" and "code like" objects.
We already need to support "code like" objects for monitoring, #131414, so adding a "function like" object shouldn't be a big deal.
The API would then be:

  • int Py_PushExtensionFrame(PyThreadState *tstate, PyObject *funclike, PyObject *codelike) for pushing a frame,
  • void Py_PopExtensionFrame(PyThreadState *tstate) for popping a frame, and
  • int PyTraceBack_AtLine(int line) for adding the traceback entry for the current frame.

No need for a fake code object.

@encukou
Copy link
Member

encukou commented Mar 21, 2025

@da-woods probably knows better, but I believe there's some value in an API that's only called when unwinding an exception. I don't think Cython will want to allocate function-like & code-like objects for every call.
IOW, with your proposal, PyTraceback_Add would still be useful as a higher-level convenience function, creating the objects and calling Push+AtLine+Pop.

@markshannon
Copy link
Member

Cython can push the frame only when an exception is raised with the above API, but we can add a PyTraceback_Add function if you think it will be useful.

However, it should take function-like and code-like parameters for general consistency and integration with sys.monitoring which will fire a RAISE event.

int PyTraceback_Add(PyObject *funclike, PyObject *codelike, int line)
{
    PyThreadState *tstate = PyThreadState_GET();
    if ((Py_PushExtensionFrame(tstate, funclike, codelike) < 0) return -1;
    int err = PyTraceBack_AtLine(line);
    Py_PopExtensionFrame(tstate);
    return err;
}

@encukou
Copy link
Member

encukou commented Mar 21, 2025

That's probably even better for Cython, which uses a cache to avoid creating duplicate code objects!

I assume PyCode_NewEmpty() can create a sufficient code-like object. (Or just a code object? The annotations in sys.monitoring docs use types.CodeType directly.)
What are CPython's requirements/expectations for “function-like objects”?

@markshannon
Copy link
Member

What are CPython's requirements/expectations for “function-like objects”?

Very little, I think merely having a __module__ attribute will be sufficient.

It isn't CPython that we need to worry about, but coverage, debuggers and other tools. What do they need?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

8 participants