pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful #87135
BACKGROUND
The PROBLEM
We're seeing this happen with C/C++ code. Our C++ builds end up with destructors on the stacks of these threads, and the forced stack unwinding triggered by pthread_exit() runs them at a point where that code never expects to be unwound (see the sketch below).

Fundamentally, I do not believe the CPython VM should ever call pthread_exit() out from under a thread it does not own.

The documentation suggests that all callers in user code of the four C-APIs with the documented terminate-the-thread-while-finalizing behavior (PyEval_RestoreThread(), PyEval_AcquireThread(), PyEval_AcquireLock(), and PyGILState_Ensure()) must be prepared for the call to never return. No real-world code can meaningfully prepare for that.

CURRENT WORKAROUND (Big Hammer)

Change CPython to call abort() instead of pthread_exit(), as that situation is unresolvable and the process dying is better than hanging, partially alive. That solution isn't friendly, but it is better than being silent and allowing deadlock. A failing process is always better than a hung process, especially a partially hung process.

SEMI RELATED WORK

https://bugs.python.org/issue42888 - appears to be avoiding some of the same pthread_exit() fallout.
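To make the PROBLEM above concrete, here is a minimal sketch (my illustration, not code from the report; it assumes the pre-3.14 behavior described in this issue):

```c
#include <Python.h>
#include <pthread.h>

static void *worker(void *arg)
{
    (void)arg;
    /* If the main thread has already started Py_FinalizeEx(), this call
       never returns: the runtime calls pthread_exit() out from under us,
       force-unwinding this stack and any C++ frames above it. */
    PyGILState_STATE gil = PyGILState_Ensure();
    /* ... use the C API ... */
    PyGILState_Release(gil);
    return NULL;
}

int main(void)
{
    Py_Initialize();
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);
    Py_FinalizeEx();          /* races with the worker acquiring the GIL */
    pthread_join(tid, NULL);  /* joins a thread that may have been killed mid-call */
    return 0;
}
```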
C-APIs such as PyEval_RestoreThread() call pthread_exit() on failure today. Instead, they need to return an error to the calling application to indicate that "The Python runtime is no longer available." Callers need to act on that in whatever way is most appropriate to them.
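What such an error-returning variant could look like, as a hypothetical sketch (PyEval_RestoreThreadEx is a made-up name; no such API exists in CPython):

```c
#include <Python.h>

/* Hypothetical: returns 0 on success, -1 if the runtime is gone. */
extern int PyEval_RestoreThreadEx(PyThreadState *tstate);

static void run_callback(PyThreadState *tstate)
{
    if (PyEval_RestoreThreadEx(tstate) < 0) {
        /* "The Python runtime is no longer available." Release our own
           resources normally and never touch the C API again. */
        return;
    }
    /* ... run Python code under the GIL ... */
    PyEval_SaveThread();
}
```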
See also bpo-44434: "_thread module: Remove redundant PyThread_exit_thread() call to avoid glibc fatal error: libgcc_s.so.1 must be installed for pthread_cancel to work".

New changeset 45a78f9 by Victor Stinner in branch 'main'.
See also a discussion about the usefulness of daemon threads. I'm more in favor of deprecating daemon threads (in any interpreter, not only in subinterpreters). The current implementation is too fragile. There are still corner cases like the one described in this issue.
Another possible resolution would be to simply make threads that attempt to acquire the GIL after Python starts to finalize hang (i.e. sleep until the process exits). Since the GIL can never be acquired again, this is in some sense the simplest way to fulfill the contract. This also ensures that any data stored on the thread call stack and referenced from another thread remains valid. As long as nothing on the main thread blocks waiting for one of these hung threads, there won't be deadlock. I have a case right now where a background thread (created from C++, which is similar to a daemon Python thread) acquires the GIL and calls "call_soon_threadsafe" on an asyncio event loop. I think that causes some Python code internally to release the GIL at some point, after triggering some code to run on the main thread which happens to cause the program to exit.
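A sketch of what "hang" could mean on POSIX (my illustration of the idea, not code from any PR):

```c
#include <unistd.h>

/* Never returns. The thread keeps its stack, so pointers into it held by
   other threads stay valid, but it makes no further progress. pause()
   returns only when a signal is delivered, hence the loop. Windows would
   need a different primitive (e.g. an infinite wait). */
static void hang_thread_forever(void)
{
    for (;;) {
        pause();
    }
}
```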
The last time someone proposed to always call abort(), I proposed to add a hook instead: I added sys.unraisablehook. See bpo-36829. If we adopt this option, it can be a callback in C, something like Py_SetThreadExitCallback(func), which would call func() rather than pthread_exit() in ceval.c.

-- Another option would be to add an option to disable daemon threads. concurrent.futures has been modified to no longer use daemon threads: bpo-39812. It is really hard to write a reliable implementation of daemon threads with Python subinterpreters. See bpo-40234 "[subinterpreters] Disallow daemon threads in subinterpreters optionally". There is already a private flag for that in subinterpreters to disallow spawning processes or threads: an "isolated" subinterpreter. Example with _thread.start_new_thread():

```c
PyInterpreterState *interp = _PyInterpreterState_GET();
if (interp->config._isolated_interpreter) {
    PyErr_SetString(PyExc_RuntimeError,
                    "thread is not supported for isolated subinterpreters");
    return NULL;
}
```

Or os.fork():

```c
if (interp->config._isolated_interpreter) {
    PyErr_SetString(PyExc_RuntimeError,
                    "fork not supported for isolated subinterpreters");
    return NULL;
}
```

See also my article on fixing crashes with daemon threads.
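To make the hook idea concrete, a sketch of how an embedding application might use it (Py_SetThreadExitCallback() is the proposal above, not an existing CPython API, so the whole block is hypothetical):

```c
#include <Python.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical hook type and setter, per the proposal above. */
typedef void (*Py_ThreadExitCallback)(void);
extern void Py_SetThreadExitCallback(Py_ThreadExitCallback func);

static void on_thread_exit(void)
{
    /* The embedder decides the policy: log, hang, or abort. */
    fprintf(stderr, "a thread hit the GIL after finalization started\n");
    abort();
}

int main(void)
{
    Py_SetThreadExitCallback(on_thread_exit);  /* hypothetical call */
    Py_Initialize();
    /* ... run the application ... */
    return Py_FinalizeEx();
}
```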
Regarding your suggestion of adding a hook like Py_SetThreadExitCallback():
There are several possible behaviors for a thread that attempts to acquire the GIL after Python starts finalizing:

1. Make the process exit with a fatal error (e.g. call abort()).
2. Exit the process immediately.
3. Hang the thread: sleep until the process exits.
4. Exit the thread with stack unwinding (pthread_exit() on POSIX).
5. Exit the thread without stack unwinding.

The current behavior is (4) on POSIX platforms (via pthread_exit()) and (5) on Windows.

In general, achieving a clean shutdown will require the cooperation of all relevant code in the program, particularly code using the Python C API. Commonly the Python C API is used more by library code than by application code, while it would presumably be the application that is responsible for setting this callback. Writing a library that supports multiple different thread shutdown behaviors would be particularly challenging. I think the callback is useful, but we would still need to discuss what the default behavior should be (hopefully different from the current behavior), and what guidance would be provided as far as what the callback is allowed to do.

Option (1) is highly likely to result in a user-visible error: a lot of Python programs that previously exited successfully will now, possibly only some of the time, exit with an error. The advantage is that the user is alerted to the fact that some threads were not cleanly exited, but a lot of previously working code is now broken. This seems like a reasonable policy for a given application to impose (effectively requiring the use of an atexit handler to terminate all daemon threads), but does not seem like a reasonable default given the existing use of daemon threads.

Option (2) would likely do the right thing in many cases, but main-thread cleanup that previously ran would now be silently skipped. This again seems like a reasonable policy for a given application to impose, but does not seem like a reasonable default.

Option (3) avoids the possibility of crashes and memory corruption. Since the thread stack remains allocated, any pointers to the thread stack held in global data structures or by other threads remain valid. There is a risk that the thread may be holding a lock, or otherwise block progress of the main thread, resulting in silent deadlock. That can be mitigated by registering an atexit handler.

Option (4) in theory would allow cleanup handlers to be registered in order to avoid deadlock due to held locks. In practice, though, it causes a lot of problems:

* The forced unwinding runs C++ destructors at points where the code never expects to be unwound (the original report above).
* The unwinding machinery itself can fail: on glibc it requires libgcc_s.so.1 to be loadable at that moment (see bpo-44434 above).
* Cleanup code that calls back into the Python C API during the unwind is itself unsafe, since the runtime is no longer available.
Option (5) has the risk of memory corruption due to other threads accessing pointers to the freed thread stack. There is also the same risk of deadlock as in option (3). It avoids the problem of calls to Python C APIs in C++ destructors. I would consider this option strictly worse than option (3): there is the same risk of deadlock, plus the additional risk of memory corruption. We free the thread stack slightly sooner, but since the program is exiting soon anyway that is not really advantageous. The fact that the current behavior differs between POSIX and Windows is particularly unfortunate. I would strongly urge that the default behavior be changed to (3).
Regarding your suggestion of banning daemon threads: I happened to come across this bug not because of daemon threads but because of threads started by C++ code directly that call into Python APIs. The solution I am planning to implement is to add an atexit handler that signals those threads to exit and waits for them before finalization proceeds.

I do think it is reasonable to suggest that users should ensure daemon threads are exited cleanly via an atexit handler. However, in some cases that may be challenging to implement, and there is also the issue of backward compatibility.
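For C/C++-created threads, the obvious mitigation is tempting but insufficient; a sketch of why (Py_IsFinalizing() is real, spelled _Py_IsFinalizing() in the releases contemporary with this discussion; the surrounding function is my illustration):

```c
#include <Python.h>

/* Check-then-act is racy: finalization can begin between the check and
   PyGILState_Ensure(), so the call below can still kill (or, post-3.14,
   hang) this thread. A race-free variant needs runtime support, such as
   the "finalize block" mechanism proposed later in this thread. */
static void maybe_run_python_callback(void)
{
    if (Py_IsFinalizing()) {
        return;  /* runtime is going away; do nothing */
    }
    PyGILState_STATE gil = PyGILState_Ensure();  /* may still never return */
    /* ... invoke Python code ... */
    PyGILState_Release(gil);
}
```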
PyThread_exit_thread() is exposed as _thread.exit() and _thread.exit_thread(). PyThread_exit_thread() is only called in take_gil() (at 3 places in the function, sketched below) if tstate_must_exit(tstate) is true. It happens in two cases:

* A daemon thread tries to acquire the GIL after Py_Finalize() has been called in another thread.
* threading._shutdown() is interrupted by CTRL+C, so Python starts finalizing while regular (non-daemon) threads are still running.
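Paraphrasing the shape of that code (a sketch based on the description above, not an exact copy of CPython's ceval):

```c
#include <Python.h>

/* Internal helper, paraphrased: true once Py_Finalize() has started
   and this thread state may no longer acquire the GIL. */
static int tstate_must_exit(PyThreadState *tstate);

static void take_gil(PyThreadState *tstate)
{
    if (tstate_must_exit(tstate)) {
        /* Py_Finalize() was called from another thread: this thread can
           never hold the GIL again, so it is exited on the spot. */
        PyThread_exit_thread();  /* pthread_exit() on POSIX */
    }
    /* ... block until the GIL is free, re-checking tstate_must_exit()
       after each wakeup, with the same exit call at those points ... */
}
```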
I don't think that there is a "good default behavior" where Python currently calls PyThread_exit_thread(). IMO we should take the problem from the other side and try to reduce the cases in which Python can reach this code, or even make them impossible where we can. For example, *removing* daemon threads would remove the most common case in which Python has to call PyThread_exit_thread(). I'm not sure how to make this case less likely when threading._shutdown() is interrupted by CTRL+C. This function can hang if a thread is stuck for whatever reason, and it's important that a user is able to interrupt or kill a stuck process with CTRL+C (SIGINT). That's a common expectation on Unix, at least for me. Maybe threading._shutdown() should be less nice and call os._exit() in this case: exit the process *immediately*. Or Python should restore the default SIGINT handler: on Unix, the default SIGINT handler immediately terminates the process (like os._exit() does). I don't think that abort() should be called here (raising SIGABRT), since the intent of a user pressing CTRL+C is to silently terminate the process. It's not an application bug, but a user action.
See also bpo-13077 "Windows: Unclear behavior of daemon threads on main thread exit".
It looks like the _thread.exit() function just raises SystemExit; despite the name of the C function that implements it, it does not call PyThread_exit_thread(). From a search in the codebase, it appears PyThread_exit_thread() is otherwise only reached via take_gil(). Also, if it is changed to no longer kill the thread, it would probably make sense to rename it accordingly.
Oh right, I was confused by the function name: "thread_PyThread_exit_thread()". It's a good thing that it's not exposed in Python :-)
I believe jbms is right that pausing the threads is the only right thing to do when they see tstate_must_exit. The PR is likely correct.
However, with this change it also leaks threads. That is a bit unfortunate, but I suppose it is just another form of memory leak, and the user can avoid it by ensuring there are no daemon threads (of course even previously, the presence of any daemon threads meant additional memory leaking).
I'm not comfortable with PR 28525, which always hangs threads that attempt to acquire the GIL after Python has exited. I would prefer to keep the current behavior by default, but give applications embedding Python the ability to decide what to do. With my Py_SetThreadExitCallback(func) idea, PyThread_exit_thread() would call func() and then pthread_exit(0). Applications can hang threads, log a message, call abort(), or whatever else. I'm not comfortable with writing a portable function to "hang a thread". For example, I don't understand why PR 28525 processes Windows messages on hung threads. Well, it's a complex problem :-(
A PR adding a Py_SetThreadExitCallback() API raises questions. What should it do when SetThreadExitCallback has already been called? Is that an error? Are the callbacks chained? In which order? If someone passes nullptr, does that undo it (please no!)? It is process-global state that many things could wind up having an opinion on, each with their own reason to require theirs to be the only one. I vote for returning an error if a callback has already been set, and not allowing unsetting a callback. What we'd do internally at work is always guarantee our codebase's early application startup code (because we have such a thing) calls that to set up whichever exit callback we deem appropriate for everyone, instead of today's default deadlock potential.

On pausing... agreed, it doesn't feel _comfortable_. To me, when faced with a known potential deadlock situation, the only comfortable thing is to abort(), as a process dying is always more useful than a process hanging (or worse, partially hanging).
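A sketch of the set-once semantics voted for above (entirely hypothetical, since the API was never added to CPython):

```c
#include <stddef.h>

typedef void (*Py_ThreadExitCallback)(void);  /* hypothetical type */

static Py_ThreadExitCallback thread_exit_callback = NULL;

/* Returns 0 on success, -1 if a callback was already set or func is
   NULL: no chaining, no unsetting. */
int Py_SetThreadExitCallback(Py_ThreadExitCallback func)
{
    if (func == NULL || thread_exit_callback != NULL) {
        return -1;
    }
    thread_exit_callback = func;
    return 0;
}
```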
In general, I view hanging threads as the least bad thing to do when faced with either acquiring the GIL or not returning at all. There is a lot of existing usage of Python that currently poses a risk of random crashes and memory corruption while Python is exiting, and I would like to fix that. However, I would certainly recommend that code using the Python C API attempt to avoid threads getting to that state in the first place. I added a "finalize block" mechanism to that PR which is intended to provide a way to attempt to acquire the GIL while ensuring the thread won't get hung. I would welcome feedback on that. A common use case for that API might be a non-Python created thread that wants to invoke some sort of asynchronous callback handler via Python APIs. For Python daemon threads that you control, you can avoid them hanging by registering an atexit function that signals them to exit and then waits until they do.

vstinner: Regarding processing the Windows messages, I updated the PR to include a link to the MSDN documentation that led me to think it was a good idea. As for random code outside of Python itself that is using the C API from threads Python doesn't know about, hanging is still safer for it than being killed mid-call.

gps: The reasons I believe hanging the thread is better than pthread_exit() are the unwinding hazards already covered in this issue: destructors run where they never expect to, and the unwind machinery itself can fail fatally. Those are the additional problems specific to pthread_exit(); the deadlock risk exists either way.
I don't think hanging the thread really increases the risk of deadlock over the status quo. In theory someone could have a C++ destructor that does some cleanup that safely prevents deadlock, but that is not portable to Windows, and I expect that properly implemented cleanup of that kind is rare in practice.

I think we would want to ensure that Python itself is implemented in such a way as to not deadlock if Python-created threads with only Python functions in the call stack hang. Mostly that would amount to not holding mutexes while calling functions that may transitively attempt to acquire the GIL (or release and then re-acquire the GIL). That is probably a good practice for avoiding deadlock even when not finalizing. We would also want to document that external code using the Python API should be safe from deadlock if a thread hangs, or should use the new "finalize block" mechanism to ensure that it doesn't hang; I would say that is much easier to achieve than making exit-with-unwinding safe.

Regarding vstinner's point that we would leak hung threads in an application that embeds Python and keeps running afterwards: that is true, but such threads could make no further progress anyway, so hanging only makes the leak explicit. Adding a Py_GetThreadExitCallback() to pair with the setter would also allow callbacks to be chained.
This problem also reminds me of the very complex case of bpo-6721: "Locks in the standard library should be sanitized on fork". The issue title looks simple, but 12 years after the issue was created, it's still open. This issue is being solved by adding atfork callbacks to modules which must do something at fork in the child process (sometimes also in the parent process). I added the threading.Lock._at_fork_reinit() private method to simplify the implementation of these callbacks. Such a problem has no silver-bullet solution, so it's better to let developers design their own solution with their specific requirements.
Gregory P. Smith: Python has many APIs using callbacks. PEP 445 added PyMem_SetAllocator() to set the memory allocator. Adding PyMem_GetAllocator() also made it possible to chain allocators and to "hook" into an existing allocator to execute code before and after it's called (the PEP contains an example). I'm not sure whether chaining callbacks with Py_SetThreadExitCallback() is important or useless. I suggest always overriding the previously set callback. It would matter if library A sets a callback to emit a log and library B sets a callback to hang threads: it may be nice to first emit the log and then hang the thread. But then the order in which callbacks are set starts to matter a lot :-) I'm fine with adding Py_GetThreadExitCallback() if you consider that it matters.
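The PEP 445 get-then-chain pattern, transposed to the thread-exit hook as a sketch (both Py_GetThreadExitCallback() and Py_SetThreadExitCallback() are assumptions here; only the PyMem allocator versions actually exist):

```c
#include <stddef.h>
#include <stdio.h>

typedef void (*Py_ThreadExitCallback)(void);  /* hypothetical type */

/* Hypothetical getter/setter, modeled on PyMem_GetAllocator() and
   PyMem_SetAllocator() from PEP 445. */
extern Py_ThreadExitCallback Py_GetThreadExitCallback(void);
extern void Py_SetThreadExitCallback(Py_ThreadExitCallback func);

static Py_ThreadExitCallback previous_callback;

static void logging_callback(void)
{
    fprintf(stderr, "a thread is exiting at finalization\n");
    if (previous_callback != NULL) {
        previous_callback();  /* chain to whatever was installed before */
    }
}

void install_hook(void)
{
    previous_callback = Py_GetThreadExitCallback();
    Py_SetThreadExitCallback(logging_callback);
}
```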
I don't think that we should bother with adding a special case. I prefer to consider developers as adults and let them make their own mistakes if they consider that they understand the code well enough ;-) _PyEval_SetTrace() allows removing the current trace function; it's a legitimate use case. If library C is annoyed by the callbacks that library A and library B installed, IMO it's also ok to let it "remove" the previously set callback, no? IMO Py_SetThreadExitCallback(NULL) should simply set the callback to NULL and so restore the default behavior: call pthread_exit().
Another example where a developer asked to call abort() to notice bugs, whereas Python previously silently ignored them: bpo-36829. Calling abort() is a legitimate use case, but not really the best default behavior. Again, the problem was solved by letting developers set their own callback: sys.unraisablehook. If I understood correctly, pytest doesn't override it but "hooks" into the default implementation: it chains its own code with the default implementation. That's possible because there is a way to "get" the current hook: just read sys.unraisablehook ;-) Another argument in favor of also adding Py_GetThreadExitCallback() ;-)
Jeremy Maitin-Shepard: "In general, I view hanging threads as the least bad thing to do when faced with either acquiring the GIL or not returning at all. There is a lot of existing usage of Python that currently poses a risk of random crashes and memory corruption while Python is exiting, and I would like to fix that."

Showing warnings by default or not was discussed many times in Python. It was decided to *hide* DeprecationWarning by default; PEP 565 is a minor trade-off to show them in the __main__ module. More generally, Python's default behavior is designed for *users* who don't want to be annoyed by warnings or anything else which would only make sense for *developers*. That's why I designed the "Python Development Mode" (-X dev). Maybe in development mode, the behavior could be changed to call abort(). But honestly, I'm not really excited by that. I'm not embedding Python in a C++ application; I almost exclusively use Python directly: the Unix command "python3". For this use case, IMO it's fine to call pthread_exit() by default.
…g finalization (GH-105805) Instead of surprise crashes and memory corruption, we now hang threads that attempt to re-enter the Python interpreter after Python runtime finalization has started. These are typically daemon threads (our long standing mis-feature) but could also be threads spawned by extension modules that then try to call into Python. This marks the `PyThread_exit_thread` public C API as deprecated as there is no plausible safe way to accomplish that on any supported platform in the face of things like C++ code with finalizers anywhere on a thread's stack. Doing this was the least bad option. Co-authored-by: Gregory P. Smith <greg@krypto.org>
Merged for 3.14. Now considering if we can apply a version of that PR to bugfix releases such as 3.13.1 and 3.12.8.
@gpshead, do you have any update on the consideration?
I haven't had time, feel free to dig in.
* hang instead of pthread_exit during interpreter shutdown (see python/cpython#87135 and rust-lang/rust#135929)
* relnotes
* fix warnings
* version using pthread_cleanup_push
* add tests
* new attempt
* clippy
* comment
* msrv
* address review comments
* update comment
* add comment

---------

Co-authored-by: Ariel Ben-Yehuda <arielby@amazon.com>
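My reading of the "version using pthread_cleanup_push" bullet, as a sketch (an assumption about the pattern, not code copied from that PR): park the thread if CPython calls pthread_exit() beneath us, so no frame above this one is ever unwound.

```c
#include <Python.h>
#include <pthread.h>
#include <unistd.h>

static void park_forever(void *arg)
{
    (void)arg;
    /* Runs during the forced unwind started by pthread_exit(); by never
       returning, it stops the unwind before it reaches caller frames. */
    for (;;) {
        pause();
    }
}

void restore_thread_without_unwinding(PyThreadState *tstate)
{
    pthread_cleanup_push(park_forever, NULL);
    PyEval_RestoreThread(tstate);  /* may pthread_exit() during finalization */
    pthread_cleanup_pop(0);        /* 0: don't run the handler on success */
}
```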
…daemon thread If `Py_IsFinalizing()` is true, non-daemon threads (other than the current one) are done, and daemon threads are prevented from running, so they cannot finalize themselves and become done. Joining them without timeout would block forever. Raise PythonFinalizationError instead of hanging. See pythongh-123940 for a use case: calling `join()` from `__del__`. This is ill-advised, but an exception should at least make it easier to diagnose.
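A sketch of the check that commit describes (the helper name, the timeout convention, and the exact location are assumptions, not the real patch; Py_IsFinalizing() and PyExc_PythonFinalizationError are real as of 3.13):

```c
#include <Python.h>

/* Hypothetical helper: whether the joined thread has already finished. */
extern int thread_handle_is_done(void *handle);

/* timeout < 0 means "wait forever" in this sketch. */
static int check_joinable(void *handle, double timeout)
{
    /* Once finalization has started, a still-running thread can never
       finish, so a join with no timeout would block forever. */
    if (Py_IsFinalizing() && !thread_handle_is_done(handle) && timeout < 0) {
        PyErr_SetString(PyExc_PythonFinalizationError,
                        "cannot join thread at interpreter shutdown");
        return -1;
    }
    return 0;
}
```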
Having read through the discussion, I lean toward not backporting the fix. It's a behaviour change; the fixed version might call for different workarounds in user code. However, seeing #123940 and the join() change above, raising PythonFinalizationError for joins that can never succeed looks safe on its own. If so, we can also do this for the bugfix branches.