-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash in PyThreadState_DeleteCurrent
: drop_gil: GIL is not locked (free-threading)
#118727
Comments
I think the race here isn't specific to the thread deletion sequence; I forgot that edit: I was wrong; the thread is still attached during |
FWIW, this issue (and the PR) reinforce a worry I have about the complexity of Python's runtime state relative to threads. At a conceptual level the runtime has greater-than-one pieces of per-thread data for which the possible value classes/groupings (in combination) should map explicitly onto distinct operational states (while disallowing some value class combinations). We have meaningful gaps in identifying such a mapping and enforcing such constraints, which makes it hard to be confident about correctness (in such a critical part of the runtime), as well as making it easier to introduce unnecessary complexity. This observation is supported by the variety of defects we've seen relative to the GIL over the years, including recently, including this issue. I recognize all to personally how hard a hard problem this is, and it's not new with the free-threading work, but it's something we should be extra thoughtful about sooner to avoid additional headache later. FWIW, at one point I tried to tame this situation somewhat and codify it by adding In conclusion, we've reached a point in complexity (which free-threading generally amplifies meaningfully) that it may be worth taking some time to more thoroughly identify and validate the (effective) state machine for the Python runtime relative to threads. Otherwise I fear we'll regularly run into bugs like this indefinitely. |
I agree. One that I'd especially like to address (it's a minor one but I run into it frequently) is that I believe this store is redundant, based on my understanding of how |
…ad holds it (#118745) `drop_gil()` assumes that its caller is attached, which means that the current thread holds the GIL if and only if the GIL is enabled, and the enabled-state of the GIL won't change. This isn't true, though, because `detach_thread()` calls `_PyEval_ReleaseLock()` after detaching and `_PyThreadState_DeleteCurrent()` calls it after removing the current thread from consideration for stop-the-world requests (effectively detaching it). Fix this by remembering whether or not a thread acquired the GIL when it last attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()` instead of `gil->enabled`. This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've reenabled it.
…t thread holds it (pythonGH-118745) `drop_gil()` assumes that its caller is attached, which means that the current thread holds the GIL if and only if the GIL is enabled, and the enabled-state of the GIL won't change. This isn't true, though, because `detach_thread()` calls `_PyEval_ReleaseLock()` after detaching and `_PyThreadState_DeleteCurrent()` calls it after removing the current thread from consideration for stop-the-world requests (effectively detaching it). Fix this by remembering whether or not a thread acquired the GIL when it last attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()` instead of `gil->enabled`. This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've reenabled it. (cherry picked from commit be1dfcc) Co-authored-by: Brett Simmers <swtaarrs@users.noreply.github.com>
…nt thread holds it (GH-118745) (#119474) `drop_gil()` assumes that its caller is attached, which means that the current thread holds the GIL if and only if the GIL is enabled, and the enabled-state of the GIL won't change. This isn't true, though, because `detach_thread()` calls `_PyEval_ReleaseLock()` after detaching and `_PyThreadState_DeleteCurrent()` calls it after removing the current thread from consideration for stop-the-world requests (effectively detaching it). Fix this by remembering whether or not a thread acquired the GIL when it last attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()` instead of `gil->enabled`. This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've reenabled it. (cherry picked from commit be1dfcc) Co-authored-by: Brett Simmers <swtaarrs@users.noreply.github.com>
…t thread holds it (python#118745) `drop_gil()` assumes that its caller is attached, which means that the current thread holds the GIL if and only if the GIL is enabled, and the enabled-state of the GIL won't change. This isn't true, though, because `detach_thread()` calls `_PyEval_ReleaseLock()` after detaching and `_PyThreadState_DeleteCurrent()` calls it after removing the current thread from consideration for stop-the-world requests (effectively detaching it). Fix this by remembering whether or not a thread acquired the GIL when it last attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()` instead of `gil->enabled`. This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've reenabled it.
Crash report
cpython/Python/ceval_gil.c
Lines 232 to 239 in b9caa09
Found by running
./python Lib/test/test_importlib/partial/pool_in_threads.py
Note that in the coredump
gil->enabled
is0
, so it looks like the gil was transiently enabled by a different thread during the call to_PyThreadState_DeleteCurrent
.cc @swtaarrs
Linked PRs
drop_gil()
unless the current thread holds it #118745drop_gil()
unless the current thread holds it (GH-118745) #119474The text was updated successfully, but these errors were encountered: