Add support for free-threading builds of CPython #243

Open · wants to merge 11 commits into main

Conversation


@dpdani dpdani commented Nov 17, 2024

@ngoldbaum

In addition to the failure you left a comment for, I also see a different failure which is more obviously a thread safety issue in hypothesis itself:

self = <tests.test_decompressor_fuzzing.TestDecompressor_stream_reader_fuzzing testMethod=test_buffer_source_constant_read_size>

    @hypothesis.settings(
>       suppress_health_check=[hypothesis.HealthCheck.large_base_example]
    )

tests/test_decompressor_fuzzing.py:145:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

state = <hypothesis.core.StateForActualGivenExecution object at 0x4b3c0220190>
wrapped_test = <function accept.<locals>.test_buffer_source_constant_read_size at 0x4b3b4dc8fc0>, arguments = ()
kwargs = {'self': <tests.test_decompressor_fuzzing.TestDecompressor_stream_reader_fuzzing testMethod=test_buffer_source_constant_read_size>}
original_sig = <Signature (self, original, level, streaming, source_read_size, read_size)>

    def execute_explicit_examples(state, wrapped_test, arguments, kwargs, original_sig):
        assert isinstance(state, StateForActualGivenExecution)
        posargs = [
            p.name
            for p in original_sig.parameters.values()
>           if p.kind is p.POSITIONAL_OR_KEYWORD
        ]
E       RecursionError: maximum recursion depth exceeded

../../.pyenv/versions/3.13.0t/lib/python3.13t/site-packages/hypothesis/core.py:433: RecursionError

Hypothesis even warns about this:

tests/test_decompressor_fuzzing.py: 17 warnings
  /Users/goldbaum/.pyenv/versions/3.13.0t/lib/python3.13t/site-packages/hypothesis/core.py:822: HypothesisWarning: The recursion limit will not be reset, since it was changed from another thread or during execution of a test.
    with ensure_free_stackframes():

tests/test_decompressor_fuzzing.py: 17 warnings
  /Users/goldbaum/.pyenv/versions/3.13.0t/lib/python3.13t/site-packages/hypothesis/internal/conjecture/engine.py:316: HypothesisWarning: The recursion limit will not be reset, since it was changed from another thread or during execution of a test.
    with ensure_free_stackframes():

tests/test_decompressor_fuzzing.py: 10 warnings
  /Users/goldbaum/.pyenv/versions/3.13.0t/lib/python3.13t/site-packages/hypothesis/core.py:643: HypothesisWarning: The recursion limit will not be reset, since it was changed from another thread or during execution of a test.
    with ensure_free_stackframes():

So I suspect the other failure is caused by a similar problem.

And indeed, the hypothesis docs say that running hypothesis simultaneously in multiple threads is not supported:

https://hypothesis.readthedocs.io/en/latest/details.html#thread-safety-policy

I'll open an issue to document this in the free-threading porting guide, and an issue on pytest-run-parallel so it can hopefully detect this and warn about it.

I think the fix for the python-zstandard tests is to mark any tests using hypothesis as thread-unsafe.

@ngoldbaum

ngoldbaum commented Nov 19, 2024

It's also probably worth adding some tests for sharing a compressor or decompressor between threads. It looks like the GIL does get released when calling into the C zstd library, so any C-level thread safety issues that exist from sharing zstd contexts between threads are probably also present in the GIL-enabled build and no one has reported them yet.
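To make that hazard concrete, here is a minimal hypothetical sketch (not code from this repository) of the usual GIL-release pattern around a libzstd call; once the GIL is dropped, nothing stops a second Python thread from entering libzstd with the same ZSTD_CCtx, even on GIL-enabled builds:

#include <Python.h>
#include <zstd.h>

/* Hypothetical helper: the GIL is released for the duration of the
 * ZSTD_compressStream2() call, so two threads sharing one ZSTD_CCtx
 * can race inside libzstd concurrently. */
static size_t
compress_chunk(ZSTD_CCtx *cctx, ZSTD_outBuffer *out, ZSTD_inBuffer *in)
{
    size_t zresult;

    Py_BEGIN_ALLOW_THREADS
    zresult = ZSTD_compressStream2(cctx, out, in, ZSTD_e_continue);
    Py_END_ALLOW_THREADS

    return zresult;
}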

@dpdani
Author

dpdani commented Nov 19, 2024

mm, it's weird that I'm not seeing that.

how are you running the tests?
I'm doing act -W .github/workflows/test.yml

@ngoldbaum

I'm running it all locally on my mac dev machine. I installed the library with pip install -e . then ran ZSTD_SLOW_TESTS=1 pytest --numprocesses=1 --parallel-threads=10 -v tests/test_decompressor_fuzzing.py after installing the CI requirements.

@dpdani
Author

dpdani commented Nov 23, 2024

I've added thread_unsafe annotations to all tests that use hypothesis.

I'm seeing a lot of warnings with the annotations; am I using them wrong?

PytestUnknownMarkWarning: Unknown pytest.mark.thread_unsafe - is this a typo?

I've also added a test for a compressor object that is shared between several threads, and it does cause a segmentation fault both in the free-threading and in the default build.

This mode of passing data to de/compression contexts does not seem to make sense anyway.
This library does support multithreaded compression, but it spawns its own pool of threads internally.
There's probably no reason for a Python program to feed data into a de/compressor from multiple threads.

@ngoldbaum

ngoldbaum commented Nov 23, 2024

You can make it thread-safe if you do something like this to add a per-decompressor lock: https://py-free-threading.github.io/porting/#dealing-with-thread-unsafe-libraries. Of course that won't scale well but as you said it's a weird thing to do.

Does the test segfault with the GIL too? I wouldn't be surprised if it does.

If it does, the fact that no one has reported this issue means it's not a big problem in practice and maybe you can just document the thread safety caveats?

I have a suspicion that Python threading will spike in popularity soon as people adopt free-threading, so it was probably inevitable that someone would hit this eventually; in that case it probably is worth adding the locking.

@ngoldbaum

ngoldbaum commented Nov 23, 2024

Oh, you said it's a bug in the default build; I missed that. We've generally been trying to fix pre-existing thread safety issues that can be triggered in the default build when we can, but we don't see them as blockers for shipping wheels.

@ngoldbaum

Maybe @andfoy or @lysnikolaou know what's up with the warning.

@indygreg
Owner

so any C-level thread safety issues that exist from sharing zstd contexts between threads are probably also present in the GIL-enabled build and no one has reported them yet.

The docs in api_usage.rst state our policy around API thread safety.

We need to ensure we don't crash if someone violates the "concurrent operations on multiple threads" rule. I'm fine with undefined behavior if someone attempts operations on the same zstd context from multiple threads. But I would prefer detecting and raising an error if this can be implemented with minimal runtime cost.

@ngoldbaum

You could have an atomic flag that a thread sets when it acquires the context; if another thread tries to acquire a context with the flag set, that would be an error.

It's a little obnoxious to write cross-platform C code that uses atomics (see npy_atomic.h in NumPy for my basic attempt, based on the private CPython atomics headers), but in Rust or C++11 or newer it's pretty straightforward.
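A rough illustrative sketch of that idea, using C11 atomics directly rather than a vendored header (names are hypothetical, not from this PR); a compare-and-exchange either claims the flag or tells the caller another thread got there first:

#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_bool in_use;  /* true while some thread owns the context */
} context_guard;

/* Returns true if the calling thread now owns the context; false means
 * another thread already set the flag and an error should be raised. */
static bool
guard_acquire(context_guard *g)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&g->in_use, &expected, true);
}

static void
guard_release(context_guard *g)
{
    atomic_store(&g->in_use, false);
}

A method would call guard_acquire() on entry, raise the error on failure, and call guard_release() on every exit path.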

@dpdani
Author

dpdani commented Nov 24, 2024

One fairly easy cross-platform option is to just use PyMutex behind an #if PY_VERSION_HEX check, which would only affect some builds (notably the free-threading ones).

Essentially, for Python >= 3.13 we could guarantee that an exception is raised on concurrent use, and for prior versions we could retain the current behavior.
As Gregory pointed out, the documentation is pretty clear on this:

ZstdCompressor and ZstdDecompressor instances have no guarantees about thread safety. Do not operate on the same ZstdCompressor and ZstdDecompressor instance simultaneously from different threads. It is fine to have different threads call into a single instance, just not at the same time.

@dpdani
Author

dpdani commented Nov 24, 2024

not sure if I should do it in this PR or open a new one.

@ngoldbaum

ngoldbaum commented Nov 24, 2024

You can also use PyThread_type_lock on older python versions. It's sadly undocumented in CPython but you can take a look at what I did to make NumPy's use of lapack_lite thread safe to see how to conditionally use either depending on Python version: https://github.com/numpy/numpy/blob/main/numpy/linalg/umath_linalg.cpp. Grep for HAVE_EXTERNAL_LAPACK and LOCK_LAPACK_LITE.

The main differences w.r.t. PyMutex are that it's slower, it supports a try-lock API (which you probably don't need), and it requires a heap allocation.
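For illustration, a minimal sketch of that conditional approach, assuming hypothetical struct and function names (PyMutex is public API on 3.13+, PyThread_type_lock is available on older versions):

#include <Python.h>
#include <pythread.h>

typedef struct {
    PyObject_HEAD
    /* ... zstd context fields ... */
#if PY_VERSION_HEX >= 0x030D0000
    PyMutex lock;            /* zero-initialized; no allocation or cleanup */
#else
    PyThread_type_lock lock; /* from PyThread_allocate_lock(); free in dealloc */
#endif
} CompressorLike;

static void
compressor_lock(CompressorLike *self)
{
#if PY_VERSION_HEX >= 0x030D0000
    PyMutex_Lock(&self->lock);
#else
    PyThread_acquire_lock(self->lock, WAIT_LOCK);
#endif
}

static void
compressor_unlock(CompressorLike *self)
{
#if PY_VERSION_HEX >= 0x030D0000
    PyMutex_Unlock(&self->lock);
#else
    PyThread_release_lock(self->lock);
#endif
}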

@dpdani
Author

dpdani commented Nov 27, 2024

@ngoldbaum I've ended up opting for an atomic flag, partly using your npy_atomic.h. Thank you for sharing it! 🙏

@indygreg the modifications I just pushed make it so that when a ZstdCompressor is used concurrently, only one thread will succeed in using it.
All other threads get a ZstdError instead, with a message pointing to the wiki.
Is this desirable? Should I apply this rule throughout the library?

I believe the performance impact is as little as it can possibly be.
The additional overhead is just one relaxed read to check if there's a thread using the ZstdCompressor object and one atomic write to mark it as used, if it isn't already.


@ngoldbaum ngoldbaum left a comment


Just a note that you're not doing a relaxed read - at least for the C11 atomics case you would need to use atomic_load_explicit. Right now, as I understand it, this is using SeqCst ordering for all operations. It's possible to write something that will likely be fast and scale better, but given that this implements a flag that triggers an error case on multithreaded access to shared resources, I doubt that matters.
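For reference, a small sketch of what an explicitly relaxed load looks like in C11 (illustrative code, not from this PR); plain reads and writes of an _Atomic variable default to sequentially consistent ordering:

#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool in_use;

static bool
peek_in_use(void)
{
    /* Relaxed: atomicity only, no ordering guarantees. Reading
     * `in_use` directly would instead compile to a seq_cst load. */
    return atomic_load_explicit(&in_use, memory_order_relaxed);
}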


@ngoldbaum ngoldbaum Dec 2, 2024


I think in the long term there should probably be some sort of C threading utility library that C extensions can use, including a full copy of the CPython atomics headers. Something that can be easily vendored as a submodule and ideally header-only. That said, that project doesn't yet exist, so copying NumPy's header for this purpose is probably the most practical solution for python-zstandard.

Author


nice catch! thanks 🙏

Yeah, this is not the ideal solution. I hoped that CPython would expose them, but it seems unlikely it will.

A header-only library would be very nice; there may be other people out there in the C world interested in having it.
Maybe we could eventually push for CPython to create a separate C package and consume it itself, too.


ping @SonicField (author of ft_utils), we were talking about this last week.

Author


Didn't know about that library; it looks very similar to my https://dpdani.github.io/cereggii/. Maybe we could collaborate in the future!

@ngoldbaum

@indygreg we'd appreciate it if you could give us your opinion on this approach and/or some code review.

It would be really helpful to be able to have cp313t wheels available. One particularly disruptive impact of python-zstandard failing to build on free-threaded Python at the moment is that pip install hatch does not work, so any project using hatch/hatchling can't easily support free-threaded Python. Even without wheels, a release with this PR applied would help enormously, because the python-zstandard build would succeed on CI machines with developer tools installed.

There are probably things that could happen inside hatch to avoid this issue, but I think the "right" fix is for python-zstandard to help out a bit.


@indygreg indygreg left a comment


Sorry it took so long to look at this. I've been exceptionally busy.

I suspect the level of support for free-threaded builds has improved since this PR was authored.

Please refresh and try to use the latest versions of things (with presumed FT compatibility) so we don't have to hack around missing support when 3.13 initially shipped.

Please split out the @pytest.mark.thread_unsafe annotations to their own PR along with adding the initial FT coverage to CI. I want to see the main branch running FT builds successfully, even if like 90% of tests are skipped.

Then we can revisit the meat of this change, which is adding the free-threaded detection to the C code.

Is that reasonable?

@@ -59,47 +60,60 @@ jobs:
       PYTHONDEVMODE: '1'
     steps:
       - name: Set up Python
-        uses: actions/setup-python@v5
+        uses: Quansight-Labs/setup-python@v5
Owner


Why do we need to adopt a less official action? Does actions/setup-python not support the free-threaded builds?

Author


Unfortunately, GitHub has been quite unresponsive to this: actions/setup-python#771

Though, there seems to be some recent activity: actions/setup-python#973


The fork is tracking upstream and has the open PR to add free-threading support applied.

You could also use setup-uv, which can be used as a drop-in replacement if you install pip into the uv environment.

Comment on lines 88 to +90
       # TODO enable once PyO3 supports 3.13.
       - name: Build (Rust)
-        if: matrix.arch == 'x64' && matrix.py != '3.13'
+        if: matrix.arch == 'x64' && !startsWith(matrix.py, '3.13')
Owner


I'm guessing this is supported now, but it's scope bloat to resolve it in this PR.

- name: Test CFFI Backend
if: "!startsWith(matrix.py, '3.13')" # see pyproject.toml:4
Owner


Presumably this limitation no longer holds.


Unfortunately no, CFFI is one of the last low-level major dependencies without support. Recently the maintainers asked our team to work on a fork with free-threading support so they can review one big PR:

python-cffi/cffi#143 (comment)

@dpdani
Author

dpdani commented Feb 9, 2025

Hi Gregory 👋

Thanks for coming back to us!

Tomorrow I'll take a look at bumping version numbers 👍

Please split out the @pytest.mark.thread_unsafe annotations to their own PR along with adding the initial FT coverage to CI. I want to see the main branch running FT builds successfully, even if like 90% of tests are skipped.

Do you mean you would like the annotations merged beforehand?
If we were to run CI against a free-threading interpreter, we would trigger the mechanism that turns the GIL back on at runtime, since the module doesn't set the Py_MOD_GIL_NOT_USED flag.
We would effectively be testing with the GIL twice. Is this what you intended?
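For context, a minimal sketch of how an extension module declares free-threading support on CPython 3.13+; without the Py_mod_gil slot, the interpreter re-enables the GIL when the module is imported. The module name here is illustrative, not python-zstandard's actual init code:

#include <Python.h>

static PyModuleDef_Slot example_slots[] = {
#ifdef Py_GIL_DISABLED
    /* Tells free-threaded builds not to re-enable the GIL on import. */
    {Py_mod_gil, Py_MOD_GIL_NOT_USED},
#endif
    {0, NULL},
};

static struct PyModuleDef example_def = {
    PyModuleDef_HEAD_INIT,
    .m_name = "example",
    .m_size = 0,
    .m_slots = example_slots,
};

PyMODINIT_FUNC
PyInit_example(void)
{
    return PyModuleDef_Init(&example_def);
}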

[...] when a ZstdCompressor is concurrently used, only one thread will succeed in using it.
All other threads get an ZstdError instead with a message pointing to the wiki.

Do you have any thoughts on this?

@indygreg
Owner

indygreg commented Feb 9, 2025

Actually, looking at this PR more and the prior discussion, I'm a bit confused what the purpose of the atomics code actually is.

Our existing docs say that ZstdCompressor and ZstdDecompressor instances aren't thread safe. It was already possible for customers to footgun themselves on GIL builds by calling into multiple methods simultaneously, since the GIL would be released when calling into libzstd C code.

The atomics - if implemented globally (which this PR doesn't yet do) - could be a nice quality-of-life guard to catch code patterns that violate our thread safety guarantees. Is that the only problem it solves?

I initially/naively thought that free-threaded builds would require a fair amount of invasive code changes to prevent SEGFAULTs or similar crashes. But I think our existing API contract is essentially already free-threaded compatible since we're purposefully not multi-threaded safe at the ZstdCompressor and ZstdDecompressor level? If so and there is no new potential for crashes in free-threaded builds, I'm inclined to do the minimal thing possible to support free-threaded builds. We can then look into the lightweight atomic-based context guards as a quality-of-life improvement as a separate and follow-up PR.

@ngoldbaum

ngoldbaum commented Feb 9, 2025

Is that the only problem it solves?

Yes, exactly right.

I'm inclined to do the minimal thing possible to support free-threaded builds. We can then look into the lightweight atomic-based context guards as a quality-of-life improvement as a separate and follow-up PR.

That’s also the approach we took with NumPy - we didn’t initially put effort into fixing preexisting thread safety issues on the GIL-enabled build and focused instead on free-threading specific issues to get an initial release out.

@dpdani
Author

dpdani commented Feb 10, 2025

I'll remove the checks and the new test 👍

# Conflicts:
#	.github/workflows/test.yml
#	pyproject.toml
@dpdani
Author

dpdani commented Feb 10, 2025

I have reverted the changes regarding the runtime context-sharing checks and have merged main, but I won't have time to go through the version bumps today.
