Added free-threading CPython mode support in MLIR Python bindings #107103

vfdev-5 · 2024-09-03T13:06:13Z

Related to #105522

Description:

This PR is a joint work with Peter Hawkins (@hawkinsp) originally done by myself for pybind11 and then reworked to nanobind based on Peter's branch: https://github.com/hawkinsp/llvm-project/tree/nbdev .

Added free-threading CPython mode support for MLIR Python bindings
Added a test which can reveal data races when cpython and LLVM/MLIR compiled with TSAN

Context:

Related to Add support nogil mode in JAX for Python 3.13 jax-ml/jax#23073

github-actions · 2024-09-03T13:06:33Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2024-09-18T13:02:23Z

@llvm/pr-subscribers-mlir

Author: vfdev (vfdev-5)

Changes

Related to #105522

Description:

Added free-threading CPython mode support in MLIR Python bindings via pybind11
Updated python requirements

Context:

Related to Add support nogil mode in JAX for Python 3.13 jax-ml/jax#23073

Full diff: https://github.com/llvm/llvm-project/pull/107103.diff

17 Files Affected:

(modified) mlir/lib/Bindings/Python/AsyncPasses.cpp (+3-1)
(modified) mlir/lib/Bindings/Python/DialectGPU.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectLLVM.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectLinalg.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectNVGPU.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectPDL.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectQuant.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectSparseTensor.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/DialectTransform.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/ExecutionEngineModule.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/GPUPasses.cpp (+3-1)
(modified) mlir/lib/Bindings/Python/LinalgPasses.cpp (+3-1)
(modified) mlir/lib/Bindings/Python/MainModule.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/RegisterEverything.cpp (+1-1)
(modified) mlir/lib/Bindings/Python/SparseTensorPasses.cpp (+3-1)
(modified) mlir/lib/Bindings/Python/TransformInterpreter.cpp (+1-1)
(modified) mlir/python/requirements.txt (+5-3)

diff --git a/mlir/lib/Bindings/Python/AsyncPasses.cpp b/mlir/lib/Bindings/Python/AsyncPasses.cpp
index b611a758dbbb37..d34a164b6e30c2 100644
--- a/mlir/lib/Bindings/Python/AsyncPasses.cpp
+++ b/mlir/lib/Bindings/Python/AsyncPasses.cpp
@@ -11,11 +11,13 @@
 #include <pybind11/detail/common.h>
 #include <pybind11/pybind11.h>
 
+namespace py = pybind11;
+
 // -----------------------------------------------------------------------------
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlirAsyncPasses, m) {
+PYBIND11_MODULE(_mlirAsyncPasses, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Async Dialect Passes";
 
   // Register all Async passes on load.
diff --git a/mlir/lib/Bindings/Python/DialectGPU.cpp b/mlir/lib/Bindings/Python/DialectGPU.cpp
index 560a54bcd15919..5acfad007c3055 100644
--- a/mlir/lib/Bindings/Python/DialectGPU.cpp
+++ b/mlir/lib/Bindings/Python/DialectGPU.cpp
@@ -23,7 +23,7 @@ using namespace mlir::python::adaptors;
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlirDialectsGPU, m) {
+PYBIND11_MODULE(_mlirDialectsGPU, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR GPU Dialect";
   //===-------------------------------------------------------------------===//
   // AsyncTokenType
diff --git a/mlir/lib/Bindings/Python/DialectLLVM.cpp b/mlir/lib/Bindings/Python/DialectLLVM.cpp
index 42a4c8c0793ba8..2af133a061eb4b 100644
--- a/mlir/lib/Bindings/Python/DialectLLVM.cpp
+++ b/mlir/lib/Bindings/Python/DialectLLVM.cpp
@@ -134,7 +134,7 @@ void populateDialectLLVMSubmodule(const pybind11::module &m) {
       });
 }
 
-PYBIND11_MODULE(_mlirDialectsLLVM, m) {
+PYBIND11_MODULE(_mlirDialectsLLVM, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR LLVM Dialect";
 
   populateDialectLLVMSubmodule(m);
diff --git a/mlir/lib/Bindings/Python/DialectLinalg.cpp b/mlir/lib/Bindings/Python/DialectLinalg.cpp
index 2e54ebeb61fb10..118c4ab3f2f573 100644
--- a/mlir/lib/Bindings/Python/DialectLinalg.cpp
+++ b/mlir/lib/Bindings/Python/DialectLinalg.cpp
@@ -21,7 +21,7 @@ static void populateDialectLinalgSubmodule(py::module m) {
       "op.");
 }
 
-PYBIND11_MODULE(_mlirDialectsLinalg, m) {
+PYBIND11_MODULE(_mlirDialectsLinalg, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Linalg dialect.";
 
   populateDialectLinalgSubmodule(m);
diff --git a/mlir/lib/Bindings/Python/DialectNVGPU.cpp b/mlir/lib/Bindings/Python/DialectNVGPU.cpp
index 754e0a75b0abc7..4c962c403082cb 100644
--- a/mlir/lib/Bindings/Python/DialectNVGPU.cpp
+++ b/mlir/lib/Bindings/Python/DialectNVGPU.cpp
@@ -34,7 +34,7 @@ static void populateDialectNVGPUSubmodule(const pybind11::module &m) {
       py::arg("ctx") = py::none());
 }
 
-PYBIND11_MODULE(_mlirDialectsNVGPU, m) {
+PYBIND11_MODULE(_mlirDialectsNVGPU, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR NVGPU dialect.";
 
   populateDialectNVGPUSubmodule(m);
diff --git a/mlir/lib/Bindings/Python/DialectPDL.cpp b/mlir/lib/Bindings/Python/DialectPDL.cpp
index 8d3f9a7ab1d6ac..e8542d5e777a65 100644
--- a/mlir/lib/Bindings/Python/DialectPDL.cpp
+++ b/mlir/lib/Bindings/Python/DialectPDL.cpp
@@ -100,7 +100,7 @@ void populateDialectPDLSubmodule(const pybind11::module &m) {
       py::arg("context") = py::none());
 }
 
-PYBIND11_MODULE(_mlirDialectsPDL, m) {
+PYBIND11_MODULE(_mlirDialectsPDL, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR PDL dialect.";
   populateDialectPDLSubmodule(m);
 }
diff --git a/mlir/lib/Bindings/Python/DialectQuant.cpp b/mlir/lib/Bindings/Python/DialectQuant.cpp
index af9cdc7bdd2d89..fc6ef9f46ce8e5 100644
--- a/mlir/lib/Bindings/Python/DialectQuant.cpp
+++ b/mlir/lib/Bindings/Python/DialectQuant.cpp
@@ -307,7 +307,7 @@ static void populateDialectQuantSubmodule(const py::module &m) {
   });
 }
 
-PYBIND11_MODULE(_mlirDialectsQuant, m) {
+PYBIND11_MODULE(_mlirDialectsQuant, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Quantization dialect";
 
   populateDialectQuantSubmodule(m);
diff --git a/mlir/lib/Bindings/Python/DialectSparseTensor.cpp b/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
index 584981cfe99bf1..d4f35859fdcf1a 100644
--- a/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
+++ b/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
@@ -142,7 +142,7 @@ static void populateDialectSparseTensorSubmodule(const py::module &m) {
       });
 }
 
-PYBIND11_MODULE(_mlirDialectsSparseTensor, m) {
+PYBIND11_MODULE(_mlirDialectsSparseTensor, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR SparseTensor dialect.";
   populateDialectSparseTensorSubmodule(m);
 }
diff --git a/mlir/lib/Bindings/Python/DialectTransform.cpp b/mlir/lib/Bindings/Python/DialectTransform.cpp
index 6b57e652aa9d8b..df665dd66bdc28 100644
--- a/mlir/lib/Bindings/Python/DialectTransform.cpp
+++ b/mlir/lib/Bindings/Python/DialectTransform.cpp
@@ -117,7 +117,7 @@ void populateDialectTransformSubmodule(const pybind11::module &m) {
       "Get the type this ParamType is associated with.");
 }
 
-PYBIND11_MODULE(_mlirDialectsTransform, m) {
+PYBIND11_MODULE(_mlirDialectsTransform, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Transform dialect.";
   populateDialectTransformSubmodule(m);
 }
diff --git a/mlir/lib/Bindings/Python/ExecutionEngineModule.cpp b/mlir/lib/Bindings/Python/ExecutionEngineModule.cpp
index b3df30583fc963..ddd81d1e7d592e 100644
--- a/mlir/lib/Bindings/Python/ExecutionEngineModule.cpp
+++ b/mlir/lib/Bindings/Python/ExecutionEngineModule.cpp
@@ -64,7 +64,7 @@ class PyExecutionEngine {
 } // namespace
 
 /// Create the `mlir.execution_engine` module here.
-PYBIND11_MODULE(_mlirExecutionEngine, m) {
+PYBIND11_MODULE(_mlirExecutionEngine, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Execution Engine";
 
   //----------------------------------------------------------------------------
diff --git a/mlir/lib/Bindings/Python/GPUPasses.cpp b/mlir/lib/Bindings/Python/GPUPasses.cpp
index e276a3ce3a56a0..bfc43e99946bb9 100644
--- a/mlir/lib/Bindings/Python/GPUPasses.cpp
+++ b/mlir/lib/Bindings/Python/GPUPasses.cpp
@@ -11,11 +11,13 @@
 #include <pybind11/detail/common.h>
 #include <pybind11/pybind11.h>
 
+namespace py = pybind11;
+
 // -----------------------------------------------------------------------------
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlirGPUPasses, m) {
+PYBIND11_MODULE(_mlirGPUPasses, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR GPU Dialect Passes";
 
   // Register all GPU passes on load.
diff --git a/mlir/lib/Bindings/Python/LinalgPasses.cpp b/mlir/lib/Bindings/Python/LinalgPasses.cpp
index 3f230207a42114..e3d8f237e2bab3 100644
--- a/mlir/lib/Bindings/Python/LinalgPasses.cpp
+++ b/mlir/lib/Bindings/Python/LinalgPasses.cpp
@@ -10,11 +10,13 @@
 
 #include <pybind11/pybind11.h>
 
+namespace py = pybind11;
+
 // -----------------------------------------------------------------------------
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlirLinalgPasses, m) {
+PYBIND11_MODULE(_mlirLinalgPasses, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Linalg Dialect Passes";
 
   // Register all Linalg passes on load.
diff --git a/mlir/lib/Bindings/Python/MainModule.cpp b/mlir/lib/Bindings/Python/MainModule.cpp
index 8da1ab16a4514b..de713e7031a01e 100644
--- a/mlir/lib/Bindings/Python/MainModule.cpp
+++ b/mlir/lib/Bindings/Python/MainModule.cpp
@@ -22,7 +22,7 @@ using namespace mlir::python;
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlir, m) {
+PYBIND11_MODULE(_mlir, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Python Native Extension";
 
   py::class_<PyGlobals>(m, "_Globals", py::module_local())
diff --git a/mlir/lib/Bindings/Python/RegisterEverything.cpp b/mlir/lib/Bindings/Python/RegisterEverything.cpp
index 6b2f6b0a6a3b86..5c5c6e32102712 100644
--- a/mlir/lib/Bindings/Python/RegisterEverything.cpp
+++ b/mlir/lib/Bindings/Python/RegisterEverything.cpp
@@ -9,7 +9,7 @@
 #include "mlir-c/RegisterEverything.h"
 #include "mlir/Bindings/Python/PybindAdaptors.h"
 
-PYBIND11_MODULE(_mlirRegisterEverything, m) {
+PYBIND11_MODULE(_mlirRegisterEverything, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR All Upstream Dialects, Translations and Passes Registration";
 
   m.def("register_dialects", [](MlirDialectRegistry registry) {
diff --git a/mlir/lib/Bindings/Python/SparseTensorPasses.cpp b/mlir/lib/Bindings/Python/SparseTensorPasses.cpp
index 2a8e2b802df9c4..1bbdf2f3ccf4e5 100644
--- a/mlir/lib/Bindings/Python/SparseTensorPasses.cpp
+++ b/mlir/lib/Bindings/Python/SparseTensorPasses.cpp
@@ -10,11 +10,13 @@
 
 #include <pybind11/pybind11.h>
 
+namespace py = pybind11;
+
 // -----------------------------------------------------------------------------
 // Module initialization.
 // -----------------------------------------------------------------------------
 
-PYBIND11_MODULE(_mlirSparseTensorPasses, m) {
+PYBIND11_MODULE(_mlirSparseTensorPasses, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR SparseTensor Dialect Passes";
 
   // Register all SparseTensor passes on load.
diff --git a/mlir/lib/Bindings/Python/TransformInterpreter.cpp b/mlir/lib/Bindings/Python/TransformInterpreter.cpp
index f6b4532b1b6be4..93ab447d52bec1 100644
--- a/mlir/lib/Bindings/Python/TransformInterpreter.cpp
+++ b/mlir/lib/Bindings/Python/TransformInterpreter.cpp
@@ -99,7 +99,7 @@ static void populateTransformInterpreterSubmodule(py::module &m) {
       py::arg("target"), py::arg("other"));
 }
 
-PYBIND11_MODULE(_mlirTransformInterpreter, m) {
+PYBIND11_MODULE(_mlirTransformInterpreter, m, py::mod_gil_not_used()) {
   m.doc() = "MLIR Transform dialect interpreter functionality.";
   populateTransformInterpreterSubmodule(m);
 }
diff --git a/mlir/python/requirements.txt b/mlir/python/requirements.txt
index d1b5418cca5b23..49b8471c6b771c 100644
--- a/mlir/python/requirements.txt
+++ b/mlir/python/requirements.txt
@@ -1,4 +1,6 @@
-numpy>=1.19.5, <=1.26
-pybind11>=2.9.0, <=2.10.3
+numpy>=1.19.5, <3.0
+# pybind11>=2.14.0, <2.15.0
+# Temporarily set pybind11 version to master waiting the next release to 2.13.6
+pybind11 @ git+https://github.com/pybind/pybind11@master
 PyYAML>=5.4.0, <=6.0.1
-ml_dtypes>=0.1.0, <=0.4.0   # provides several NumPy dtype extensions, including the bf16
+ml_dtypes>=0.5.0, <=0.6.0   # provides several NumPy dtype extensions, including the bf16

makslevental · 2024-09-18T23:02:05Z

I hit "run tests" but I don't think this is going to work - there are places where we do assume GIL. I don't remember exactly (I believe it's in some of the custom stuff in pybind_adaptors). I can take a closer look soon.

vfdev-5 · 2024-09-19T13:40:58Z

@makslevental thanks for the feedback. At the moment, there is no CI job using python 3.13.0rc1 with free-threading enabled, we may want to add a CI job for that...

I run existing python tests on python 3.13 with free-threading and all of them are passing using pybind11 from source (otherwise, there is a deadlock (pybind/pybind11#5346)).

I don't remember exactly (I believe it's in some of the custom stuff in pybind_adaptors). I can take a closer look soon.

I'll check from my side as well.

Here are few docs on free-threading support in case:

vfdev-5 · 2024-09-22T19:53:06Z

Set this PR as draft as found non-protected for multi-threading C++ code using liveContexts in PyMlirContext which is failing in my local multi-threaded tests.

stellaraccident · 2024-09-22T20:34:16Z

Thanks for this. I've been burning through free threading deps in other libraries but hadn't come to this one yet.

Given that this is all very bleeding edge, it may be a little bit before we have proper CI for it upstream, but I'm happy to accept patches that improve free threaded compatibility in the meantime, especially if you have a test rig.

stellaraccident · 2024-09-22T20:38:26Z

Maks is right: three are a couple of places where we use mutable global state and we should come up with a convention to protect those with a global mutex. I know there is an idiom for this in CPython itself, but is there a common thing done for pybind/extensions yet? (Basically where it no ops in gil builds but uses a scoped mutex in free threaded builds)

Let's verify that before marking the modules free threaded compatible.

vfdev-5 · 2024-09-22T20:45:49Z

@stellaraccident thanks, i have few questions though about the current python bindings tests using lit.

My local tests are rather simple and I would like to improve that:

import typer

import threading
import concurrent.futures

import mlir.dialects.arith as arith
from mlir.ir import Context, Location, Module, IntegerType, F64Type, InsertionPoint


def mt_run(fn, num_threads, args=(), kwargs={}):
    barrier = threading.Barrier(num_threads)

    def closure():
        barrier.wait()
        return fn(*args, **kwargs)

    with concurrent.futures.ThreadPoolExecutor(
        max_workers=num_threads
    ) as executor:
        futures = []
        for _ in range(num_threads):
            futures.append(executor.submit(closure))
        # We should call future.result() to re-raise an exception if test has
        # failed
        return list(f.result() for f in futures)


def func():
    py_values = [123, 234, 345]
    with Context() as ctx:
        module = Module.create(loc=Location.file("foo.txt", 0, 0))

        dtype = IntegerType.get_signless(64)
        with InsertionPoint(module.body), Location.name("a"):
        # with Location.name("a"):
            arith.constant(dtype, py_values[0])

        with InsertionPoint(module.body), Location.name("b"):
        # with Location.name("b"):
            arith.constant(dtype, py_values[1])

        with InsertionPoint(module.body), Location.name("c"):
        # with Location.name("c"):
            arith.constant(dtype, py_values[2])

    return str(module)


def func2():
    py_values = [123, 234, 345]
    with Context() as ctx, Location.file("foo.txt", 0, 0):
        module = Module.create()
        with InsertionPoint(module.body):
            dtype = IntegerType.get_signless(64)
            arith.constant(dtype, py_values[0])

    return str(module)

def test(func, num_threads=10, expected_first = False):

    if expected_first:
        expected_mlir = func()
        print("Expected MLIR:", expected_mlir)

    output_mlir_list = mt_run(func, num_threads=num_threads)

    if not expected_first:
        expected_mlir = func()
        print("Expected MLIR:", expected_mlir)

    for i, output_mlir in enumerate(output_mlir_list):
        assert output_mlir == expected_mlir, (i, output_mlir, expected_mlir)


def main(
    n: int = 100,
    name: str = "test",
    nt: int = 10,
    ef: bool = False,
):
    test_fn = {
        "test": func,
        "test2": func2,
    }[name]
    for i in range(n):
        print("- Count: ", i)
        test(test_fn, num_threads=nt, expected_first=ef)


if __name__ == "__main__":
    typer.run(main)

Ideally, making existing tests to run in a multi-threaded execution (either providing a manual implementation or using tools like: https://github.com/Quansight-Labs/pytest-run-parallel).
Seems like lit is running tests and using stdout output checks which may not always work correctly with multi-treaded execution... Do you think we could reuse somehow exising tests with lit in multi-threaded execution?

we should come up with a convention to protect those with a global mutex. I know there is an idiom for this in CPython itself, but is there a common thing done for pybind/extensions yet?

Yes, there is an example in pybind11 for that:
https://github.com/pybind/pybind11/blob/1f8b4a7f1a1c5cc9bd6e0d63fe15540e6c458b24/include/pybind11/detail/internals.h#L645-L649

I applied a similar thing for getLiveContexts (locally):

#ifdef Py_GIL_DISABLED
  static PyMutex &getLock() {
    static PyMutex lock;
    return lock;
  }
#endif

  template<typename F>
  static inline auto withLiveContexts(const F& cb) -> decltype(cb(getLiveContexts())) {
    auto &liveContexts = getLiveContexts();
#ifdef Py_GIL_DISABLED
    auto &lock = getLock();
    PyMutex_Lock(&lock);
#endif
    auto result = cb(liveContexts);
#ifdef Py_GIL_DISABLED
    PyMutex_Unlock(&lock);
#endif
    return result;
  }

Usage:

PyMlirContext::PyMlirContext(MlirContext context) : context(context) {
  py::gil_scoped_acquire acquire;

  withLiveContexts([&](LiveContextMap& liveContexts) {
    liveContexts[context.ptr] = this;
    return this;
  });

}

stellaraccident · 2024-09-22T20:58:10Z

That locking idiom looks good to me. Thanks.

For the tests, I won't claim to love the current testing style, but as with most such things, it is costly to change. If this weren't a part of llvm, I never would have written the tests this way, but I digress...

I suppose we should discuss what level of threading compatibility we are targeting (and likely document that). It seems where this patch is going is ensuring that the library itself does not provide any inherent locking hazards if independent contexts are used from multiple threads. Beyond that, the thread safety is approximately what it is for the underlying mlir libraries, which is to say that it is unsafe to perform mutation on any IR structure that you are unsure as to whether it is being exclusively access by a single thread. Most non framework code will just assume thread per context as the safe way to use the API.

That is quite a bit less safe than the CPython itself, which guards mutable data structures (list, shared dict, etc) internally, making it possible to have logical consistency problems is used across threads in an unsafe way, but internally guarding against corruption of the data structures themselves.

I'm not aware of a consensus that has emerged on the level of safety that a library such as this should provide. Fine grained locking at the binding layer seems like it would be expensive and error prone, especially given the more relaxed stance of the underlying library.

Thoughts?

(The reason this applies to testing is that if we are doing course locking of globals only, then we can likely target some specific tests that verify that. But if doing fine grained locking of IR structures, we probably need to rewrite the test suite into something that can be better verified with parallel testing)

vfdev-5 · 2024-09-23T16:06:50Z

Thanks for the reply @stellaraccident !

Yes, we should discuss what level of threading compatibility we are targeting.

It seems where this patch is going is ensuring that the library itself does not provide any inherent locking hazards if independent contexts are used from multiple threads.

IMO the idea behind enabling CPython free-threading mode in python extensions (like MLIR python bindings) is to make them thread-safe as they are with GIL-enabled CPython.

I'm not aware of a consensus that has emerged on the level of safety that a library such as this should provide. Fine grained locking at the binding layer seems like it would be expensive and error prone, especially given the more relaxed stance of the underlying library.

Can you please detail this statement? I was thinking that we wanted to make python bindings thread-safe for free-threading cpython...

Few links on thread safety discussion within free-threading:

https://discuss.python.org/t/free-threading-trove-classifier/62406/23

stellaraccident · 2024-09-23T18:12:40Z

Can you please detail this statement? I was thinking that we wanted to make python bindings thread-safe for free-threading cpython...

If you want to make them thread-safe, then we will need to add much more locking, likely choosing to use context-scoped locks. The issue is that the underlying MLIR library is thread-compatible but not thread-safe. The GIL is currently making usage from Python more thread safe than MLIR actually is. Concretely, think of two threads attempting to modify the operations in the same function as an example of what would happen without more binding-level locking.

Given the bleeding edge nature of free-threaded Python, I don't think we need to get to thread-safe in one step, but we should document and decide on where we are trying to take things. Then we could document a big warning saying that the library is not presently fully thread safe in [list ways].

I'm supportive of making incremental progress in the right direction, so long as we document where we are trying to get to and are clear with folks what the current safety level is.

vfdev-5 · 2024-09-24T13:27:13Z

@stellaraccident thanks for the discussion, I agree with your suggestion about documenting the scope of what free-threading would enable.

Concerning this part:

If you want to make them thread-safe, then we will need to add much more locking, likely choosing to use context-scoped locks. The issue is that the underlying MLIR library is thread-compatible but not thread-safe. The GIL is currently making usage from Python more thread safe than MLIR actually is. Concretely, think of two threads attempting to modify the operations in the same function as an example of what would happen without more binding-level locking.

I would like still to highlight the point that free-threading aims:

to retains the thread safety guarantees that were in place before the GIL was removed, and it doesn’t either add nor remove any. So that, for instance, my_int += 1 is unsafe both in the free-threaded and in the default build.
The fact that there is no GIL doesn’t mean that everything must now be thread safe.

(source: https://discuss.python.org/t/free-threading-trove-classifier/62406/24)

A similar situation is now with Numpy which is thread-unsafe to use from both the GIL-enabled and GIL-disabled build and user needs to be careful not to mutate shared state (cc @ngoldbaum).

I'm supportive of making incremental progress in the right direction, so long as we document where we are trying to get to and are clear with folks what the current safety level is.

Sounds good, so let's to make MLIR Python bindings tread-safe in free threading mode and document it.

stellaraccident · 2024-09-24T13:40:02Z

Yes, but there is an important distinction: my_int += 1 will not crash in either build. Mutating shared IR state from multiple threads with the gil would produce inconsistent/logical errors but would not crash. Without the gil and without some form of extra locking, it will crash as the underlying IR structures are thread compatible but not thread safe.

If the cost of fine grain locking is excessive, many projects (mine included) would probably opt to build the extension without locking, since they always use it in thread per context anyway and the IR building is not a user exposed thing.

hawkinsp · 2024-09-24T16:07:08Z

I'll also +1 that adding fine-grain locking may be problematic. The MLIR Python builders are already slow, and adding lots of fine grained locking, even uncontended locking, will make that worse. But in JAX, as in Stella's case, we never access the same IR objects from multiple threads, and if we did need to do that we could add client-side locking. I think that's by far the most important mode to support.

stellaraccident · 2024-09-24T16:15:47Z

I'll also +1 that adding fine-grain locking may be problematic. The MLIR Python builders are already slow, and adding lots of fine grained locking, even uncontended locking, will make that worse. But in JAX, as in Stella's case, we never access the same IR objects from multiple threads, and if we did need to do that we could add client-side locking. I think that's by far the most important mode to support.

Off topic, but one of these days, I'm going to have to create a stripped down, build only (probably nanobind based) fast version of the API. Not at the top of the list right now, but yeah... slow. On the torch side, we only use the non-ODS API so that we have more flexibility here in the future.

hawkinsp · 2024-09-24T16:20:15Z

Off topic, but one of these days, I'm going to have to create a stripped down, build only (probably nanobind based) fast version of the API. Not at the top of the list right now, but yeah... slow. On the torch side, we only use the non-ODS API so that we have more flexibility here in the future.

Yeah, at some point soon I'm either going to end up trying to send PRs to speed up the existing bindings (switch to nanobind?) or I'll migrate off them and do my own thing.

stellaraccident · 2024-09-24T16:31:30Z

Off topic, but one of these days, I'm going to have to create a stripped down, build only (probably nanobind based) fast version of the API. Not at the top of the list right now, but yeah... slow. On the torch side, we only use the non-ODS API so that we have more flexibility here in the future.

Yeah, at some point soon I'm either going to end up trying to send PRs to speed up the existing bindings (switch to nanobind?) or I'll migrate off them and do my own thing.

I haven't benchmarked it in detail, but there are quite a bit of "productivity features" in there. I don't think for fastest-path builder use, the general IR manipulation APIs are going to be super easy to upgrade -- and they do get use. If the whole thing was ported to nanobind, I'd feel better about contributing a build-only fast-path subset library that could live alongside. In the current state, that would go out of tree because if going to the trouble to write something fast, I'm not going to base it on pybind.

mlir/test/python/multithreaded_tests.py

mlir/lib/Bindings/Python/ExecutionEngineModule.cpp

github-actions · 2025-01-08T10:57:07Z

✅ With the latest revision this PR passed the Python code formatter.

github-actions · 2025-01-08T10:57:07Z

✅ With the latest revision this PR passed the C/C++ code formatter.

hawkinsp · 2025-01-08T13:01:03Z

To me, this looks ready for a review from one of the MLIR Python owners.

mlir/test/python/multithreaded_tests.py

WIP freethreading Added free-threading CPython mode support in Python bindings - temporarily updated requirements Added lock on PyGlobals::get and PyMlirContext liveContexts WIP on adding multithreaded_tests More tests and added a lock to _cext.register_operation Updated tests + labelled passing and failing tests Updated locks and added docs Docs updates, tests and fixed cmake config issue Fixed nanobind target in mlir/cmake/modules/AddMLIRPython.cmake Reverted mlir/test/python/lib/PythonTestModulePybind11.cpp Removed old data-races stacks and removed fixed tests Revert some python deps versions Recoded mlir/test/python/multithreaded_tests.py without pytest Added a condition to skip running mlir/test/python/multithreaded_tests.py when has GIL Addressed some of the PR review comments Updated Python docs and mlir/test/python/multithreaded_tests.py Updated mlir/test/python/multithreaded_tests.py due to removed lock in ExecutionEngine

jpienaar

LG, thanks! The test section could have been good with less manual lists - but that's probably as much on us currently in how tests are written than anything else. Also this increases testing/comfort in free-threading a lot.

github-actions · 2025-01-12T17:57:07Z

@vfdev-5 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

llvm-ci · 2025-01-12T18:02:03Z

LLVM Buildbot has detected a new failure on builder mlir-rocm-mi200 running on mi200-buildbot while building mlir at step 3 "install-mlir-requirements".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/177/builds/11130

Here is the relevant piece of the build log for the reference

Step 3 (install-mlir-requirements) failure: 'pip install ...' (failure)
ERROR: Ignored the following versions that require a different python version: 0.3.0 Requires-Python >=3.9; 0.3.1 Requires-Python >=3.9; 0.3.2 Requires-Python >=3.9; 0.4.0 Requires-Python >=3.9; 0.4.0b1 Requires-Python >=3.9; 0.4.1 Requires-Python >=3.9; 0.5.0 Requires-Python >=3.9; 0.5.1 Requires-Python >=3.9
ERROR: Could not find a version that satisfies the requirement ml_dtypes<=0.6.0,>=0.5.0 (from versions: 0.0.1, 0.0.2rc4, 0.0.2rc5, 0.0.2, 0.0.3, 0.0.4, 0.1.0, 0.2.0)

[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: python3 -m pip install --upgrade pip
ERROR: No matching distribution found for ml_dtypes<=0.6.0,>=0.5.0
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: python/multithreaded_tests.py' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/usr/bin/python3.8 /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/python/multithreaded_tests.py
# executed command: /usr/bin/python3.8 /vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/python/multithreaded_tests.py
# .---command stderr------------
# | Traceback (most recent call last):
# |   File "/vol/worker/mi200-buildbot/mlir-rocm-mi200/llvm-project/mlir/test/python/multithreaded_tests.py", line 429, in <module>
# |     skip_tests: Optional[list[str]] = None,
# | TypeError: 'type' object is not subscriptable
# `-----------------------------
# error: command failed with exit status: 1

--

********************

…ings (#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

@hawkinsp

…vm#107103) Related to llvm#105522 Description: This PR is a joint work with Peter Hawkins (@hawkinsp) originally done by myself for pybind11 and then reworked to nanobind based on Peter's branch: https://github.com/hawkinsp/llvm-project/tree/nbdev . - Added free-threading CPython mode support for MLIR Python bindings - Added a test which can reveal data races when cpython and LLVM/MLIR compiled with TSAN Context: - Related to jax-ml/jax#23073 Co-authored-by: Peter Hawkins <phawkins@google.com>

…ings (llvm#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

@hawkinsp

…vm#107103) Related to llvm#105522 Description: This PR is a joint work with Peter Hawkins (@hawkinsp) originally done by myself for pybind11 and then reworked to nanobind based on Peter's branch: https://github.com/hawkinsp/llvm-project/tree/nbdev . - Added free-threading CPython mode support for MLIR Python bindings - Added a test which can reveal data races when cpython and LLVM/MLIR compiled with TSAN Context: - Related to jax-ml/jax#23073 Co-authored-by: Peter Hawkins <phawkins@google.com>

…ings (llvm#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

@jpienaar

Reland reverted #107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

@jpienaar

…2684) Reland reverted llvm/llvm-project#107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

@jpienaar

Reland reverted llvm#107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

@hawkinsp

…vm#107103) Related to llvm#105522 Description: This PR is a joint work with Peter Hawkins (@hawkinsp) originally done by myself for pybind11 and then reworked to nanobind based on Peter's branch: https://github.com/hawkinsp/llvm-project/tree/nbdev . - Added free-threading CPython mode support for MLIR Python bindings - Added a test which can reveal data races when cpython and LLVM/MLIR compiled with TSAN Context: - Related to jax-ml/jax#23073 Co-authored-by: Peter Hawkins <phawkins@google.com>

…ings (llvm#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

@jpienaar

Reland reverted llvm#107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

vfdev-5 changed the title ~~[WIP] Added free-threading CPython mode support in Python bindings~~ [WIP] Added free-threading CPython mode support in MLIR Python bindings Sep 3, 2024

vfdev-5 force-pushed the free-threading-support branch from 57e7293 to ccb4cba Compare September 3, 2024 16:06

vfdev-5 force-pushed the free-threading-support branch from ccb4cba to 4496221 Compare September 18, 2024 10:54

vfdev-5 marked this pull request as ready for review September 18, 2024 13:01

vfdev-5 requested review from ftynse, makslevental and stellaraccident as code owners September 18, 2024 13:01

llvmbot added mlir:python MLIR Python bindings mlir labels Sep 18, 2024

vfdev-5 changed the title ~~[WIP] Added free-threading CPython mode support in MLIR Python bindings~~ Added free-threading CPython mode support in MLIR Python bindings Sep 18, 2024

vfdev-5 marked this pull request as draft September 22, 2024 19:50

vfdev-5 force-pushed the free-threading-support branch 2 times, most recently from de21571 to cdd2b03 Compare November 5, 2024 21:38

hawkinsp reviewed Jan 6, 2025

View reviewed changes

mlir/test/python/multithreaded_tests.py Outdated Show resolved Hide resolved

mlir/test/python/multithreaded_tests.py Show resolved Hide resolved

vfdev-5 force-pushed the free-threading-support branch from 0e99c15 to 8051458 Compare January 7, 2025 11:31

vfdev-5 marked this pull request as draft January 7, 2025 11:31

vfdev-5 commented Jan 7, 2025

View reviewed changes

mlir/lib/Bindings/Python/ExecutionEngineModule.cpp Outdated Show resolved Hide resolved

vfdev-5 force-pushed the free-threading-support branch from 8051458 to 18656d9 Compare January 8, 2025 10:53

vfdev-5 force-pushed the free-threading-support branch from 18656d9 to b51be30 Compare January 8, 2025 11:17

vfdev-5 marked this pull request as ready for review January 8, 2025 11:20

vfdev-5 force-pushed the free-threading-support branch from b51be30 to 0ad1b5f Compare January 8, 2025 13:13

jpienaar reviewed Jan 10, 2025

View reviewed changes

mlir/test/python/multithreaded_tests.py Outdated Show resolved Hide resolved

mlir/test/python/multithreaded_tests.py Outdated Show resolved Hide resolved

mlir/test/python/multithreaded_tests.py Show resolved Hide resolved

vfdev-5 force-pushed the free-threading-support branch from 0ad1b5f to 19ea464 Compare January 10, 2025 23:32

jpienaar approved these changes Jan 12, 2025

View reviewed changes

jpienaar merged commit 9dee7c4 into llvm:main Jan 12, 2025
8 checks passed

jpienaar added a commit that referenced this pull request Jan 12, 2025

Revert "Added free-threading CPython mode support in MLIR Python bind…

3f1486f

…ings (#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

Mel-Chen pushed a commit to Mel-Chen/llvm-project that referenced this pull request Jan 13, 2025

Revert "Added free-threading CPython mode support in MLIR Python bind…

964aa03

…ings (llvm#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

vfdev-5 mentioned this pull request Jan 13, 2025

Enabled freethreading support in MLIR python bindings #122684

Merged

jpienaar pushed a commit that referenced this pull request Jan 13, 2025

Enabled freethreading support in MLIR python bindings (#122684)

f136c80

Reland reverted #107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

DKLoehr pushed a commit to DKLoehr/llvm-project that referenced this pull request Jan 17, 2025

Revert "Added free-threading CPython mode support in MLIR Python bind…

1f30573

…ings (llvm#107103)" Breaks on 3.8, rolling back to avoid breakage while fixing. This reverts commit 9dee7c4.

DKLoehr pushed a commit to DKLoehr/llvm-project that referenced this pull request Jan 17, 2025

Enabled freethreading support in MLIR python bindings (llvm#122684)

032f47e

Reland reverted llvm#107103 with the fixes for Python 3.8 cc @jpienaar Co-authored-by: Peter Hawkins <phawkins@google.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added free-threading CPython mode support in MLIR Python bindings #107103

Added free-threading CPython mode support in MLIR Python bindings #107103

vfdev-5 commented Sep 3, 2024 •

edited

Loading

github-actions bot commented Sep 3, 2024

llvmbot commented Sep 18, 2024

makslevental commented Sep 18, 2024

vfdev-5 commented Sep 19, 2024 •

edited

Loading

vfdev-5 commented Sep 22, 2024

stellaraccident commented Sep 22, 2024

stellaraccident commented Sep 22, 2024 •

edited

Loading

vfdev-5 commented Sep 22, 2024 •

edited

Loading

stellaraccident commented Sep 22, 2024 •

edited

Loading

vfdev-5 commented Sep 23, 2024 •

edited

Loading

stellaraccident commented Sep 23, 2024

vfdev-5 commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

hawkinsp commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

hawkinsp commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

github-actions bot commented Jan 8, 2025 •

edited

Loading

github-actions bot commented Jan 8, 2025 •

edited

Loading

hawkinsp commented Jan 8, 2025

jpienaar left a comment

github-actions bot commented Jan 12, 2025

llvm-ci commented Jan 12, 2025

Added free-threading CPython mode support in MLIR Python bindings #107103

Added free-threading CPython mode support in MLIR Python bindings #107103

Conversation

vfdev-5 commented Sep 3, 2024 • edited Loading

github-actions bot commented Sep 3, 2024

llvmbot commented Sep 18, 2024

makslevental commented Sep 18, 2024

vfdev-5 commented Sep 19, 2024 • edited Loading

vfdev-5 commented Sep 22, 2024

stellaraccident commented Sep 22, 2024

stellaraccident commented Sep 22, 2024 • edited Loading

vfdev-5 commented Sep 22, 2024 • edited Loading

stellaraccident commented Sep 22, 2024 • edited Loading

vfdev-5 commented Sep 23, 2024 • edited Loading

stellaraccident commented Sep 23, 2024

vfdev-5 commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

hawkinsp commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

hawkinsp commented Sep 24, 2024

stellaraccident commented Sep 24, 2024

github-actions bot commented Jan 8, 2025 • edited Loading

github-actions bot commented Jan 8, 2025 • edited Loading

hawkinsp commented Jan 8, 2025

jpienaar left a comment

Choose a reason for hiding this comment

github-actions bot commented Jan 12, 2025

llvm-ci commented Jan 12, 2025

vfdev-5 commented Sep 3, 2024 •

edited

Loading

vfdev-5 commented Sep 19, 2024 •

edited

Loading

stellaraccident commented Sep 22, 2024 •

edited

Loading

vfdev-5 commented Sep 22, 2024 •

edited

Loading

stellaraccident commented Sep 22, 2024 •

edited

Loading

vfdev-5 commented Sep 23, 2024 •

edited

Loading

github-actions bot commented Jan 8, 2025 •

edited

Loading

github-actions bot commented Jan 8, 2025 •

edited

Loading