Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MSVC MT/MD incompatibility in PYBIND11_BUILD_ABI #4953

Merged
merged 22 commits into from
Nov 10, 2024

Conversation

isuruf
Copy link
Contributor

@isuruf isuruf commented Nov 28, 2023

Description

The ABI is compatible between different versions of MSVC 14.* (2015 - 14.0, 2017 - 14.1, 2019 - 14.2, 2022 - 14.3).
However, the internal structures are different between these versions. For eg: the internal details of how C++ structures like maps are implemented might be different. (I'm not saying they are, but how maps are implemented is an implementation detail)

With /MT you are statically linking in the C runtime and when allocating one structure in a DLL created with a different MSVC version, runtime linked statically, and trying to use it in another DLL linked to a different MSVC runtime (static or dynamic) results in crashes. That's why when you link with /MT and is crossing DLL boundaries you need them to have the same version.

However, if you are using /MD you are using only one C runtime and the internal structures are consistent. Therefore you don't need to check MSVC version.

See https://learn.microsoft.com/en-us/cpp/porting/binary-compat-2015-2017?view=msvc-170


For increased safety, this PR also eliminates the fall-through case leaving PYBIND11_BUILD_ABI blank.

Suggested changelog entry:

Properly handle MSVC MT/MD incompatibility in `PYBIND11_BUILD_ABI`.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

Could you please explain the rationale in the PR description?

IIUC, currently this PR undoes #4779 for /MD, is that the right thing to do?

Because ultimately we're sharing C++ pointers between Python extension modules. The C++ ABI must be compatible, independently of everything else. Falling back to define PYBIND11_BUILD_ABI "" (IIUC) seems very suspicious.

I don't know a lot about Windows linking and the various options. I'd appreciate if you could take the time to explain carefully in the PR description so that people like myself can clearly see what is correct and why. (So that we don't have to come back here again.)

@isuruf
Copy link
Contributor Author

isuruf commented Nov 28, 2023

IIUC, currently this PR undoes #4779 for /MD, is that the right thing to do?

Yes, the ABI is compatible between different versions of MSVC 14.* (2015 - 14.0, 2017 - 14.1, 2019 - 14.2, 2022 - 14.3).
However, the internal structures are different between these versions.

With /MT you are statically linking in the C runtime and when allocating one structure in a DLL created with a different MSVC version, runtime linked statically, and trying to use it in another DLL linked to a different MSVC runtime (static or dynamic) results in crashes. That's why when you link with /MT and is crossing DLL boundaries you need them to have the same version.

However, if you are using /MD you are using only one C runtime and the internal structures are consistent. Therefore you don't need to check MSVC version.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

Yes, the ABI is compatible between different versions of MSVC 14.* (2015 - 14.0, 2017 - 14.1, 2019 - 14.2, 2022 - 14.3).

Amazing.

However, the internal structures are different between these versions.

Could you please clarify what the "internal structures" are?

If you move your comment with that clarification to the PR description it'll look good to me.

Please also point to this PR from the comment in internals.h (See (#4779) -> See (#4953)).

@wjakob this looks like something we want to keep in sync between pybind11 and nanobind.

Copy link
Collaborator

@rwgk rwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm glad you caught this before we made a release.

@wjakob
Copy link
Member

wjakob commented Nov 28, 2023

I am not so sure about this one @rwgk @isuruf . Consider the following scenario:

  • User A builds an extension with an ancient version of MSVC.
  • User B builds an extension with a really recent one.
  • Once imported into the same Python interpreter, both the extension of user A and B will use the same runtime C library. This means that calls to functions like malloc() and free() are consistently handled by the same heap. So far so good.
  • Let's also say that this patch gets merged, and both of these users compile with /MD. This means that their extensions have the same pybind11 ABI key. So they are allowed to talk to each other via pybind11's internal data structures.

This is where we are leaving the world of C ABI compatibility world and entering the world of C++ ABI compatibility. The opener of the PR says:

However, the internal structures are different between these versions. For eg: the internal details of how C++ structures like maps are implemented might be different.

This is precisely the issue. And it is also the issue in the other related PR (#3793) and the CondaForge issue (conda-forge/pybind11-feedstock#79), where I am surprised about the nonchalance in introducing an (in my opinion) dangerous change.

After following the above scenario, each of the two pybind11 extensions will now want to read from and write to a shared internal data structure containing lots of nested STL types. It is a minefield -- even the slightest change in how those types are implemented will cause undefined behavior. Crashes in the simplest case, and silent data corruption on the worse side of the scale.

This is just the pybind11 internals -- perhaps one could go through through the entire MSVC C++ library and confirm that types used to implement pybind11 itself (std::type_index, std::string, std::pair, std::tuple, std::unordered_set, std::unordered_multimap, std::vector, std::forward_list, etc.) are indeed 100% ABI compatible between all relevant MSVC versions since they are in any case pretty old and hopefully stabilized.

But it does not stop there in my opinion: pybind11 has not just the job of protecting its own internals. It should also protect user data structures from ABI incompatibilities.The point of multiple extensions sharing a pybind11 "domain" is so that functions in these extensions can call each other with instances across libraries.

Perhaps both users A and B created an extension that includes the header

struct Foo {
   std::fancy_cpp20_type<...> foo;
};

Well, if user A's extension creates an instance of the type Foo and then passes it to a function in user B's extension, then this can still crash if std::fancy_cpp20_type isn't ABI compatible between the MSC versions that built these libraries.

(case in point, C++20 features were exposed in MSVC but were not API stable yet microsoft/STL#1814 (comment))

I will say what I said earlier in the two other commits: most pybind11 extensions don't need to need type-level access to the internals of other pybind11 extensions. They can be isolated and everything works fine.

It is just a small subset of extensions that have such an "intimate" relationship. I think it is reasonable to expect that such extensions are built with C++ libraries that declare themselves as being ABI-compatible.

I know that this can sometimes be inconvenient. But weakening the pybind11 protection mechanism is not the right solution to this problem.

@isuruf
Copy link
Contributor Author

isuruf commented Nov 28, 2023

Are there any pybind developer calls where we can discuss these issues? I feel like it would be easier to discuss these on a call and summarize the findings.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

After following the above scenario, each of the two pybind11 extensions will now want to read from and write to a shared internal data structure containing lots of nested STL types.

I didn't think that through before. I'm trying to make that concrete for myself by looking at the sources. I think this is the door through which the potential for UB slips in:

Here extension module B is extracting a pointer to an object of this struct created in extension module A:

  • struct internals {
    // std::type_index -> pybind11's type information
    type_map<type_info *> registered_types_cpp;
    // PyTypeObject* -> base type_info(s)
    std::unordered_map<PyTypeObject *, std::vector<type_info *>> registered_types_py;
    std::unordered_multimap<const void *, instance *> registered_instances; // void * -> instance*
    std::unordered_set<std::pair<const PyObject *, const char *>, override_hash>
    inactive_override_cache;
    type_map<std::vector<bool (*)(PyObject *, void *&)>> direct_conversions;
    std::unordered_map<const PyObject *, std::vector<PyObject *>> patients;
    std::forward_list<ExceptionTranslator> registered_exception_translators;
    std::unordered_map<std::string, void *> shared_data; // Custom data to be shared across
    // extensions
    #if PYBIND11_INTERNALS_VERSION == 4
    std::vector<PyObject *> unused_loader_patient_stack_remove_at_v5;
    #endif
    std::forward_list<std::string> static_strings; // Stores the std::strings backing
    // detail::c_str()
    PyTypeObject *static_property_type;
    PyTypeObject *default_metaclass;
    PyObject *instance_base;
    #if defined(WITH_THREAD)
    // Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
    PYBIND11_TLS_KEY_INIT(tstate)
    # if PYBIND11_INTERNALS_VERSION > 4
    PYBIND11_TLS_KEY_INIT(loader_life_support_tls_key)
    # endif // PYBIND11_INTERNALS_VERSION > 4
    // Unused if PYBIND11_SIMPLE_GIL_MANAGEMENT is defined:
    PyInterpreterState *istate = nullptr;
    # if PYBIND11_INTERNALS_VERSION > 4
    // Note that we have to use a std::string to allocate memory to ensure a unique address
    // We want unique addresses since we use pointer equality to compare function records
    std::string function_record_capsule_name = internals_function_record_capsule_name;
    # endif
    internals() = default;
    internals(const internals &other) = delete;
    internals &operator=(const internals &other) = delete;
    ~internals() {
    # if PYBIND11_INTERNALS_VERSION > 4
    PYBIND11_TLS_FREE(loader_life_support_tls_key);
    # endif // PYBIND11_INTERNALS_VERSION > 4
    // This destructor is called *after* Py_Finalize() in finalize_interpreter().
    // That *SHOULD BE* fine. The following details what happens when PyThread_tss_free is
    // called. PYBIND11_TLS_FREE is PyThread_tss_free on python 3.7+. On older python, it does
    // nothing. PyThread_tss_free calls PyThread_tss_delete and PyMem_RawFree.
    // PyThread_tss_delete just calls TlsFree (on Windows) or pthread_key_delete (on *NIX).
    // Neither of those have anything to do with CPython internals. PyMem_RawFree *requires*
    // that the `tstate` be allocated with the CPython allocator.
    PYBIND11_TLS_FREE(tstate);
    }
    #endif
    };

If the <vector> etc. headers seen by B are different from those seen by A, we have UB if B is operating on the internals object.

Sounds convincing to me.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

Are there any pybind developer calls where we can discuss these issues?

We're in vastly different timezones. I think sorting this out via comments here will work best. (We didn't have a meeting in a long time.)

@isuruf
Copy link
Contributor Author

isuruf commented Nov 28, 2023

@rwgk, that wouldn't happen as the allocation of internal structures happen in the same C runtime. If the external view of the object changes, then the stdlib developers would indicate this so that things wouldn't link at all. For eg: the std::string ABI changed from g++ 4 to g++ 5. That's the external view of the object. The internal view of the object does not matter unless you are statically linking in the C runtime.

@isuruf
Copy link
Contributor Author

isuruf commented Nov 28, 2023

(case in point, C++20 features were exposed in MSVC but were not API stable yet microsoft/STL#1814 (comment))

This is a valid point. API stability in the case of experimental features is an issue in all compilers.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

then the stdlib developers would indicate this so that things wouldn't link at all.

That's a cool feature (I didn't know about it), but I don't think we have that protection:

Here the internals** is cast to a void* (stored in a Python capsule):

state_dict[PYBIND11_INTERNALS_ID] = capsule(internals_pp);

And here the void* is cast back to internals**:

return static_cast<internals **>(raw_ptr);

That by-passes any link checks. The extension modules are "connected" only at runtime.

Is it possible to have a runtime check for e.g. std::vector compatibility between MSVC versions? (To substitute for the link-time check?) (And is is simple enough, so that gain / cost is reasonable?)

@isuruf
Copy link
Contributor Author

isuruf commented Nov 28, 2023

You can use typeid(func).name() before the type erasure to get the mangled name.

@rwgk
Copy link
Collaborator

rwgk commented Nov 28, 2023

You can use typeid(func).name() before the type erasure to get the mangled name.

Is that documented to be unique to a given "external view"?

(The mangled names I happened to see over the years didn't appear to encode any versioning information.)

@rwgk
Copy link
Collaborator

rwgk commented Nov 29, 2023

After turning this over in the back of my mind a little bit:

I think it's clear that falling back to define PYBIND11_BUILD_ABI "" is unsafe.

But it also sounds like using define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER) is more restrictive than it needs to be.

Could we find a middle ground? Maybe, @isuruf do you think you could add preprocessor-level functionality to map _MSC_VER to something like a PYBIND11_MSVC_ABI_COMPATIBILITY_GROUP, which we then use for PYBIND11_BUILD_ABI?

#ifndef PYBIND11_BUILD_ABI
# if defined(__GXX_ABI_VERSION)
# define PYBIND11_BUILD_ABI "_cxxabi" PYBIND11_TOSTRING(__GXX_ABI_VERSION)
# elif defined(_MSC_VER)
# define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)
# elif defined(_MSC_VER) && defined(_MT)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if the assumption that all _MSC_VER >= 1900 && _MSC_VER < 2000 are fully binary compatible is valid, but I'm willing to take your word for it. I'll defer to @wjakob for make the final call on this one.

It would be ideal to reduce the redundancy and avoid the define PYBIND11_BUILD_ABI "" fallback. (IMO it would be good to get rid of that fallback entirely, but not in this PR.) How about this?

#    elif defined(_MSC_VER)
#        if defined(_MT)
#            define PYBIND11_BUILD_ABI "_mt_mscver" PYBIND11_TOSTRING(_MSC_VER)
#        elif defined(_MD)
#            if _MSC_VER >= 1900 && _MSC_VER < 2000
#                define PYBIND11_BUILD_ABI "_md_mscver14"
#            else
#                define PYBIND11_BUILD_ABI "_md_mscver" PYBIND11_TOSTRING(_MSC_VER)
#            else
#                define PYBIND11_BUILD_ABI "_unkown_mscver" PYBIND11_TOSTRING(_MSC_VER)
#            endif
#        endif

I'm not sure about the _unkown_mscver fallback. Maybe #error Unkown combination of MSVC macros (or similar) would be better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using _MSC_VER/100 now to get the major version. See https://learn.microsoft.com/en-us/cpp/overview/compiler-versions?view=msvc-170 on how _MSC_VER is formatted.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll ask around to find someone familiar with the MSVC ABI universe.

It's important that we get this right this time around, because everytime we make a change here evidently we're creating a big problem for conda-forge.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to ask, did you consider the idea of preserving "_mscver" for one branch of the #if defined logic?

#4953 (comment)

The idea is to minimize ABI breakages.

I'm just not sure which branch is the best one to pick.

@wjakob
Copy link
Member

wjakob commented Dec 1, 2023

Hi @isuruf and @rwgk,

I had a bit of time look a bit into C++ ABI stability in the last few days. I think that I am too paranoid after having been repeatedly bitten by this some years ago. It seems that basically all the three main compiler tools (GCC, Clang, MSVC) have started taking ABI stability seriously at some point in the last years and there haven't been major breaks since then.

So I'm open to making a change here that allows interoperability between more extensions built with different compilers. But let's get it right and make something that we hopefully won't have to tweak retroactively because the criterion turns out to be too loose and, e.g., and ancient MSVC version turns out to not be ABI-compatible after all.

One important think to keep in mind (@rwgk) is that a change here is itself an ABI break, of pybind11. Modifying the PYBIND11_BUILD_ABI string means that the extension is now isolated from all other extensions previously built with a different PYBIND11_BUILD_ABI string. So we should put this change into a larger release that bundles accumulated ABI-breaking commits.

-Wenzel

@rwgk
Copy link
Collaborator

rwgk commented Jul 3, 2024

@isuruf: this PR was mentioned under conda-forge/pybind11-feedstock#95

This PR slipped my attention. We missed the pybind11 v2.13.0 release. @isuruf do you have an opinion when this is best merged? Would it make sense to tweak this PR, to avoid further ABI breaks? E.g. currently the diff under this PR is:

-#        define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)
+#    elif defined(_MSC_VER) && defined(_MT)
+#        define PYBIND11_BUILD_ABI "_mt_mscver" PYBIND11_TOSTRING(_MSC_VER)
+#    elif defined(_MSC_VER) && defined(_MD) && (_MSC_VER >= 1900) && (_MSC_VER < 2000)
+#        define PYBIND11_BUILD_ABI "_md_mscver14"
+#    elif defined(_MSC_VER) && defined(_MD)
+#        define PYBIND11_BUILD_ABI "_md_mscver" PYBIND11_TOSTRING(_MSC_VER)

could this (note the <<<< LOOK HERE)

-#        define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)
+#    elif defined(_MSC_VER) && defined(_MT)
+#        define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)  <<<< LOOK HERE
+#    elif defined(_MSC_VER) && defined(_MD) && (_MSC_VER >= 1900) && (_MSC_VER < 2000)
+#        define PYBIND11_BUILD_ABI "_md_mscver14"
+#    elif defined(_MSC_VER) && defined(_MD)
+#        define PYBIND11_BUILD_ABI "_md_mscver" PYBIND11_TOSTRING(_MSC_VER)

or this

-#        define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)
+#    elif defined(_MSC_VER) && defined(_MT)
+#        define PYBIND11_BUILD_ABI "_mt_mscver" PYBIND11_TOSTRING(_MSC_VER)
+#    elif defined(_MSC_VER) && defined(_MD) && (_MSC_VER >= 1900) && (_MSC_VER < 2000)
+#        define PYBIND11_BUILD_ABI "_md_mscver14"
+#    elif defined(_MSC_VER) && defined(_MD)
+#        define PYBIND11_BUILD_ABI "_mscver" PYBIND11_TOSTRING(_MSC_VER)  <<<< LOOK HERE

be better?

@rwgk
Copy link
Collaborator

rwgk commented Jul 3, 2024

@henryiii @Skylion007 for visitility (see my other comment posted a minute ago)

@rwgk
Copy link
Collaborator

rwgk commented Oct 28, 2024

NVHPC is actively maintained and new version come out every couple of months.

Thanks! I just revised the comment (713b427). — Coincidentally I was working on revising the misleading comment before I saw your comment here, after asking my teammates about it.

include/pybind11/detail/internals.h Outdated Show resolved Hide resolved
#ifndef PYBIND11_BUILD_ABI
# if defined(__GXX_ABI_VERSION)
# if defined(__GXX_ABI_VERSION) // Linux/OSX.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this PR is for msvc but this isn't the correct way to check for linux.

Background:
This macro value is the same across all version of clang ( always 1002 ) but changes anytime the gcc internal compiler ABI changes, which generally has no impact. You can find more details on the gcc abi changes here: https://gcc.gnu.org/onlinedocs/gcc/C_002b_002b-Dialect-Options.html

Generally you can safely pass types across anything 1002 or newer as long as the _GLIBCXX_USE_CXX11_ABI ( https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html ) is the same.

So if you build two libraries with differing values of _GLIBCXX_USE_CXX11_ABI they can't have the same abi:

~  $ clang++-16 -D_GLIBCXX_USE_CXX11_ABI=0 ./test.cpp -shared && nm ./a.out | grep foo | c++filt 
00000000000011a0 T foo(std::basic_string<char, std::char_traits<char>, std::allocator<char> >)
~  $ clang++-16 -D_GLIBCXX_USE_CXX11_ABI=1 ./test.cpp -shared && nm ./a.out | grep foo | c++filt
00000000000011a0 T foo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)

But that isn't captured in your ABI checks at all and is significantly more important compared to __GXX_ABI_VERSION

# else
# error "Unknown combination of MSVC preprocessor macros: PLEASE REVISE THIS CODE."
# endif
# elif defined(__NVCOMPILER) // NVHPC (PGI-based).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For nvhpc you would check _GLIBCXX_USE_CXX11_ABI like you would for gcc or clang. You can combo that with __GNUC__ and __GNUC_MINOR__ to get the gcc compatible version that nvhpc is targetting. IIRC the behavior of nvhpc is always -fabi-version=0 where the latest is defined by the gcc it is targetting ( e.g. GNUC , GNUC_MINOR defines ).

Co-authored-by: Robert Maynard <robertjmaynard@gmail.com>
wjakob added a commit to wjakob/nanobind that referenced this pull request Nov 5, 2024
This commit incorporates some of the suggestions from pybind11 PR
pybind/pybind11#4953 (and linked discussion) to
relax ABI tagging for libstdc++ and MSVC. At the same time, it adds an
extra ABI tag for libc++
(https://libcxx.llvm.org/DesignDocs/ABIVersioning.html) that was
previously missing.
@wjakob
Copy link
Member

wjakob commented Nov 5, 2024

Hi all -- the changes for MSVC look good to me. However, as @robertmaynard has mentioned, this is also an opportunity to relax the ABI tagging for libstdc++, and that should be done in the same ABI break.

While looking into ABIs for different compilers, I also found this for libc++: https://libcxx.llvm.org/DesignDocs/ABIVersioning.html. Setting a higher version causes some incompatibilities with standard C++ containers. IMO, pybind11 should also check for this.

I started working on an analogous set of changes for nanobind here: wjakob/nanobind#778

wjakob added a commit to wjakob/nanobind that referenced this pull request Nov 5, 2024
This commit incorporates some of the suggestions from pybind11 PR
pybind/pybind11#4953 (and linked discussion) to
relax ABI tagging for libstdc++ and MSVC. At the same time, it adds an
extra ABI tag for libc++
(https://libcxx.llvm.org/DesignDocs/ABIVersioning.html) that was
previously missing.
wjakob added a commit to wjakob/nanobind that referenced this pull request Nov 5, 2024
This commit incorporates some of the suggestions from pybind11 PR
pybind/pybind11#4953 (and linked discussion) to
relax ABI tagging for libstdc++ and MSVC. At the same time, it adds an
extra ABI tag for libc++
(https://libcxx.llvm.org/DesignDocs/ABIVersioning.html) that was
previously missing.
wjakob added a commit to wjakob/nanobind that referenced this pull request Nov 5, 2024
This commit incorporates some of the suggestions from pybind11 PR
pybind/pybind11#4953 (and linked discussion) to
relax ABI tagging for libstdc++ and MSVC. The goal is to ensure that
ABI-incompatible extensions are separated from each other. These checks
were previously too fine-grained and caused isolation in cases where an
actual ABI incompatibility was not present. This is a long-standing
problem in both pybind11 and nanobind causing inconvenience for users.

While looking into this topic, I realized that libc++ has an opt-in
unstable ABI feature
(https://libcxx.llvm.org/DesignDocs/ABIVersioning.html). The PR also
adds checks for this.

Merging this PR will imply an ABI version bump for nanobind due to the
differnt naming convention of the associated ABI tag.
wjakob added a commit to wjakob/pybind11 that referenced this pull request Nov 10, 2024
This commit adapts the ABI tag as follows:

- It removes ``PYBIND11_INTERNALS_KIND`` that was neither used nor documented.
- It adapts the MSVC ABI tag to be less stringent (see PR pybind#4953)
- It adapts the GCC ABI tag to be less stringent.
- It adds a check for a Clang/libc++ ABI tag that wasn't present before

I plan to make a consistent set of changes to nanobind so that both
libraries use interchangeable platform ABI tags.
@wjakob
Copy link
Member

wjakob commented Nov 10, 2024

Hi all -- I created a PR with further changes in PR #5437 that also address some of the other compilers/ABIs. If this proposal sounds good, it presents a useful opportunity to make pybind11 and nanobind use the same platform tag--not yet useful on its own, but a first step towards a potential feature to safely dispatch function calls with common types between pybind11&nanobind.

@rwgk
Copy link
Collaborator

rwgk commented Nov 10, 2024

Hi all -- I created a PR with further changes in PR #5437 that also address some of the other compilers/ABIs. If this proposal sounds good, it presents a useful opportunity to make pybind11 and nanobind use the same platform tag--not yet useful on its own, but a first step towards a potential feature to safely dispatch function calls with common types between pybind11&nanobind.

Awesome, thanks!

I will merge this now and then merge the new master into your PR.

The other platforms are orthogonal to MSVC. Having separate PRs makes it easier for us to focus now, and easier for others to follow the discussions in the future.

I will also update my PR #5375, which is meant to make it as easy as possible for other bindings systems to reuse the ABI guarding code.

@rwgk rwgk merged commit ec9c268 into pybind:master Nov 10, 2024
81 checks passed
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label Nov 10, 2024
rwgk added a commit to rwgk/pybind11 that referenced this pull request Nov 10, 2024
rwgk added a commit to rwgk/pybind11 that referenced this pull request Nov 10, 2024
This merge commit required manual intervention to integrate the changes from pybind#4953
rwgk added a commit to rwgk/pybind11 that referenced this pull request Nov 10, 2024
rwgk added a commit that referenced this pull request Dec 20, 2024
…han MSVC) (#5439)

* THIS IS JUST A START: First attempt to combine information from PR #4953 and PR #5437

* Include GXX_ABI and USE_CXX in the identifier

Further constrain to GXX_ABI 1002 or greater and less than 2000,
hopefully future proof by summarizing that as `1` along with CXX11 on or
off.

* style: pre-commit fixes

* Use `gxx_abi_1xxx` and simplify the Clang string

After discussions with Ralf Grosse-Kunstleve we think these would make
good identifiers that are concise and clear.

* Error if `_GLIBCXX_USE_CXX11_ABI` is not defined

Within the `__GXX_ABI_VERSION` block this should always be defined,
guard against unexpected defines and make the error obvious.

* Change `usecxx11` to `use_cxx11_abi` for correspondence with `_GLIBCXX_USE_CXX11_ABI` (similarly to `gxx_abi` for `__GXX_ABI_VERSION`).

* `PYBIND11_COMPILER_TYPE` overhaul, mainly: replace `_icc`, `_clang`, `_gcc` with `_system`

* Add NVHPC (__PGI) to the list of compilers compatible with system compiler.

See comment by @robertmaynard: #5439 (comment)

* Fix oversight: remove __NVCOMPILER elif branch in PYBIND11_BUILD_ABI block.

Also add comment pointing to this PR (#5439).

* Revert "Fix oversight: remove __NVCOMPILER elif branch in PYBIND11_BUILD_ABI block."

This reverts commit d412303.

* Revert "Add NVHPC (__PGI) to the list of compilers compatible with system compiler."

This reverts commit 9fc9515.

* Define NVHPC PYBIND11_BUILD_ABI using __GNUC__, __GNUC_MINOR__, _GLIBCXX_USE_CXX11_ABI

* Use _GLIBCXX_USE_CXX11_ABI to detect libstdc++, then assume that NVHPC is always in the 1xxx ABI family.

* Enhance NVHPC comment and limited future proofing.

* The `PYBIND11_STDLIB` is obsolete but kept around to maintain backward compatibility.

* Move `PYBIND11_BUILD_TYPE` down in the file, so that the order of macro definitions is the same as in the list defining `PYBIND11_PLATFORM_ABI_ID`

* Introduce `PYBIND11_COMPILER_TYPE_LEADING_UNDERSCORE`:

This makes it possible to achieve these two goals:

* Avoid the leading underscore in `PYBIND11_PLATFORM_ABI_ID` (see #5439 (comment))

* Maintain backward compatibility for use cases as reported under #5439 (comment)

`PYBIND11_INTERNALS_KIND` is removed in this commit to ensure that `PYBIND11_COMPILER_TYPE` is the first element of the `PYBIND11_PLATFORM_ABI_ID`, so that `PYBIND11_COMPILER_TYPE_LEADING_UNDERSCORE` can meaningfully be used as a prefix for `PYBIND11_PLATFORM_ABI_ID` in pybind11/detail/internals.h.

* Apply suggestion by @isuruf, with revised comments (code is as suggested).

* Make determination of `PYBIND11_COMPILER_TYPE` `"macos"` or `"glibc"` more general.

The main motivation is to resolve these "Manylinux on 🐍 3.13t • GIL" and "Pyodide wheel" failures:

```
/__w/pybind11/pybind11/include/pybind11/conduit/pybind11_platform_abi_id.h:35:10: error: #error "Unknown PYBIND11_COMPILER_TYPE: PLEASE REVISE THIS CODE."
   35 | #        error "Unknown PYBIND11_COMPILER_TYPE: PLEASE REVISE THIS CODE."
      |          ^~~~~
```

(All other CI jobs succeeded.)

Further thought:

Essentially, under Linux and macOS the `PYBIND11_COMPILER_TYPE` is only for informational purposes.
ABI compatibility is determined by the libstdc++ or libc++ ABI version.

* Add `PYBIND11_COMPILER_TYPE` `emscripten`

* Add `PYBIND11_COMPILER_TYPE` `graalvm`

* Revert "Add `PYBIND11_COMPILER_TYPE` `graalvm`"

This reverts commit 75da5fb.

* Revert "Add `PYBIND11_COMPILER_TYPE` `emscripten`"

This reverts commit e34dc8b.

* Revert "Make determination of `PYBIND11_COMPILER_TYPE` `"macos"` or `"glibc"` more general."

This reverts commit 41daaa4.

* Revert "Apply suggestion by @isuruf, with revised comments (code is as suggested)."

This reverts commit ca9e699.

* Remove `defined(__INTEL_COMPILER)` as suggested by @hpkfft under #5439 (comment)

---------

Co-authored-by: Marcus D. Hanwell <marcus@cryos.net>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
wjakob added a commit to wjakob/nanobind that referenced this pull request Jan 7, 2025
This commit incorporates some of the suggestions from pybind11 PR
pybind/pybind11#4953 (and linked discussion) to
relax ABI tagging for libstdc++ and MSVC. The goal is to ensure that
ABI-incompatible extensions are separated from each other. These checks
were previously too fine-grained and caused isolation in cases where an
actual ABI incompatibility was not present. This is a long-standing
problem in both pybind11 and nanobind causing inconvenience for users.

While looking into this topic, I realized that libc++ has an opt-in
unstable ABI feature
(https://libcxx.llvm.org/DesignDocs/ABIVersioning.html). The PR also
adds checks for this.

Merging this PR will imply an ABI version bump for nanobind due to the
different naming convention of the associated ABI tag.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs changelog Possibly needs a changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants