Allow prealloc and registration before calling C++ ctor #4116
base: master
First impression:
I'll click the Approve CI run button here in case you want to keep experimenting. |
@rwgk thanks for triggering the CI and for the feedback! I'm aiming for the same flexibility as in Python. For now, someone who wants to initialize a Python object the same way from C++ code has to do something like this (taking the same example as in the issue/discussion):
#include "pybind11/pybind11.h"
#include "pybind11/stl.h"

namespace py = pybind11;

class Base {
public:
    Base() = default;
    virtual ~Base() = default;
};

class BaseTrampoline : public Base {
public:
    BaseTrampoline() = default;
    void do_some_modifications() {
        auto pyobj = py::cast(this);
        pyobj.attr("bar") = 10.;
        for (auto &n : pyobj.attr("__inputs__").cast<std::vector<std::string>>())
            pyobj.attr(n.data()) = n;
    }
};

PYBIND11_MODULE(test, m) {
    py::class_<Base, BaseTrampoline>(m, "Base")
        .def(py::init_alias<>() /*, py::preallocate()*/)
        .def("do_some_modifications",
             [](BaseTrampoline &t) { t.do_some_modifications(); });
}
from test import Base

class Derived(Base):
    __inputs__ = ["a", "b"]

    def __init__(self):
        Base.__init__(self)
        self.do_some_modifications()

d = Derived()
assert d.bar == 10.0
assert d.a == "a"
assert d.b == "b"

or, to simplify the syntax for end users:

from test import Base as Base_

class Base(Base_):
    def __init__(self):
        Base_.__init__(self)
        self.do_some_modifications()

class Derived(Base):
    __inputs__ = ["a", "b"]

d = Derived()
assert d.bar == 10.0
assert d.a == "a"
assert d.b == "b"

It requires:
I'll also try to improve the PR to get minimal changes, and add tests! |
General high-level cost-benefit thinking:
The benefit must justify the cost. If a change only shifts complexity in user code (e.g. from Python user code to C++ user code) it is not a net benefit, and I'd be very skeptical about accepting a cost. I'll look at the final version with that in mind. |
- allocate memory before calling the ctor
- register the instance before calling the ctor
- use a placement new call to set the value_ptr
- use `preallocate` annotation to select placement new
- refactor construct_or_initialize to construct inplace
- add a type trait to check if a template param pack contains a type
- add tests
- add documentation
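One item in the commit list is a type trait that checks whether a template parameter pack contains a given type. The PR's own implementation is not shown in this thread, so the following is only a minimal C++11 sketch of how such a trait can be written. The names `contains_type` and `preallocate_tag` are hypothetical stand-ins, not identifiers from pybind11.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait (name is an assumption, not taken from the PR):
// contains_type<T, Ts...>::value is true when T appears in the pack Ts...
// Primary template: covers the empty pack, so the answer is "not found".
template <typename T, typename... Ts>
struct contains_type : std::false_type {};

// Recursive case: succeed on a matching head, otherwise scan the tail.
template <typename T, typename Head, typename... Tail>
struct contains_type<T, Head, Tail...>
    : std::conditional<std::is_same<T, Head>::value,
                       std::true_type,
                       contains_type<T, Tail...>>::type {};

// Stand-in for an annotation tag passed among the extra arguments of .def().
struct preallocate_tag {};

static_assert(contains_type<preallocate_tag, int, preallocate_tag>::value, "tag found");
static_assert(!contains_type<preallocate_tag, int, double>::value, "tag absent");
static_assert(!contains_type<preallocate_tag>::value, "empty pack");
```

A trait of this shape lets the binding code decide at compile time whether an annotation was passed, without inspecting the arguments at run time.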
Fair enough @rwgk, I just :
It looks ready for review from my perspective. Let me know if you have any questions, I'm happy to discuss those changes!
include/pybind11/detail/init.h
          typename... Args,
          detail::enable_if_t<std::is_constructible<Class, Args...>::value && Preallocate, int> = 0>
inline void construct_or_initialize(value_and_holder &v_h, Args &&...args) {
    v_h.value_ptr() = malloc(sizeof(Class));
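The overload in the excerpt above is enabled only when a compile-time `Preallocate` flag is true. As a rough illustration of that SFINAE pattern outside pybind11, the sketch below selects between two overloads on a boolean template parameter. `ctor_path`, `select_path` and `Point` are hypothetical names, and the functions only report which overload was chosen rather than allocating anything.

```cpp
#include <cassert>
#include <type_traits>

// Which construction strategy an overload stands for (illustrative names).
enum class ctor_path { allocate_and_construct, construct_in_place };

// Participates in overload resolution only when Preallocate is false.
template <typename Class, bool Preallocate, typename... Args,
          typename std::enable_if<std::is_constructible<Class, Args...>::value && !Preallocate,
                                  int>::type = 0>
ctor_path select_path(Args &&...) {
    return ctor_path::allocate_and_construct;
}

// Participates in overload resolution only when Preallocate is true.
template <typename Class, bool Preallocate, typename... Args,
          typename std::enable_if<std::is_constructible<Class, Args...>::value && Preallocate,
                                  int>::type = 0>
ctor_path select_path(Args &&...) {
    return ctor_path::construct_in_place;
}

// Small type constructible from two ints, used to drive the selection.
struct Point {
    Point(int, int) {}
};

inline void select_path_demo() {
    // Class and Preallocate are given explicitly; Args... is deduced from the call.
    assert((select_path<Point, false>(1, 2) == ctor_path::allocate_and_construct));
    assert((select_path<Point, true>(1, 2) == ctor_path::construct_in_place));
}
```

Because the condition sits in the type of a defaulted non-type template parameter, the two templates are distinct declarations and exactly one survives substitution for any given flag value.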
I am pretty sure we only use 'Pymem_alloc*' APIs for portability reasons.
+1
I need to find time to look at this carefully, but a quick search for "free" didn't match anything in the diff.
@adriendelsalle did you look already how the memory is deallocated? We want to be sure it is compatible with the allocation mechanism.
Purely from memory and hand-waving: I believe for old-style __init__ the current code can wind up with a delete that has no matching new (some other allocation mechanism is used). If you happen to stumble over that, too, adding a comment would be great. (I wanted to do that but never got back to it.)
Awesome to have your comments!
I've been using the lib for years but only started digging into the details a few days ago, so I'm not really comfortable with alloc/dealloc yet.
Is there some dev or design/architecture doc for that? I'll do my best to digest that tomorrow!
Is there some dev or design/architecture doc for that?
Not to my knowledge.
At least I found it necessary to learn the hard way, experimenting a lot with the code.
Feel free to ask questions here. Maybe we can learn together.
Seeing this only now: the valgrind failure seems to be spot-on:
https://github.com/pybind/pybind11/runs/7732180285?check_suite_focus=true
==7330== Mismatched free() / delete / delete []
I saw that! Fortunately I had added tests to cover the changes. I'll reproduce it locally and debug it.
Feel free to ask questions here. Maybe we can learn together.
👍
I just pushed a change to replace malloc with new allocation to be fully consistent with delete used in dealloc. It should then be very similar to what's done in the allocation+construction overload of construct_or_initialize.
It's now all green using Valgrind on the tests locally. I used act to play the CI job.
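The Valgrind report quoted earlier ("Mismatched free() / delete / delete []") comes from releasing memory with a different family of functions than the one that allocated it. The sketch below is not the PR's code. It only shows, under the assumption of a plain C++ type named `Widget` (hypothetical), two allocation/release pairings that are consistent with each other. Storage obtained from `malloc` would likewise have to go back to `free`.

```cpp
#include <cassert>
#include <new>

// Instrumented type so construction and destruction can be observed.
struct Widget {
    static int alive;
    int value;
    explicit Widget(int v) : value(v) { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Pairing 1: a new-expression is released with a delete-expression.
inline Widget *make_combined(int v) { return new Widget(v); }
inline void release_combined(Widget *w) { delete w; }

// Pairing 2: raw storage from ::operator new, object built with placement new,
// torn down with an explicit destructor call followed by ::operator delete.
inline Widget *make_in_place(int v) {
    void *storage = ::operator new(sizeof(Widget));
    try {
        return ::new (storage) Widget(v);
    } catch (...) {
        ::operator delete(storage); // do not leak the storage if the ctor throws
        throw;
    }
}
inline void release_in_place(Widget *w) {
    w->~Widget();
    ::operator delete(w);
}

inline void pairing_demo() {
    Widget *a = make_combined(1);
    Widget *b = make_in_place(2);
    assert(Widget::alive == 2);
    release_combined(a);
    release_in_place(b);
    assert(Widget::alive == 0);
}
```

The second pairing is the one that makes the storage address available before the constructor runs, which is the property the preallocation idea relies on.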
I am pretty sure we only use 'Pymem_alloc*' APIs for portability reasons.
I don't really get the point, could you please elaborate a bit? I don't see why allocation and construction of a C++ object should use some Py* alloc.
Here is my understanding of how the changes integrate with the implementation:
- passing py::preallocate() as an extra argument of a new-style constructor will preallocate and register just before actually calling the constructor of the class
- if the call is fine, the holder is still not registered, so this piece of code gets called (only to init the holder then):
  pybind11/include/pybind11/pybind11.h, lines 1127 to 1130 in bbb89da
      if (overloads->is_constructor && !self_value_and_holder.holder_constructed()) {
          auto *pi = reinterpret_cast<instance *>(parent.ptr());
          self_value_and_holder.type->init_instance(pi, nullptr);
      }
- if the call is not successful (because of a cast error, AFAIU), the next overload try (if any) will deallocate the memory via operator delete if already allocated (no holder constructed yet):
  pybind11/include/pybind11/pybind11.h, lines 754 to 758 in bbb89da
      if (func.is_new_style_constructor) {
          // The `value` may have been preallocated by an old-style `__init__`
          // if it was a preceding candidate for overload resolution.
          if (self_value_and_holder) {
              self_value_and_holder.type->dealloc(self_value_and_holder);
  - maybe it would be worth updating the comment, because now it's not only in the case of an old-style __init__ that the memory is already allocated?
- I don't really understand the comment:
  pybind11/include/pybind11/pybind11.h, lines 440 to 442 in bbb89da
      } else if (rec->is_new_style_constructor && arg_index == 0) {
          // A new-style `__init__` takes `self` as `value_and_holder`.
          // Rewrite it to the proper class type.
  since it looks to me like the opposite: a new-style __init__ takes value_and_holder as self, doesn't it?
  pybind11/include/pybind11/pybind11.h, lines 761 to 762 in bbb89da
      call.init_self = PyTuple_GET_ITEM(args_in, 0);
      call.args.emplace_back(reinterpret_cast<PyObject *>(&self_value_and_holder));
I spent a few minutes looking through the code, without too much attention to detail. Technically this is undoubtedly a high-quality PR, but ... From the new documentation:
To me this reads like an invitation to create a mess. (Even the existing The general worry: mismatches between the C++ side and the Python side of a given type. Similar to what I wrote before, mismatches are likely to be surprising and confusing, ultimately bug prone. Another metaphor that often crosses my mind: this PR creates another wrinkle in the carpet. Someone is bound to stumble over it, at scale every day. I'm not exactly opposed to adding This PR is at a scale that at least one or two other maintainers will have to approve. @henryiii, @Skylion007 what do you think? |
Thanks for taking time and the detailed feedback @rwgk , I understand the worries about creating complexity. I'm just trying to make it more straightforward for people having a quite intensive use of the lib to take advantage of C++ code in initialization of the Python object (vs the inefficient workaround I previously posted in this PR).
What do you mean? Here the C++ constructor will act just as a regular Python base class would do (create new attributes, set existing ones, do some processing/checks/whatever) But I can only speak from my experience and about my use cases, far from me the idea of adding confusion :). |
That's not common and not obvious. It will come as a surprise to anyone but the author of the code. People will wonder what's going on, and will have to dig in to find out that some non-bindings logic is hidden in the bindings layer. I can see that the feature this PR is enabling could be useful occasionally when migrating from one bindings tool to another, e.g. migrating from SWIG to pybind11, when both the C++ API and the Python API are more-or-less frozen already and there is no practical alternative to fully emulating the established behavior. If the documentation was written to emphasize that, but generally discourage packing non-bindings logic into the bindings layer, it would look much better to me. Another thought that crossed my mind in the meantime: do we actually need |
will do!
yes it would break classes with custom operators, see #948. specific tests have been written for that. I guess I also have to add a big warning about this non working combo. |
BTW, have we tested if this works with |
I don't know..I'll take a look, thanks for the pointer! |
pybind11/include/pybind11/pybind11.h Lines 692 to 694 in 2d59b43
So this PR is not changing anything on that: it's not possible to register a C++/Python couple if the Python or C++ instance doesn't exist yet
I don't get it sorry :/. Could you please elaborate on that?
Yes it could be an option, it also means that the Python object needs to be allocated there to do the job of the pybind11/include/pybind11/detail/class.h Line 487 in 2d59b43
It can be also quite tricky to keep the existing behavior with C++ construction in
I don't really see how it's useful since there is no capability to act on the Python object instantiation from C++. For now, the C++ alloc/init is something totally independent from the Python one. Maybe to init the C++ object in Another point is that |
Also note that I haven't yet implemented support for preallocation for
Gentle ping @Skylion007 @rwgk |
Coming back to this:
Is that something we could handle with SFINAE ( Other thoughts:
Not being sure if there is at least one other maintainer to support this PR, I'm reluctant to invest the time to apply this pretty tricky PR to the smart_holder branch. Is that something you could try? I'm not sure if it will just work, or get us deep into the weeds. Note that the smart_holder tests are run twice: 1. with unique_ptr as the default holder (like master), 2. with smart_holder as the default holder. I think 1. will probably just work, but I'm not sure about 2. Currently the smart_holder branch is only a couple of minor commits behind master. I'll update again sometime soon, but I don't think it'll matter for applying this PR to the branch. If you get this working without the
I was thinking about this some more, and really the main reason you want to prealloc is to add Py::Objects to a dynamic dictionary after construction. So wouldn't a better solution be to have an entry point for the Python construction like we do for the C++ construction? Or in other words, have some function that is evaluated lazily after the C++ object is constructed/initialized? This is already possible with factory methods. Really it sounds like the py::init() is doing the C++ initialization, which it maybe shouldn't be.
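The "construct first, then run a hook" idea from the comment above can be pictured in plain C++ as a factory that builds the object and only afterwards calls a post-initialization function. This is only a sketch of the shape of the idea: `Node`, `post_init` and `make_with_post_init` are hypothetical names. Whether such a hook would see the registered Python object depends on where pybind11 would invoke it, which the thread leaves open.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class with a two-phase setup: the constructor does plain C++
// initialization, post_init() does the work that needs the finished object.
struct Node {
    std::vector<std::string> attributes;
    bool hook_ran = false;

    void post_init() {
        attributes.push_back("bar");
        hook_ran = true;
    }
};

// Factory: construct the object completely, then run the hook on it.
template <typename T, typename... Args>
std::unique_ptr<T> make_with_post_init(Args &&...args) {
    std::unique_ptr<T> obj(new T(std::forward<Args>(args)...));
    obj->post_init();
    return obj;
}

inline void post_init_demo() {
    std::unique_ptr<Node> n = make_with_post_init<Node>();
    assert(n->hook_ran);
    assert(n->attributes.size() == 1);
}
```

The design choice here is that nothing inside the constructor depends on external registration. Everything that does is deferred to the hook.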
Thanks to both of you for the help and feedback! The post-constructor hook would probably do it, I'll investigate!
If you had rebase on smart_holder in mind: it's not a requirement, but it would help me a lot running our (Google's) global testing for this PR sooner rather than later. |
Description
This is an attempt to allow access to a Python derived class instance from the C++ constructor.
At this time, the constructor is called before allocation and registration, so py::cast(this) inside the constructor doesn't return the expected Python object.
This PR:
- adds a preallocate annotation to be passed when declaring a constructor to select new specializations
- refactors construct_or_initialize to construct the C++ object in place and adds specializations (relying on SFINAE) to handle both preallocated and non-preallocated (current and unchanged behavior) constructors:
Closes #4114
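The ordering problem described above can be mimicked without pybind11. In the toy model below, a global map plays the role of the instance registry and `lookup(this)` stands in for `py::cast(this)`. All names (`registry`, `lookup`, `Probe`, and the two builder functions) are hypothetical, and the model ignores everything pybind11 actually does beyond the order of allocation, registration and construction.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <unordered_map>

// Toy registry: C++ address -> label standing in for the Python object.
inline std::unordered_map<const void *, std::string> &registry() {
    static std::unordered_map<const void *, std::string> r;
    return r;
}

// Stand-in for py::cast(this): empty string when the address is unknown.
inline std::string lookup(const void *p) {
    auto it = registry().find(p);
    return it == registry().end() ? std::string() : it->second;
}

// Records what the lookup returned while the constructor was running.
struct Probe {
    std::string seen_in_ctor;
    Probe() : seen_in_ctor(lookup(this)) {}
};

// Construct, then register: the constructor cannot see the registration.
inline Probe *construct_then_register() {
    Probe *p = new Probe();
    registry()[p] = "py-instance";
    return p;
}
inline void release_constructed(Probe *p) {
    registry().erase(p);
    delete p;
}

// Allocate, register, then construct in place: the constructor sees it.
inline Probe *register_then_construct() {
    void *storage = ::operator new(sizeof(Probe));
    registry()[storage] = "py-instance";
    return ::new (storage) Probe();
}
inline void release_preallocated(Probe *p) {
    registry().erase(p);
    p->~Probe();
    ::operator delete(p);
}
```

Calling `construct_then_register()` yields a `Probe` whose `seen_in_ctor` is empty, while `register_then_construct()` yields one that recorded the registered label.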