Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Segmentation Fault or Bus error with dynamic_attr #279

Closed
zhqrbitee opened this issue Aug 23, 2023 · 1 comment
Closed

[BUG]: Segmentation Fault or Bus error with dynamic_attr #279

zhqrbitee opened this issue Aug 23, 2023 · 1 comment

Comments

@zhqrbitee
Copy link
Contributor

zhqrbitee commented Aug 23, 2023

Problem description

Hi, I'm hit a segmentation fault or bus error raised from Python and below is a minimal code snapshot to reproduce it:

#include <unordered_map>
#include <string>

#include "nanobind/nanobind.h"
#include "nanobind/stl/shared_ptr.h"
#include "nanobind/stl/string.h"

namespace nb = nanobind;
using namespace nb::literals;

class Component {
 public:
  Component() = default;
  virtual ~Component() = default;
};

class Param: public Component {
 public:
  Param() = default;
};

class Model: public Component {
 public:
  Model() = default;

  void add_param(const std::string& name, std::shared_ptr<Param> p) {
    params_[name] = std::move(p);
  }
  std::shared_ptr<Param> get_param(const std::string& name) {
    if (params_.contains(name)) {
      return params_[name];
    }
    return nullptr;
  }
 private:
  std::unordered_map<std::string, std::shared_ptr<Param>> params_;
};

class ModelA: public Model {
 public:
  ModelA() {
    add_param("a", std::make_shared<Param>());
    add_param("b", std::make_shared<Param>());
  }
};

NB_MODULE(test_nanobind_dynamic_attr, m) {
  nanobind::set_leak_warnings(false);

  nb::class_<Component>(m, "Component");
  nb::class_<Param, Component>(m, "ParamBase");
  nb::class_<Model, Component>(m, "Model", nb::dynamic_attr()).def(nb::init<>{})
    .def("_get_param", &Model::get_param, "name"_a)
    .def("_add_param", &Model::add_param, "name"_a, "p"_a);
  nb::class_<ModelA, Model>(m, "ModelA").def(nb::init<>{});
}
from test_nanobind_dynamic_attr import Model, ModelA

def _get_parameter(self: Model, key: str):
    p = self._get_param(key)
    if p is not None:  # cache it for fast access later
        setattr(self, key, p)
        return p
    raise AttributeError(f"'key' not found in {self}")

# when an attribute doesn't exist, check if it is parameter in Model, if so, get it
Model.__getattr__ = _get_parameter

# Get segmentation fault or bus error. If I comment out below lines, I got :
#   SystemError: Objects/dictobject.c:1464: bad argument to internal function
#
# def _print_model(self):
#     return f"{self.__class__.__qualname__}()"
#
# Model.__str__ = _print_model

class Top(Model):
    def __init__(self):
        super().__init__()
        self.model_a = ModelA()

top = Top()
print(top.model_a)
print(top.model_a.a)  # "a" is a param added in C++, so we should get it through the `_get_parameter` function

Here is some output from ASAN/UBSAN build of Python and Nanobind:

 python3 benchmarks.py
ModelA()
Include/object.h:502:9: runtime error: member access within misaligned address 0xcdcdcdcdcdcdcdcd for type 'PyObject' (aka 'struct _object'), which requires 8 byte alignment
0xcdcdcdcdcdcdcdcd: note: pointer points here
<memory cannot be printed>
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior Include/object.h:502:9 in
AddressSanitizer:DEADLYSIGNAL
=================================================================
==2517==ERROR: AddressSanitizer: SEGV on unknown address 0xffffba29b9bbb9b9 (pc 0x0001011bb94c bp 0x00016f149610 sp 0x00016f149560 T0)
==2517==The signal is caused by a READ memory access.
    #0 0x1011bb94c in _PyObject_GenericGetAttrWithDict object.c:1317
    #1 0x1011bb36c in PyObject_GenericGetAttr object.c:1368
    #2 0x101271478 in slot_tp_getattr_hook typeobject.c:7706
    #3 0x1011b9c48 in PyObject_GetAttr object.c:916
    #4 0x101482e30 in _PyEval_EvalFrameDefault ceval.c:3466
    #5 0x101466ce0 in _PyEval_EvalFrame pycore_ceval.h:73
    #6 0x1014665dc in _PyEval_Vector ceval.c:6439
    #7 0x10146606c in PyEval_EvalCode ceval.c:1154
    #8 0x10163af30 in run_eval_code_obj pythonrun.c:1714
    #9 0x101636aa0 in run_mod pythonrun.c:1735
    #10 0x1016321e4 in _PyRun_SimpleFileObject pythonrun.c:440
    #11 0x10163199c in _PyRun_AnyFileObject pythonrun.c:79
    #12 0x1016c1700 in Py_RunMain main.c:680
    #13 0x1016c2be4 in pymain_main main.c:710
    #14 0x1016c2f8c in Py_BytesMain main.c:734
    #15 0x100cb7684 in main python.c:15
    #16 0x18b807f24  (<unknown module>)

==2517==Register values:
 x[0] = 0x000000016f149478   x[1] = 0x0000000000000000   x[2] = 0x0000000000000000   x[3] = 0x0000000107d007a0
 x[4] = 0x0000000063000000   x[5] = 0x0000000000000000   x[6] = 0x0000000000000000   x[7] = 0x0000000000000000
 x[8] = 0x00000001037c2000   x[9] = 0x000000000000d810  x[10] = 0x00000000000d4230  x[11] = 0x000000000000007d
x[12] = 0xcdcdcdcdcdcdcdcd  x[13] = 0xffffff0000000000  x[14] = 0x0000000000000000  x[15] = 0x0000000000000000
x[16] = 0x00000003077fa0b4  x[17] = 0x0000000102d980a0  x[18] = 0x0000000000000000  x[19] = 0x00000001021ef6c0
x[20] = 0x0000000105b3ce80  x[21] = 0x19b9b9b9b9b9b9b9  x[22] = 0x0000000000000001  x[23] = 0x0000000000000001
x[24] = 0x0000000105e320a0  x[25] = 0x0000000000000000  x[26] = 0x0000000000000000  x[27] = 0x0000000000000000
x[28] = 0x0000007000020000     fp = 0x000000016f149610     lr = 0x00000001011bbf58     sp = 0x000000016f149560
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV object.c:1317 in _PyObject_GenericGetAttrWithDict
==2517==ABORTING
fish: Job 1, 'python3 benchmarks.py' terminated by signal SIGABRT (Abort)

In my example, I put the nb::dynamic_attr() on the base class Model, as this is our usage (user inherits it to created derived ModelA). If I move the nb::dynamic_attr() from the base class Model to the derived class ModelA, the code seems work fine. So I guess it must be something related with nb::dynamic_attr() and inheritance.

My environment is: M1 (Arm64) with macOS Ventura, Python 3.9.17, with nanobind 1.5.1.

Reproducible example code

See the codes in the problem description.
wjakob added a commit that referenced this issue Aug 24, 2023
The ``nb_type_new()`` function of nanobind assumed that classes created
via ``PyType_Read()`` / ``PyType_FromMetaclass()`` would inherit the
value of the ``Py_TPFLAGS_HAVE_GC`` flag from the base class, but this
was only true when the new class did not at the same time provide its
own ``tp_traverse``/``tp_clear`` functions.

An implication of this mistake was that classes deriving from classes
with the ``nb::dynamic_attr`` were broken and would cause segfaults.

Fixes issue #279.
@wjakob
Copy link
Owner

wjakob commented Aug 24, 2023

Thank you for the report, this is now fixed. You can should also be able to remove the hacky nanobind::set_leak_warnings(false); line after upgrading to the latest version that I am about to release.

@wjakob wjakob closed this as completed Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants