Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: force buffer protocol in tensor creation for numpy to avoid read/write constraints with dlpack. #56

Closed
wants to merge 3 commits into from

Conversation

brettc
Copy link

@brettc brettc commented Jul 28, 2022

This PR overcomes two current limitations with tensors in nanobind when using numpy (it does not affect torch or jax).

  • Allow returning writeable arrays from c++ to python.
  • Allow passing read-only arrays to from python to c++.
  • clean up tests.
  • Add documentation.

@brettc
Copy link
Author

brettc commented Jul 28, 2022

@wjakob I've added an nb::writeable flag in tensor creation so far. Let me know if this is on track.

@qnzhou
Copy link
Contributor

qnzhou commented Aug 3, 2022

I can confirm that nb::writeable works in my simple test cases.

@brettc
Copy link
Author

brettc commented Aug 25, 2022

There is now a readonly flag that allows read-only numpy arrays to be passed through. It forces the translation through the buffer protocol, and has a static-assert guard so that it should only be combined with numpy.

Both issues are solved, but I wonder if there is a way of combining the readonly/writeable flags. I could not see how to do it as they really serve in different places. For clarity, maybe they should be accepts_readonly and returns_writeable.

@brettc brettc force-pushed the numpy-buffer-bypass branch from 0f898e2 to 94bbb89 Compare August 25, 2022 05:45
@qnzhou
Copy link
Contributor

qnzhou commented Aug 25, 2022

Based on the table below, we only need buffer protocal for 2 cases: python readonly to C++ readonly, and C++ readwrite to python readwrite.

C++ Python C++ -> Python Python -> C++
RO RO dlpack Buffer Protocol
RO RW invalid dlpack
RW RO dlpack invalid
RW RW Buffer Protocol dlpack

It will be great to use the same flag for both cases. Having both writable and readonly may leads to inconsistency settings. It may be helpful to consider writable a property of the nb::tensor obj rather than the C++ or python buffer it binds to. This may help resolve your semantic concern.

@brettc
Copy link
Author

brettc commented Aug 25, 2022

Sorry, I don't understand the table -- it looks misleading. Currently nanobind throws an exception if you send it a RO buffer from python. So neither RO -> RO or RO -> RW from python -> C++ work.

Any readonly flag in C++ (at least for numpy arrays) is simply a warning to the programmer, unless we force the array to be const. Perhaps that is an option, but my c++ templating skills are already stretched!

Maybe @wjakob has some suggestions?

@qnzhou
Copy link
Contributor

qnzhou commented Aug 26, 2022

Sorry, I don't understand the table -- it looks misleading. Currently nanobind throws an exception if you send it a RO buffer from python. So neither RO -> RO or RO -> RW from python -> C++ work.

Sorry, I should clarify the table is a summary of the data transfer technologies that (I think) can be used for transferring data under different read/write permissions. (and by python, I meant numpy array.) It is not what nanobind does currently.

In any case, I agree with you that we should combine nb::writable and nb::readonly. However, I don't fully understand your comment that "they really serve in different places". In my view, they are simply describing the read/write permission of the buffer associated with nb::tensor object regardless of whether it is from C++ to python or the other way around.

@wjakob
Copy link
Owner

wjakob commented Aug 30, 2022

Sorry for taking a long time to respond -- generally this seems like a reasonable set of changes, though it does seem somewhat specific to NumPy. A similar readonly flag is on the horizon for DLPack (data-apis/array-api#191), so whatever this turns into will have to generalize into passing all kinds of read-only tensors made using different frameworks.

A code style comment: in nanobind, opening braces are generally merged with the previous line -- the PR deviates from this convention.

@brettc
Copy link
Author

brettc commented Sep 8, 2022

Hi @wjakob. Thanks for the comments. I'll have a look at the DLPack changes, to try and see how they would fit.

@wjakob wjakob force-pushed the master branch 5 times, most recently from 1c17434 to a2b8b20 Compare October 14, 2022 15:25
@wjakob wjakob force-pushed the master branch 6 times, most recently from 42ce5e7 to 0106656 Compare October 27, 2022 13:05
@wjakob wjakob force-pushed the master branch 2 times, most recently from c41d953 to f935f93 Compare November 9, 2022 20:41
@wjakob wjakob force-pushed the master branch 4 times, most recently from 14dc172 to 16b77b1 Compare November 21, 2022 15:18
@wjakob wjakob force-pushed the master branch 3 times, most recently from d5b6905 to 9af019b Compare November 22, 2022 08:50
@wjakob wjakob force-pushed the master branch 7 times, most recently from 4280c39 to 50e761a Compare February 2, 2023 11:02
@wjakob wjakob force-pushed the master branch 5 times, most recently from ec3373c to b5ed696 Compare February 13, 2023 15:09
@wjakob
Copy link
Owner

wjakob commented Feb 15, 2023

I will close this PR, I don't think it is relevant anymore with recent changes to nanobind. I am happy to reopen it if you disagree.

@wjakob wjakob closed this Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants