implement read-only memoryviews #1869
Nice to see some progress on this, since it would be very nice to have for scikit-learn (and I guess pandas has similar problems too)! I quickly checked out this PR and it does indeed seem to fix the problem. I am not a Cython expert, unfortunately. I would be keen to help out on this one, but I am not sure how useful I can be ... |
I agree that this is looking great from the perspective of what's currently tested. What more do you feel needs to be tested?
By compile time checks I assume you mean that the compilation should error if a const memoryview is mutated...? |
From scikit-learn perspective, I don't think we care that much about compile-time checks, would you agree @jnothman? The crucial thing we need is to be able to create a typed memoryview from a read-only buffer. |
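For readers unfamiliar with the underlying problem, here is a rough pure-Python analogue (not Cython code, and not part of this PR). It uses the built-in `memoryview` over `bytes` (read-only) and `bytearray` (writable) to illustrate what a "read-only buffer" means at the buffer-protocol level. The helper name `try_write` is hypothetical and only serves the illustration.

```python
def try_write(buf):
    """Try to write one byte through a memoryview of `buf`.

    Returns True if the write succeeded, False if the buffer is read-only.
    """
    view = memoryview(buf)
    try:
        view[0] = 0
    except TypeError:
        # Item assignment on a read-only buffer raises TypeError in Python.
        return False
    return True


read_only = bytes(b"abc")       # bytes exports a read-only buffer
writable = bytearray(b"abc")    # bytearray exports a writable buffer

print(memoryview(read_only).readonly)  # True
print(memoryview(writable).readonly)   # False
print(try_write(read_only))            # False
print(try_write(writable))             # True
```

The point of the feature discussed here is that a typed memoryview argument should be able to accept the first kind of buffer as long as the function only reads from it.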
I suppose you mean that a basic implementation with the right interface
would be a very welcome interim solution
|
There aren't really any dedicated tests for the whole feature. I'm sure there are various corner cases and likely also cases where it misdetects usage patterns etc., e.g. when passing memory views around across functions. If someone who wants this feature and thus knows how to make use of it could write more targeted tests for this, it would become much clearer in what state this feature is and what is left to do to get it released. If the answer is "nothing", then that's perfect and I'll click the merge button, but I doubt it.
Yes. That also needs test code. |
I suppose one option might be trying to make use of this in a project (e.g. scikit-learn) in a pull request and seeing what can't be expressed and compiled... Sounds like lots of work though :\ |
Sounds like lots of work though :\
Well, it does seem to me that there was a certain user interest in this feature. Releasing a change that likely breaks user code is a risk for us maintainers, because it means that we'll probably have to respond to bug reports quickly once it's out, and might have to disable things that others have already started relying on, or try to improve them quickly, now that people have finally been forced to at least test them on their code...
Anyway, given that 0.28 already contains so many major new features that one more doesn't really make it much worse, and given that at least one person seems to have tried this without reporting any breakage, I'll consider taking that additional risk as well.
|
Can I please confirm that this does not introduce any syntax, merely allows a read-only memoryview until writing is required? |
For the record I tried to compile scikit-learn using this branch and I get a similar error as the ones you can currently see on Travis (all the Travis builds seem to be failing currently).
scikit-learn compilation error
|
Thanks for working on this @scoder , it is very much appreciated. I have tried to make some self contained tests in rth/cython-mmview-ro. So far I'm testing that,
I'm not sure if there are other corner cases which would be worth testing? BTW, I'm also getting a few warnings at compilation time,
similarly to what was reported in #1985 (comment). That's not related to this PR though, I also get those on master... |
@scoder mentioned passing between functions...
|
The "obvious" issue with passing memoryviews through functions is that one function might acquire a read-only view, pass it into another function, and that function might try to write to it. That should a) not fail and b) probably acquire a writable view for now. |
I merged the latest master into the branch to fix a recent (unrelated) regression.
That's actually not how it should work. It's ok to pass a writable view into a function that requires a read-only view, but not the other way round. I stole some of your tests and created a dedicated test suite file from them. Seems to work for me. Let's see what other users find when they bump into this feature. |
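The direction rule described above (writable into read-only is fine, read-only into writable is not) can be sketched in plain Python. This is only an analogy using the built-in `memoryview`, assuming Python 3.8+ for `toreadonly()`; the function names `read_first` and `write_first` are hypothetical, and in Cython the check would come from the view's declared type rather than from an explicit `if`.

```python
def read_first(view):
    """Only reads, so it accepts both writable and read-only views."""
    return view[0]


def write_first(view, value):
    """Writes, so it has to reject read-only views up front."""
    if view.readonly:
        raise ValueError("write_first() needs a writable view")
    view[0] = value


rw = memoryview(bytearray(b"\x01\x02\x03"))
ro = rw.toreadonly()    # read-only view of the same memory

read_first(rw)          # fine: a writable view used where reading suffices
read_first(ro)          # fine
write_first(rw, 9)      # fine: rw is writable

try:
    write_first(ro, 9)  # read-only view passed where writing is required
except ValueError as exc:
    print("rejected:", exc)
```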
Thanks for the explanations!
So I tried that here i.e. an outer function that passes the memory view to the inner function that then tries to modify it. When the memoryview is ro,
Right, makes sense. |
I'll let travis give it a run and if it's happy (enough), I'll merge it in its current state. It seems good enough to hand it to our users. |
It looks like there is just one docstring failing in Travis. Another piece of good news is that at least one build of pandas built successfully with Cython from this PR (the others are still running at the time of writing). The not so good news is that scikit-learn gets a compilation error with Cython from this PR (it does pass with Cython from the master branch). I should be able to investigate and isolate the issue tomorrow (to add it to the test suite), but it looks like it might have something to do with memoryviews with fused types passed between functions... I'm not sure how you would prefer to handle this @scoder, and whether it would have an impact on this PR or if I should open a separate issue? Thanks! |
Yeah ... turns out that travis is not happy. In fact, this is enough to confuse the auto-readonly detection:
    def writable(int[:] mslice):
        new_var = mslice
        new_var[2] = 23
This fails because it requests a read-only view, which is then not assignable through Python's item setting. Thus, I cannot seriously consider this part of the feature ready. It needs more work. I'll disable the auto-readonly detection, then you can give it another try in scikit-learn to see if that fixes it. |
So to recapitulate, to check that I understood correctly, the example from #1605 (comment) would fail with a ro input array, unless the function was defined with @lesteve @jnothman For joblib related parallelism in scikit-learn, specifying ro memoryviews manually with |
Yes, that is correct. And yes, that is a major improvement. |
…uire them only as read-only if the memory view dtype is a "const" type.
…s, but keep some of the tests for 'const' views as smoke tests.
I disabled the auto-detection and fixed a couple of further issues. The "usual suspect" tests pass, but travis has some serious problems today, so I'll take a look at it tomorrow. @rth: would be nice if you could already retest on your side, now that the feature is in a safer state. Note that simply changing the memory view declaration to |
Ok, travis likes it and the latest pandas build also seems to have succeeded. Let's get it in. |
very exciting! Thanks!
|
I also think I found a fix for the scikit-learn compilation failure. Please try with the master branch from now on. |
scikit-learn also builds successfully with the current master branch, so everything looks great! Thanks for making this happen! |
Great stuff, thanks a lot @scoder! I played with this a bit, and I get the impression that
Another thing I noticed: I was not able to find a way to create a writable memoryview from a readable one (using cython master):
    import numpy as np
    cdef const double[:] ro = np.array([1., 2.])
    cdef double[:] rw = np.empty_like(ro)
    rw[:] = ro
The error I get is: |
Thanks for the reproducer, fixed here: 6704d23 |
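For comparison, the same copy pattern (allocate a fresh writable buffer, then slice-assign from the read-only view) can be sketched with Python's built-in `memoryview` and the stdlib `array` module. This is an analogy only, assuming Python 3.8+ for `toreadonly()`; it says nothing about how the Cython fix referenced above is implemented.

```python
from array import array

source = array("d", [1.0, 2.0])
ro = memoryview(source).toreadonly()    # read-only view of two doubles

target = array("d", [0.0] * len(ro))    # freshly allocated, writable
rw = memoryview(target)

rw[:] = ro          # element-wise copy from the read-only view
rw[0] = -999.0      # modifies the copy only

print(source.tolist())  # [1.0, 2.0]
print(target.tolist())  # [-999.0, 2.0]
```

The slice assignment works here because both views have the same item format and shape; the original data stays untouched.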
No idea what In any case, whatever 0.27 did is not going to change any more in that release series. |
This seems to work fine on cython 0.27:
    cpdef func(const double[:] arr):
        # of course the const does not do anything for cython < 0.28 so you can change the array
        arr[0] = -999

    import numpy as np
    arr = np.ones(3)
    func(arr)
Just trying to clarify, scikit-learn requires cython >= 0.23 at the moment. I am trying to investigate whether it is possible for scikit-learn to have code that is compatible with both cython < 0.28 and cython >= 0.28. |
What do we lose by requiring cython>=0.28?
We could do that but we will have to wait until cython 0.28 is released first. If there is an easy way to support both cython > 0.27 and cython <= 0.27, this may make it easier to use the "const typed memoryview" feature incrementally in scikit-learn, experiment with this feature and possibly iron out a few quirks before cython 0.28 is released. |
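One way a build script could gate on the Cython version is sketched below. This is a minimal illustration, not something taken from scikit-learn or Cython: the helper name `supports_const_memoryviews` is hypothetical, and it assumes the caller passes a version string such as the one exposed by `Cython.__version__`. The 0.28 threshold comes from the discussion above.

```python
import re


def supports_const_memoryviews(cython_version):
    """Return True if `cython_version` (e.g. "0.27.3") is at least 0.28.

    Only the leading "major.minor" part is compared, so a pre-release
    such as "0.28b1" is treated as supporting the feature.
    """
    match = re.match(r"(\d+)\.(\d+)", cython_version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (cython_version,))
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 28)


for version in ("0.23", "0.27.3", "0.28", "0.28b1"):
    print(version, supports_const_memoryviews(version))
```

As noted earlier in the thread, 0.27 already accepts the `const` declaration without enforcing it, so a gate like this would mainly matter for deciding when read-only inputs can actually be relied on.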
The generated C code on 0.27 with and without
    1542c1542
    < static PyObject *__pyx_f_6mmview_func(__Pyx_memviewslice const , int __pyx_skip_dispatch); /*proto*/
    ---
    > static PyObject *__pyx_f_6mmview_func(__Pyx_memviewslice, int __pyx_skip_dispatch); /*proto*/
    1842c1842
    < static PyObject *__pyx_f_6mmview_func(__Pyx_memviewslice const __pyx_v_arr, CYTHON_UNUSED int __pyx_skip_dispatch) {
    ---
    > static PyObject *__pyx_f_6mmview_func(__Pyx_memviewslice __pyx_v_arr, CYTHON_UNUSED int __pyx_skip_dispatch) {
but I imagine as long as it works and produces the expected result, it might be OK.. |
See issue #1605.
Needs more testing and probably also more compile time checks.