Add offset_to_first_elem method for Array #994

jturner314 · 2021-05-09T04:03:21Z

This adds a method to get the offset from the start of the allocation to the first element. Without this method, correctly reinterpreting the results of .into_raw_vec() as an n-D array is tricky.

I came across a bug in the numpy crate (PyO3/rust-numpy#182) due to not handling this properly, so the the primary purpose of this PR is to help properly fix that bug.

The zero-sized-element case is a bit weird, because for many use cases, a return value of Some(0) would be fine. However, if we want the user to be able to compute in-bounds indices into the Vec using the offset, shape, and strides, we need to compute an offset that will work based on the shape and strides.

I've fixed/suppressed a couple of clippy warnings unrelated to offset_to_first_elem which have appeared with the new release of Rust.

By the way, is there a better name for this method?

bluss · 2021-05-09T12:03:18Z

src/impl_owned_array.rs

+            return None;
+        }
+        let offset = if std::mem::size_of::<A>() == 0 {
+            -dimension::offset_from_ptr_to_memory(&self.dim, &self.strides)


can't we use this dimension function in both cases?

No, because the shape and strides are not sufficient information. Consider, for example:

let a = Array2::<f64>::zeros([3, 4]).slice_move(s![1.., ..]); let b = Array2::<f64>::zeros([2, 4]); assert_eq!(a.shape(), b.shape()); assert_eq!(a.strides(), b.strides()); assert_ne!( unsafe { a.as_ptr().offset_from(a.into_raw_vec().as_ptr()) }, unsafe { b.as_ptr().offset_from(b.into_raw_vec().as_ptr()) }, );

a and b have the same shape and strides but different offsets.

The only reason for calling dimension::offset_from_ptr_to_memory here is to get an offset such that indices computed with offset + index[0]*strides[0] + index[1]*strides[1] + ... will be in-bounds for the result of .into_raw_vec(). We could return value return None or Some(0) for the zero-sized element case, but then it's more likely that the caller will have to explicitly deal with the zero-sized element case too.

Oh right. Thanks! That's the three "starts" of the array in memory again and I mixed up (1) and (2)...

start of memory allocation

lowest address of an element in the array

address of logically first element in the array..

I guess if we try and find good names for these (again). I wouldn't mind allocation head, memory head, logical head or something along those lines. Maybe "memory" is too vague to be useful(?)

I like "allocation ptr" and "logical ptr/offset". "Memory head" to refer to the lowest address of any element could be easily confused with the "allocation head". I think "low addr ptr/offset" is a little less likely to be confused with "allocation ptr".

I mixed up (1) and (2)...

Yeah, this makes me realize that we should really rename dimension::offset_from_ptr_to_memory to dimension::offset_from_logical_ptr_to_low_addr_ptr. (Actually, I'd prefer to have a dimension::offset_from_low_addr_ptr_to_logical_ptr function instead so that the return value is always nonnegative.)

I don't think "logical ptr" is very descriptive either. Offset >= 0 sounds good.

I don't mind "logical ptr", but we can think of another name. Maybe "zero idx ptr" / "zero index ptr"?

bluss · 2021-05-09T17:45:26Z

src/impl_owned_array.rs

+    /// let offset = arr.offset_to_first_elem().unwrap();
+    /// assert_eq!(offset, 0);
+    /// ```
+    pub fn offset_to_first_elem(&self) -> Option<usize> {


Maybe we can use the name logic "offset from memory to first elem" again? (Edit: So maybe offset from allocation to first elem?)

Yeah, offset_from_memory_to_ptr, offset_from_memory_to_first, offset_from_memory_to_first_elem, offset_from_alloc_to_ptr, offset_from_alloc_to_first, and offset_from_alloc_to_first_elem are clearer than offset_to_first_elem. Another, perhaps better, option is to change .into_raw_vec() to return (vec, offset) instead of adding a new method. This would make it harder to accidentally misuse .into_raw_vec().

Yes, if the offset is enough information, maybe that's good. Maybe we should just offer an easier to use interface? Something that copies to an in-order vec without trickery (but I think it already exists).

Now it doesn't seem to be happening quickly, but there will come a time when we have more widespread of custom allocation or non-Vec allocation. So into raw vec won't be useful forever, unless one restricts to only Vec-allocated arrays.

From the discussion in the other comment chain, I'm thinking that a good name would be offset_from_alloc_to_logical_ptr. How about making it an internal-only method for now and changing .into_raw_vec() to return (vec, offset) as I suggested above?

Edit: For some reason, GitHub is not automatically updating the discussion, so I missed the message above. It seems useful to be able to reuse the existing allocation if possible, like the numpy crate does, so I don't think that we should remove into_raw_vec entirely. Even with custom allocation, I'd think that we could return a Vec parameterized on the custom allocator.

You know, now that I think about it, I wonder if std will add a trait for Vec-like objects to minimize the proliferation of type parameters. The current ndarray API based on ArrayBase instead of traits can be inconvenient sometimes. Once custom (non-global) allocators are supported, Vec will be similar to ArrayBase, since it'll be generic over the allocator.

Makes sense.

I've long thought that we might parameterize the Array/ArrayBase not just by the Vec's allocator but by wholly new kinds of owned arrays - let's say some other allocation system, like always 64-byte aligned or what it could be, or maybe always using some external specific allocation function. In those cases we don't use Vec or std allocation traits at all, maybe.

Yeah, an ArrayVec-backed n-D array type would be useful too. Even with additional owned types/parameters, though, we could provide .into_raw_vec() for only the type parameters that make sense. Long-term, I'd like to move away from ArrayBase providing the majority of the functionality to a set of traits describing n-D array functionality, implemented for the various array types.

Fully agreed. It just seems to be a lot of work getting to the traitification. Maybe we can start it small and do it piece by piece, somehow?

Yeah, I think we can do it piece-by-piece. We can introduce traits one-by-one and implement them for ArrayBase. For example, we can start with an unsafe trait to get the ptr, dim, and strides, and then add a trait using it for readonly indexing/slicing (with methods returning element references and ArrayViews). I expect it'll take multiple iterations to fully refine the set of traits we need.

…al_ptr

bluss · 2021-05-09T19:53:01Z

While it's a breaking change, we wait with merging this until the 0.15.x releases are done

SparrowLii · 2021-05-12T09:39:34Z

src/dimension/mod.rs

-pub fn offset_from_ptr_to_memory<D: Dimension>(dim: &D, strides: &D) -> isize {
+/// This function computes the offset from the lowest address element to the
+/// logically first element. The result is always >= 0.
+pub fn offset_from_low_addr_ptr_to_logical_ptr<D: Dimension>(dim: &D, strides: &D) -> usize {
    let offset = izip!(dim.slice(), strides.slice()).fold(0, |_offset, (d, s)| {
        if (*s as isize) < 0 {


I am thinking about whether it would be better to change the condition to if (*s as isize) < 0 && *d != 0. In this way, we can allow the stride to be negative when the Axis value is 0 (theoretically, the stride can be any value at this time), and it can avoid pointer out of bounds and cause panic in the following case:

use ndarray::{Array, ShapeBuilder}; fn main() { let s = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]; let a= Array::from_shape_vec((2, 0, 2).strides((1, -4isize as usize, 2)),s[0..12].to_vec()).unwrap(); println!("{:?}", a); }

Output:

[[[]]], shape=[2, 0, 2], strides=[1, -4, 2], layout=Cc (0x5), const ndim=3

Good point, this seems like a bug that should be fixed in 0.15.2

Yeah, good catch, @SparrowLii. I've created #998 to fix this in 0.15.2.

jturner314 added 2 commits May 8, 2021 23:58

Add offset_to_first_elem method for owned arrays

135d555

Fix/suppress clippy warnings

275e83e

jturner314 mentioned this pull request May 9, 2021

IntoPyArray for Array is broken when the first element is not at the start of the allocation PyO3/rust-numpy#182

Closed

jturner314 changed the title ~~Add offset_to_first_elem method for owned arrays~~ Add offset_to_first_elem method for Array May 9, 2021

bluss reviewed May 9, 2021

View reviewed changes

jturner314 added 4 commits May 9, 2021 14:57

Change offset_from_ptr_to_memory to offset_from_low_addr_ptr_to_logic…

6812592

…al_ptr

Rename offset_to_first_elem to offset_from_alloc_to_logical_ptr

c33e03d

Add offset to return value of into_raw_vec

e496e63

Make offset_from_alloc_to_logical_ptr private

b8144e1

bluss added the breaking-change label May 9, 2021

bluss added this to the 0.16.0 milestone May 9, 2021

SparrowLii reviewed May 12, 2021

View reviewed changes

jturner314 mentioned this pull request May 13, 2021

Fix offset_from_ptr_to_memory for axis lengths of 0, and replace it with offset_from_low_addr_ptr_to_logical_ptr #998

Merged

kngwyu mentioned this pull request Jul 12, 2021

Fix IntoPyArray for arrays of which the starting pointer is not equal… PyO3/rust-numpy#194

Merged

bluss mentioned this pull request Apr 1, 2024

Add .into_raw_vec_with_offset() and deprecate .into_raw_vec() #1379

Merged

bluss closed this in #1379 Apr 4, 2024

lnicola mentioned this pull request Oct 10, 2024

upgrade ndarray to 0.16 georust/gdal#569

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add offset_to_first_elem method for Array #994

Add offset_to_first_elem method for Array #994

jturner314 commented May 9, 2021 •

edited

Loading

bluss May 9, 2021

jturner314 May 9, 2021

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 12, 2021

jturner314 May 13, 2021

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 9, 2021

jturner314 May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 9, 2021

jturner314 May 9, 2021

bluss commented May 9, 2021

SparrowLii May 12, 2021 •

edited

Loading

bluss May 12, 2021 •

edited

Loading

jturner314 May 13, 2021 •

edited

Loading

Add offset_to_first_elem method for Array #994

Add offset_to_first_elem method for Array #994

Conversation

jturner314 commented May 9, 2021 • edited Loading

bluss May 9, 2021

Choose a reason for hiding this comment

jturner314 May 9, 2021

Choose a reason for hiding this comment

bluss May 9, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

bluss May 12, 2021

Choose a reason for hiding this comment

jturner314 May 13, 2021

Choose a reason for hiding this comment

bluss May 9, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

bluss May 9, 2021

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

bluss May 9, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 9, 2021 • edited Loading

Choose a reason for hiding this comment

bluss May 9, 2021

Choose a reason for hiding this comment

jturner314 May 9, 2021

Choose a reason for hiding this comment

bluss commented May 9, 2021

SparrowLii May 12, 2021 • edited Loading

Choose a reason for hiding this comment

bluss May 12, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 May 13, 2021 • edited Loading

Choose a reason for hiding this comment

jturner314 commented May 9, 2021 •

edited

Loading

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

bluss May 9, 2021 •

edited

Loading

jturner314 May 9, 2021 •

edited

Loading

SparrowLii May 12, 2021 •

edited

Loading

bluss May 12, 2021 •

edited

Loading

jturner314 May 13, 2021 •

edited

Loading