Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

append, append_row, append_column: methods for appending an array or single rows and columns #932

Merged
merged 30 commits into from
Apr 13, 2021

Conversation

bluss
Copy link
Member

@bluss bluss commented Mar 5, 2021

For owned arrays specifically (Array), allow appending rows and/or columns or
whole arrays (along a given axis).

New methods:

  • .append_row(ArrayView1<A>) -> Result
  • .append_column(ArrayView1<A>) -> Result
  • .append(axis, ArrayView<A, D>) -> Result
  • .move_into(impl Into<ArrayViewMut>)

New features:

  • stack, concatenate and .select(...) now support Clone elements
    (previously only Copy).

The axis we append along should be a "growing axis", i.e the axis with greatest,
and positive, stride. However if that axis is of length 0 or 1, we can always
convert it to the new growing axis.

These methods automatically move the whole array to a compatible memory
layout for appending, if needed.

The examples show that empty arrays are valid both ways (both memory
layouts) - and that might be the easiest way to use these methods for the users.

There were quite a few corner cases in the implementation in this PR, but hopefully
they should all be dealt with in a moderately clean way and with little duplication.

Fixes #269

@bluss
Copy link
Member Author

bluss commented Mar 5, 2021

Note to those following along: there already exists other ways to build arrays row-by-row or column-by-column: the stack and concat functions.

src/impl_owned_array.rs Outdated Show resolved Hide resolved
@jturner314
Copy link
Member

Functionality like this would be really nice. I think a slightly different API would be more generally useful:

impl<A, D> Array<A, D>
where
    D: Dimension,
{
    /// Appends the subview to the specified axis.
    ///
    /// The memory layout of the array will be changed if necessary (by copying the data)
    /// so that subsequent calls to `.append_to_axis()` with the same axis will be fast.
    /// This occurs in at least the following cases:
    ///
    /// - If the outermost (i.e. largest absolute stride) axis of the array is not `axis`.
    /// - If the array is discontiguous.
    /// - If the stride for `axis` is negative.
    ///
    /// So, calling `.append_to_axis()` with varying values of `axis` will likely be very slow, but
    /// after one call to `.append_to_axis()`, subsequent calls to `.append_to_axis()` with the
    /// *same* axis will be relatively fast.
    ///
    /// **Errors** if the axis lengths of `subview` do not match the correponding axis lengths of `self`.
    pub fn append_to_axis<S2>(&mut self, axis: Axis, subview: &ArrayBase<S2, D::Smaller>) -> Result<(), ShapeError>
    where
        A: Clone,
        S2: Data<Elem = A>,
    {}
}

In other words, it would be nice for the operation to work for n-dimensional arrays of arbitrary layout. The layout of the array would be changed as necessary so that the specified axis would be the outermost axis, so that repeated calls to .append_to_axis() with the same axis would be relatively fast. The remaining question with this design is how to handle the layout of the other axes. We could decide to (1) preserve the existing order, (2) use the order in the subview being appended, or (3) something in-between. I'd suggest option (2), because I think in most cases .append_to_axis() will be called repeatedly with the same axis and subviews of the same layout, and because it would simplify the handling of drops if the .clone() of one of the elements in subview panics (since the new elements would be added in order, like in this PR).

@bluss
Copy link
Member Author

bluss commented Mar 6, 2021

Yeah that would be more useful, but the complexity is higher. Do we have an easy way to get to the implementation?

How to do the trick where we make axis the largest stride axis? If this sketched idea is enough then maybe it's not too complex to implement:

a.swap_axis(0, axis);
let a = a.as_standard_layout().into_owned();
a.swap_axis(0, axis);
// axis is now the largest stride axis; and with that axis ignored, the rest is c-contig

@jturner314
Copy link
Member

jturner314 commented Mar 7, 2021

I suppose I'd suggest something like the following:

impl<A, D> Array<A, D>
where
    D: Dimension,
{
    /// Rearranges the layout if necessary (by copying the data) such that:
    ///
    /// - if the specified axis has length > 1, it is the outermost axis and
    ///   has a positive stride
    /// - the other axes are in standard layout
    /// - the array is contiguous
    ///
    /// # Example
    ///
    /// ```rust
    /// use ndarray::{Array4, Axis};
    ///
    /// let mut arr = Array4::<u8>::zeros([2, 3, 4, 5]);
    /// assert_eq!(arr.shape(), &[2, 3, 4, 5]);
    /// assert_eq!(arr.strides(), &[60, 20, 5, 1]);
    ///
    /// arr.make_axis_outer_pos(Axis(2));
    /// assert_eq!(arr.shape(), &[2, 3, 4, 5]);
    /// assert_eq!(arr.strides(), &[15, 5, 30, 1]);
    /// ```
    pub(crate) fn make_axis_outer_pos(&mut self, axis: Axis)
    where
        A: Clone,
        D: RemoveAxis,
    {
        if self.is_contiguous()
            && is_standard_layout(&self.dim.remove_axis(axis), &self.strides.remove_axis(axis))
            && (self.len_of(axis) <= 1
                || axis == self.dim.max_stride_axis(&self.strides) && self.stride_of(axis) > 0)
        {
            return;
        }
        let ax = axis.index();
        self.dim.slice_mut()[..=ax].rotate_right(1);
        self.strides.slice_mut()[..=ax].rotate_right(1);
        *self = Array::from_shape_vec(self.dim.clone(), self.iter().cloned().collect()).unwrap();
        self.dim.slice_mut()[..=ax].rotate_left(1);
        self.strides.slice_mut()[..=ax].rotate_left(1);
    }
}

(The existing is_standard_layout function would need to be moved out of the .is_standard_layout() method and into e.g. dimension/mod.rs.)

In principle, we could move the elements instead of cloning them, but that would require some unsafe code and would be much more complex to implement (because all the elements in the data which aren't moved would need to be dropped).

This is basically the same approach as what you wrote above, but uses a rotate instead of a swap so that the other axes are in standard layout. This would make it easy to append the subview since the layout would match the order of subview.iter(). Note that if axis has length ≤ 1, we need to be careful to make sure the strides are correct after the subview is appended (since the stride of a length ≤ 1 axis can be anything, and if there's a length 0 axis, then the strides of the other axes can be 0).

@bluss
Copy link
Member Author

bluss commented Mar 31, 2021

I've looked at the append array case, but I'm stuck at the usual, place, handling non-copy elements. I'm stubbornly preferring to not forget/leak values in case of panic, I don't think it's the right thing to do for a Rust data structure. (Stack/concat also have this problem, while map_collect has overcome it 🙂).

@jturner314
Copy link
Member

jturner314 commented Apr 1, 2021

I'd suggest something like the following:

  1. Check that subview has the correct shape and that adding 1 to the length of the axis won't violate the overflow-related constraints of ArrayBase.
  2. Call self.make_axis_outer_pos(axis) to adjust the layout of self. [Edit: As @bluss noticed in his PR, this is not quite sufficient. We also need to ensure that there are no extra elements before/after the array in the underlying OwnedRepr.]
  3. (Debug) assert that self.data.ptr == self.ptr. (make_axis_outer_pos should have ensured that this is the case.)
  4. Call let v = self.data.take_as_vec(); to convert the OwnedRepr to a Vec.
  5. Call v.extend(subview.iter().cloned()).
  6. Convert v back into an OwnedRepr and assign it to self.data.
  7. Update self.ptr to point at the new data, in case v.extend reallocated the Vec.
  8. Update self.shape by adding 1 to the length of the axis.
  9. If the original axis length was 1, then update the stride of the axis to be correct, since the stride of an axis with length 1 could be almost anything. (This step could be performed here, or it could be handled with a modification to make_axis_outer_pos.)

This way, the tricky stuff is handled by Vec::extend, which AFAIK handles panics without leaks. I'm curious how you were thinking about implementing it, and where the leak could occur.

@bluss
Copy link
Member Author

bluss commented Apr 1, 2021

Hm, my implementation uses Zip to make sure shape is traversed in the right order. To use Vec::extend we need to guarantee the array is C-contig except for the axis we are extending, which is a further requirement (or, at least, whatever order subview.iter() or similar would be using). So I started a bit different. I'd ideally see that we don't depend on Vec at all - looking forward to future custom allocation, but that concern is just a minor thing in the details here.

The tricky stuff is our nd shape 🙂

@jturner314
Copy link
Member

jturner314 commented Apr 1, 2021

To use Vec::extend we need to guarantee the array is C-contig except for the axis we are extending

The make_axis_outer_pos implementation from my earlier comment does this (cloning the data if necessary), although I see now that the name doesn't indicate that. Maybe a better name would be make_axis_outer_standard, or the body should just be placed in append_to_axis instead of having a separate function. An implementation based on make_axis_outer_pos isn't necessarily the fastest implementation, but it's fairly simple.

I guess you were thinking of preserving the original layout of the array as long as the axis being appended to is the outermost one with a positive stride? That would be possible, although as you pointed out, the implementation would be tricky due to iteration order / leaking issues. It would be nice to do something like that in the future, but I don't think it's necessary for an initial implementation.

I'd ideally see that we don't depend on Vec at all - looking forward to future custom allocation, but that concern is just a minor thing in the details here.

I'm curious about this. I'd think we'd want to write an allocator which implements the std::alloc::Allocator trait (which will hopefully be stabilized at some point), and Vec has an A: Allocator type parameter, so it would work with our custom allocator. Or, are you thinking about creating an ndarray-specific allocator before std::alloc::Allocator is stabilized?

@bluss
Copy link
Member Author

bluss commented Apr 1, 2021

I just think I'm following my own implementation, not writing down your ideas here :) So that's the reason. I was standing still at that problem in my implementation, not using your implementation idea from your comment. Just to clarify.

I was thinking of more ndarray specific allocation needs, like aligned allocation or possibly the idea to be compatible with the arrow format. The Allocator trait is kind of not really even solving that problem, I think? Since we could want to tweak allocation alignment and not which allocator it calls, for example.

@jturner314
Copy link
Member

jturner314 commented Apr 1, 2021

Oh, okay, I understand.

Regarding the allocator stuff, I was thinking of something like this:

pub struct MinAlignAllocator<A: Allocator, const MIN_ALIGN: usize>(A);

impl<A: Allocator, const MIN_ALIGN: usize> Allocator for MinAlignAllocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        self.0.allocate(layout.align_to(MIN_ALIGN).unwrap_or(AllocError)?)
    }

    fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        self.0.deallocate(ptr, layout.align_to(MIN_ALIGN).unwrap())
    }
}

We'd add an L: Allocator parameter to ArrayBase. We could use a default of L = Global and always wrap the allocator in MinAlignAllocator internally (if we always want a minimum alignment), or we could use a default of L = MinAlignAllocator<Global, 64> and directly use L internally (if we want to allow allocations without the minimum alignment). This would be compatible with zero-cost conversions to/from Vec<T, MinAlignAllocator<A, 64>>.

[Edit: Another option for the API if we always want to force a minimum alignment would be to add an unsafe MinAlignAlloc trait which would mean "allocations from this allocator have a minimum alignment", and then use a L: Allocator + MinAlignAlloc<64> parameter for ArrayBase. We'd also provide the MinAlignAllocator struct, which would implement Allocator and MinAlignAlloc, and use it as the default allocator for arrays. This trait-based option would be a little better if the Arrow crate adds its own allocator like MinAlignAllocator, since it could implement the MinAlignAlloc trait.]

A custom allocator like this is the cleanest way I see to handle the minimum-alignment issue, since the GlobalAlloc/Allocator traits require using the same alignment in alloc/allocate and dealloc/deallocate. So, there currently isn't a good way to get a minimum alignment for array allocations while also supporting zero-cost conversions to Vec (since the Vec needs to know the alignment of the original allocation so that it can be correctly dropped, but Vec<T> just assumes that the alignment of the allocation is the alignment of T).

In current Rust, I suppose we could change OwnedRepr to use GlobalAlloc directly and force a minimum alignment, since we have complete control over allocation/deallocation of OwnedRepr, but that would mean that we couldn't have a zero-cost conversion from OwnedRepr to Vec.

@bluss bluss changed the title Add try_append_row and try_append_column Add try_append_array; methods for appending an array or single rows and columns Apr 3, 2021
@bluss bluss force-pushed the append-row-column branch 2 times, most recently from 0019cd4 to 7b97f84 Compare April 3, 2021 20:58
@bluss
Copy link
Member Author

bluss commented Apr 3, 2021

I solved my predicament with the "memory fill order" with Zip. Using a technique I think we have discussed before but I don't recall that it's in the source anywhere: (I think it's cool 🙂) (Edit: Maybe we did not. I realized it's something I'm using in my under cover by value/IntoIterator implementation that's not yet finished.)

To recover the desired traversal order for array views memory and b:

  1. Given Zip::from(memory).and(b)
  2. Apply axis inversions to memory to give it all-positive stride axes, and invert the same axes in b
  3. Apply axis order swaps on memory to give it standard memory layout and swap the same axes in b

Thanks @jturner314 for the reminder to fix up strides. I didn't quite realize how involved it might be.

@bluss bluss added this to the 0.15.1 milestone Apr 3, 2021
@bluss bluss modified the milestones: 0.15.1, 0.15.2 Apr 3, 2021
@bluss bluss force-pushed the append-row-column branch 2 times, most recently from 32fc3a3 to dd353b8 Compare April 3, 2021 22:32
@bluss

This comment has been minimized.

@bluss bluss changed the title Add try_append_array; methods for appending an array or single rows and columns (WIP) Add try_append_array; methods for appending an array or single rows and columns Apr 6, 2021
@bluss bluss changed the title (WIP) Add try_append_array; methods for appending an array or single rows and columns append, append_row, append_column: methods for appending an array or single rows and columns Apr 7, 2021
@bluss

This comment has been minimized.

@bluss
Copy link
Member Author

bluss commented Apr 13, 2021

Thanks a lot for the comments and bugfixes 🙂.

I don't know about append versus push, it's quite arbitrary to me. Append is not the wrong word for any of that, it's just about Rust conventions I think.

I looked into std collections and they are consistent with append always being something like Vec::append(&mut self, &mut Vec<_>); where append moves ownership from another &mut Self into the receiver.

My only thought is that it's not inconceivable to generalize our .append(arrayview) to take either an arrayview (and Clone) or an Array (and move elements), but it's a bit much to do as an extra project inside this PR.

@bluss bluss force-pushed the append-row-column branch 2 times, most recently from 67afa5b to 470b26a Compare April 13, 2021 19:34
bluss and others added 13 commits April 13, 2021 22:01
Now without the tricky memory layout failure mode, these methods can
just have names without try_ prefix. It's expected that they work
successfully as long as the user matches up array sizes correctly.
This method showed up in profiling because it wasn't inlinable.
This speeds up the benches/append.rs benchmarks by 5-10%.
The 1D case is a simpler gather operation since we only select 1 element
per index. Special-case it.

The performance increase for this benchmark is that the benchmark
runtime changes by -92% with this change.
If the innermost dimension is contiguous, we can skip it in one go,
and save some work in the dropping loop in move_into.

Co-authored-by: Jim Turner <[email protected]>
Benchmarks (append benchmarks) show that this is a worthwhile
optimization, and that plain self.data.set_len() in the loop performs
worse.
Imagine that the array-to-append has length 1 along the growing axis.
With this memory layout, the resulting assignment when copying the array
elements into `self`, can often be a contiguous traversal.

Co-authored-by: Jim Turner <[email protected]>
Fix two bugs that Jim found in how we calculate the new stride for the
growing axis.
Tests by Jim Turner.

Co-authored-by: Jim Turner <[email protected]>
This .max_by_key() call showed up in profiling, and this change improves
the select_axis0 benchmark by reducing runtime by 15%
@bluss
Copy link
Member Author

bluss commented Apr 13, 2021

Merging to finish this work. Renaming can still be done in follow-ups.

Naming suggestions welcome - append or extend_with_array? append_row or push_row? etc

@bluss bluss merged commit 77c6286 into master Apr 13, 2021
@bluss bluss deleted the append-row-column branch April 13, 2021 21:28
@gussmith23
Copy link

this is awesome! really helpful to have clone supported on these methods

@bluss
Copy link
Member Author

bluss commented Apr 13, 2021

I'm glad you share my enthusiasm. 🙂 I'm happy to finally have arrays that can be appended to.

@jturner314
Copy link
Member

jturner314 commented Apr 16, 2021

I'm really looking forward to this functionality too. 🎉 There are a number of places where it'll simplify my code.

To clarify my comment about the naming question -- I think that if we name the row/column methods and the general methods similarly, then the general method should also take D::Smaller for consistency:

// On 2-D arrays
fn append_row(&mut self, row: ArrayView<A, D::Smaller>) {}
fn append_column(&mut self, column: ArrayView<A, D::Smaller>) {}

// On n-D arrays
fn the_general_method(&mut self, axis: Axis, array: ArrayView<A, D::Smaller>) {}

I agree that it's helpful to name the methods similarly to Vec's methods. IMO, it makes sense for the axis being appended to be thought of analogously to Vec. Conceptually, you could think of appending to axis 1 of an Array3 of shape [2, 3, 4] as being analogous to appending to a Vec<Array2> where the Vec has length 3 and the Array2s have shape [2, 4]. With this analogy, adding a single subview (of dimension D::Smaller) to the array is like adding a single element to the Vec with push, while adding a view of dimension D (which is similar to &[ArrayView<A, D::Smaller>]) is like adding a sequence of elements to a Vec with append/extend_from_slice.

So, I'd actually suggest four methods:

// On 2-D arrays
fn push_row(&mut self, row: ArrayView<A, D::Smaller>) {}
fn push_column(&mut self, column: ArrayView<A, D::Smaller>) {}

// On n-D arrays
fn push(&mut self, axis: Axis, array: ArrayView<A, D::Smaller>) {}
fn append(&mut self, axis: Axis, array: ArrayView<A, D>) {} // or "extend_from_view"

The two separate methods on n-D arrays are both useful. push is useful for building arrays by stacking subviews one-by-one, while append is useful for building arrays from larger chunks. For my personal use, push is actually more useful than append, because I'm usually performing some calculations in a loop and want to stack the results into an array along a new dimension corresponding to the loop.

The primary way in which this analogy isn't quite right is that ndarray's methods take views, while Vec's push and append methods take owned values, but I think that's less important with respect to naming than the number of dimensions. If we add support for moving owned values in the future, we could either modify the existing methods to take an argument with S2: Data, A: Clone (like into_owned does), or we could rename the existing methods which take views to something like push_view/push_cloned/push_from_view/push_view_cloned/etc.

@bluss
Copy link
Member Author

bluss commented Apr 16, 2021

That sounds good. I think you have convinced me. (however, I don't see the push_clone/push_view etc names being needed, and I hope not they will be.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Clone elements in stack and select
3 participants