Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement drain RFC #23071

Closed
wants to merge 4 commits into from
Closed

Implement drain RFC #23071

wants to merge 4 commits into from

Conversation

mahkoh
Copy link
Contributor

@mahkoh mahkoh commented Mar 5, 2015

Closes #23055

[breaking-change]

r? @gankro

Closes #23055

[breaking-change]
@Gankra
Copy link
Contributor

Gankra commented Mar 5, 2015

Oh woah the RFC landed.

@@ -553,8 +553,8 @@ impl<T: Ord> BinaryHeap<T> {
#[inline]
#[unstable(feature = "collections",
reason = "matches collection reform specification, waiting for dust to settle")]
pub fn drain(&mut self) -> Drain<T> {
Drain { iter: self.data.drain() }
pub fn drain<'a>(&'a mut self) -> Drain<'a, T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the lifetimes can be elided here?

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 6, 2015

Updated.

left: *const T,
right: *const T,
marker1: PhantomData<&'a ()>,
marker2: PhantomData<T>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were going to write Drain in pure safe code, it would look something like:

pub struct Drain<'a, T:'a> {
    start: uint,
    len: uint,
    victim: &'a mut Vec<T>
}

then on each call to next I could call remove on the item at self.start and decrement self.len. When dropping I would keep calling next self.len times. This should be roughly equivalent to the current behavior, though the modifications to the vector happen at different times, but I believe that is unobservable.

Therefore, I believe the marker you want is PhantomData<&'a mut Vec<T>>.

It is reasonable to want to put PhantomData<T>, because the destructor will drop instances of T, but not necessary -- the dropck will already be imposing conservative rules on Drain because it contains borrowed data. Basically, if you model what you would have written in safe code, you should be ok (unless the language itself is flawed).

cc @pnkfelix

UPDATE: Fixed a reference to &'a Vec<T> that should have been &'a mut Vec<T>

@nikomatsakis
Copy link
Contributor

To summarize a length discussion on IRC about markers:

  1. What I wrote here is definitely the most straight-forward annotation: Implement drain RFC #23071 (comment)
  2. However, it would lead to invariance, which given the public API of Drain is stricter than necessary.
  3. &'a T would induce covariance and suggests the correct relationship between 'a and T
  4. However, @mahkoh argues that in principle some sufficiently smart compiler might consider the substituted form, and hence it would see &'a &mut U, and thus consider that since &mut is affine and you can't move from a & reference, it is safe to approximate U (be covariant w/r/t U). This is true as far as it goes (it's not what our compiler does, and I'm not sure that it would ever be a reasonable thing to do). For this reason, he argues for &'a () (to capture the lifetime) and fn() -> T (to capture the covariance). I don't know that this distinction is so important, but I agree that this would be a sound formulation, which is my primary concern here.

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 6, 2015

Updated.

@nikomatsakis
Copy link
Contributor

ps this API is pretty nifty.

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 6, 2015

Updated.

@Gankra
Copy link
Contributor

Gankra commented Mar 6, 2015

I believe this doesn't closr the rfc issue proper. At least VecDeque is outstanding.

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 6, 2015

VecDeque was not part of the RFC. It talked only about Vec and String.

@aturon
Copy link
Member

aturon commented Mar 6, 2015

cc me

@aturon
Copy link
Member

aturon commented Mar 6, 2015

@mahkoh @gankro

VecDeque was not part of the RFC. It talked only about Vec and String.

I think there was some oversight in terms of the full force of the RFC. Although the RFC was explicitly only talking about String and Vec, I had assumed that the drain convention would apply to any collection for which ranges make sense, at least. For things like HashMap, I expect we won't have a range parameter.

If needed, we can file an amendment to the RFC to clarify this point.

@aturon
Copy link
Member

aturon commented Mar 6, 2015

Also, the RFC left unspecified the exact trait used for capturing the various range types (which was a point of frustration, since it's much better to specify these details up front), but I personally would strongly prefer to introduce a general trait that covers the various ranges over a given type T. This is needed in a few other places as well.

cc @alexcrichton

@Gankra
Copy link
Contributor

Gankra commented Mar 6, 2015

By contrast I'd rather favour drain be a trait that the collection implements for particular inputs. This allows the iterator to be specialized for the different kind of range for performance reasons as necessary (e.g. FullRange or usize can be quite simpler).

However I think this needs HKT.

@Gankra
Copy link
Contributor

Gankra commented Mar 6, 2015

For reference, I picture:

trait Drain<Input> {
  type Drain<'*>; // HKT
  fn drain<'a>(&'a mut self, input: Input) -> Drain<'a>;
}

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 7, 2015

What's stopping this PR from being merged right now? A new RFC can get a new PR and for the most common use case, calling drain with an explicitly typed range parameter, all changes will be backwards compatible.

@liigo
Copy link
Contributor

liigo commented Mar 7, 2015 via email

@alexcrichton
Copy link
Member

I personally would strongly prefer to introduce a general trait that covers the various ranges over a given type T

I agree, but this is all #[unstable] for now so it's fine by me to land this in the interim. We definitely don't want to go too crazy in adding 4 of these, however.

@pnkfelix
Copy link
Member

pnkfelix commented Mar 9, 2015

@liigo I think the Vec is borrowed for the lifetime.of the drain, so one cannot push onto it or otherwise cause a realloc , and it looks like the drain drop implementation will clean up its part if we hit a panic.

Still, a compile-fail test for the former and a run-pass test for the latter seem like good exercises

@bluss
Copy link
Member

bluss commented Mar 9, 2015

@gankro, you don't need HKT, trait Drain<'a, Input> { type Drain; fn drain(&'a self, Input) -> Self::Drain } works today. Bound is T: for<'a> Drain<'a, Range<K>>. Note that implementor decides if the Drain type needs to use the lifetime or not though..

I also think it is silly to implement Drain for other specifiers than ranges. Skip single indices and leave that to other plain remove methods. It's not good API design to combine different functionality. Of course constructing a range that spans just a single key will often be possible.

@mahkoh
Copy link
Contributor Author

mahkoh commented Mar 9, 2015

I also think it is silly to implement Drain for other specifiers than ranges. Skip single indices and leave that to other plain remove methods. It's not good API design to combine different functionality.

Should have voiced your concerns before the RFC was accepted.

@nikomatsakis
Copy link
Contributor

It's not clear to me how much this design discussion concerns this PR vs the overall end point of the design. Seems like a thread on internals would be a better venue for discussing the latter.

@mahkoh mahkoh closed this Mar 10, 2015
bors added a commit that referenced this pull request Apr 28, 2015
Implement Vec::drain(\<range type\>) from rust-lang/rfcs#574, tracking issue #23055.

This is a big step forward for vector usability. This is an introduction of an API for removing a range of *m* consecutive elements from a vector, as efficently as possible.

New features:

- Introduce trait `std::collections::range::RangeArgument` implemented by all four built-in range types.
- Change `Vec::drain()` to use `Vec::drain<R: RangeArgument>(R)`

Implementation notes:

- Use @gankro's idea for memory safety: Use `set_len` on the source vector when creating the iterator, to make sure that the part of the vector that will be modified is unreachable. Fix up things in Drain's destructor — but even if it doesn't run, we don't expose any moved-out-from slots of the vector.
- This `.drain<R>(R)` very close to how it is specified in the RFC.
- Introduced as unstable
- Drain reuses the slice iterator — copying and pasting the same iterator pointer arithmetic again felt very bad
- The `usize` index as a range argument in the RFC is not included. The ranges trait would have to change to accomodate it.

Please help me with:

- Name and location of the new ranges trait.
- Design of the ranges trait
- Understanding Niko's comments about variance (Note: for a long time I was using a straight up &mut Vec in the iterator, but I changed this to permit reusing the slice iterator).

Previous PR and discussion: #23071
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tracking issue for Replace Vec::drain by a method that accepts a range parameter (RFC 574)
9 participants