-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Vec::drain_filter #43245
Add Vec::drain_filter #43245
Conversation
r? @sfackler (rust_highfive has picked a reviewer for you, use r? to override) |
In case you're curious here's the other impl: https://gist.github.com/Gankro/ec6bf4f939b40ace16790447ed75152b It basically does a ton of work to try to use ptr::copy() on contiguous blocks of elements instead of swapping them individually. As a result it ends up having to handle boundary cases, and ZSTs all over the place. |
I think this is a good thing to include in the standard library. Even though out of tree implementation may be possible, I've found multiple times that |
I believe this is called |
Neither So the three fundamentally similar operations we have are drain, retain, and filter.
This gives us a few possibilities:
The most important point, to me, is that filter and retain have reversed "polarity" when applied to this problem:
This is precisely why I didn't like drain_by/drain_with; they left the predicate ambiguous on their own. Ultimately I felt that Using bare filter as a name didn't sit well with me; we don't add iterator-ish methods directly on collections. So ultimately drain_filter seemed like the clearest name. It also, hopefully, makes it easier to find. |
src/liballoc/vec.rs
Outdated
/// But `drain_filter` is easier to use. `drain_filter` is also more efficient, | ||
/// because it can backshift the elements of the array in bulk. | ||
/// | ||
/// Note that `drain_filter` also lets you mutate ever element in the filter closure, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every
.
This seems like a reasonable API to me. @rfcbot fcp merge |
Team member @sfackler has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
typo fixed |
Interestingly, |
In my experiments iteraterification had little to no performance impact, but I don't 100% trust my results. Note that drain_filter has a reversed predicate to retain_mut. |
/// let val = vec.remove(i); | ||
/// // your code here | ||
/// } | ||
/// i += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would skip every item after a removed item.
let some_predicate = |&x| x == 2;
let mut vec = vec![1, 2, 2, 3, 4, 5];
let mut i = 0;
while i != vec.len() {
if some_predicate(&mut vec[i]) {
let val = vec.remove(i);
println!("removed {}", val);
}
i += 1;
}
println!("{:?}", vec);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, haha! Need to put the +=1 in an else
Do we really need a whole new struct that implements Iterator for this? Would it not be easier to re-use most of the retain function, and call drain after we have done the moving of the elements?
The method above is very similar to the current retain method, and re-uses the existing Drain iterator instead of creating an entire new type. |
type Item = T; | ||
|
||
fn next(&mut self) -> Option<T> { | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would personally not make the unsafe
cover the entire block, just the raw_parts_mut
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also needed for the ptr::read. At that point might as well just say the whole thing is unsafe and call it a day (this matches many parts of this file).
@nrxus iterators are maximally flexible and composable, and should only be avoided when there's a performance or ergonomics reason. In this case I don't think there is. |
@gankro Drain is also an iterator though. What do we gain from a new type that implements iterator (FilterDrain) that outweights the benefit of reusing an existing working type that also implements iterator (Drain). |
Oh I see, I misread your original comment. I don't consider unifying DrainFilter and Drain to eliminate a struct to be intrinsically valuable to our users (and indeed, it would corner us into a very specific implementation). The maintenance burden is also fairly minimal for std. I am also concerned that this implementation has worse cache behaviour when most elements need to be removed, as it runs over the entire array an extra time to actually yield the elements. It also requires actually swapping the elements, as opposed to just memcopying one over the other. |
(I checked off @brson's box due to him being away) |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
One problem with all these closure-based design is that you cannot use |
The final comment period is now complete. |
Ping @sfackler? FCP completed without much concern besides #43245 (comment) (OP's reply at #43245 (comment)). |
@bors r+ |
📌 Commit 1af4226 has been approved by |
\o/ |
Add Vec::drain_filter This implements the API proposed in #43244. So I spent like half a day figuring out how to implement this in some awesome super-optimized unsafe way, which had me very confident this was worth putting into the stdlib. Then I looked at the impl for `retain`, and was like "oh dang". I compared the two and they basically ended up being the same speed. And the `retain` impl probably translates to DoubleEndedIter a lot more cleanly if we ever want that. So now I'm not totally confident this needs to go in the stdlib, but I've got two implementations and an amazingly robust test suite, so I figured I might as well toss it over the fence for discussion.
☀️ Test successful - status-appveyor, status-travis |
This implements the API proposed in #43244.
So I spent like half a day figuring out how to implement this in some awesome super-optimized unsafe way, which had me very confident this was worth putting into the stdlib.
Then I looked at the impl for
retain
, and was like "oh dang". I compared the two and they basically ended up being the same speed. And theretain
impl probably translates to DoubleEndedIter a lot more cleanly if we ever want that.So now I'm not totally confident this needs to go in the stdlib, but I've got two implementations and an amazingly robust test suite, so I figured I might as well toss it over the fence for discussion.