Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iterator: Add StepBy::size_hint method #24720

Merged
merged 1 commit into from
May 1, 2015
Merged

iterator: Add StepBy::size_hint method #24720

merged 1 commit into from
May 1, 2015

Conversation

critiqjo
Copy link

Iterator::size_hint can be easily implemented for StepBy.
#23708

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see CONTRIBUTING.md for more information.

@critiqjo
Copy link
Author

Sorry, there is this bug:

#![feature(step_by)]
fn main () {
    let mut range = (0..5).step_by(2);

    loop {
        println!("{:?}", range.size_hint());
        match range.next() {
            Some(x) => {
                println!("{}", x);
            },
            None => { break }
        }
    }
}

gives

(2, Some(2))
0
(1, Some(1))
2
(0, Some(0))
4
(0, Some(0))

But is it because Range is end-exclusive, or is it because Step::steps_between is incorrect?

@alexcrichton
Copy link
Member

@critiqjo yeah the range iterators are inclusive on the left but exclusive on the right. Could you add some tests for this change as well? (after fixing the underlying bug)

@critiqjo
Copy link
Author

I'm sorry about editing the original post to add new content (I assume you replied to the mail). So I'll post the edits here as new comment:

src/libcore/iter.rs (same file) lines 2424-2430:

             fn steps_between(start: &$t, end: &$t, by: &$t) -> Option<usize> {
                 if *start <= *end {
                     Some(((*end - *start) / *by) as usize)
                 } else {
                     Some(0)
                 }
             }

Shouldn't it be (*end - *start) / *by + 1 when there is a non-zero reminder?

The following code gives Some(2) but should be Some(3) right?

#![feature(step_trait)]
use std::iter::Step;

fn main () {
    let steps = Step::steps_between(&0, &5, &2);
    println!("{:?}", steps);
}

@critiqjo
Copy link
Author

If steps_between is excluding the starting step (0), then it makes sense... Number of steps in between start and end (excluding start and including end), then steps_between is indeed correct...

@critiqjo
Copy link
Author

I'll fix the bug... But I have never written tests. Is it correct to add my test to fn test_range_step() of this file: https://github.com/rust-lang/rust/blob/master/src/libcoretest/iter.rs#L776 ? Or should I create a new test function?

@alexcrichton
Copy link
Member

I'd have to do some more investigation to see if this is the right fix or not, but your rationale sounds reasonable! Also yeah that location is fine for some tests.

@critiqjo
Copy link
Author

Another suggestion would be to have a steps_until (number of additional steps to take until we reach or cross the end), and or a steps_before (number of steps before we reach the end) functions in Step trait, instead of steps_between function.
Edit: steps_until() = steps_before() + 1 always.

@alexcrichton
Copy link
Member

It looks like the steps_between function is used for size_hint in the range iterators, so it's likely that the fix should go into that function instead of the StepBy size hint function.

@critiqjo
Copy link
Author

"Fix" only if the definition of steps_between should be changed. Currently steps_between means "number of steps after start, until we reach as close to end (but without stepping over) as possible".

But steps_before (as I defined earler) is logically different. It may return the same result as steps_between or a different one, depending whether or not end can be stepped on. So defining 2 logically different functions would be good.

@critiqjo
Copy link
Author

Just to make myself more clear: in the definitions I stated (of steps_until, steps_before and steps_between), start is not counted as a step.
[steps_until, as I've defined, is "logically similar" to steps_before (always a 1-difference). So ignore its definition]

Suggestion: rename the current steps_between to steps_until, and add a new function steps_before. I think it would avoid confusions.

@alexcrichton
Copy link
Member

Regardless of what the name steps_between implies, it is directly used in the size_hint calculation for ops::Range, so it is intended to be used for size_hint and probably just needs to be updated to handle steps bigger than 1.

@critiqjo
Copy link
Author

Okay... So shall I change the steps_between to exclude end and include start in this pull request itself? (This can be directly used in size_hints then)
How does this look -- it also takes care of negative bys:

            #[inline]
            #[allow(trivial_numeric_casts)]
            #[allow(unused_comparisons)]
            fn steps_between(start: &$t, end: &$t, by: &$t) -> Option<usize> {
                if *by == 0 { return None; }
                let mut start = *start;
                let mut end = *end;
                let mut by = *by;
                if by < 0 {
                    start *= -1;
                    end *= -1;
                    by *= -1;
                }
                if start <= end {
                    let diff = end - start;
                    if diff % by > 0 {
                        Some((diff / by) as usize + 1)
                    } else {
                        Some((diff / by) as usize)
                    }
                } else {
                    Some(0)
                }
            }

The original panics if by is zero, and gives large values if by is negative (negative cast to usize). [original source]

@alexcrichton
Copy link
Member

It doesn't necessarily need to be super robust (as it's a somewhat private API), but I would recommend some exhaustive tests to accompany this change.

@critiqjo
Copy link
Author

I've commited the fix and added tests. Please check...

@critiqjo
Copy link
Author

I think the current implementation of the Step and StepBy is insufficient. All iterators need to have a method named step_by(self, by: usize) which returns (say) a SteppedIterator struct which, by default, simply calls next internally by-times whenever its next is called. (This can be used to implement an iterator over a vector which returns, say, every 4th element in it).
Additionally, implementors of Iterator should be able to make their own versions of step_by which returns an optimized version of SteppedIterator (which doesn't call next many times internally)... Is this planned?

start *= -1;
end *= -1;
by *= -1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this block just necessary for the % check below? It looks like otherwise the signs just end up handling themselves.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so that, for all valid (*start, *end, *by) triplet, start < end and all values would be positive... Not needed for unsigned types, should I split it?

@alexcrichton
Copy link
Member

As a meta-point, this may want to coordinate with #24865 as there's likely to be merge conflicts between the two.

I think the current implementation of the Step and StepBy is insufficient. All iterators need to have a method named step_by(self, by: usize)

This is currently not done because, as you mention, specialization would be needed to optimize the implementation for iterators like Range<i32>. This optimization, however, is not possible today. as the returned iterator must be a concrete iterator type which cannot change.

@bluss
Copy link
Member

bluss commented Apr 27, 2015

Maybe we can split so that Range uses a separate method (since it always uses step 1). It's quite important that we get the range iterator right and always computing the exact right size.

@critiqjo
Copy link
Author

Actually, this takes care of both... Step 1, negative or anything...

@bluss
Copy link
Member

bluss commented Apr 28, 2015

We need to take into account the bug I just fixed in #24865. Try a test case with (-128..127i8).step_by(1) to see the problem. A lot more care is needed with the new arithmetic :(

if by < 0 {
start *= -1;
end *= -1;
by *= -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that multiplying with -1 is a potential overflow (-128i8 * -1 => overflow)

@critiqjo
Copy link
Author

Oh! I see... But I don't think we should fix these separately... One fix should take care of unit steps, negative steps, and mutli-steps...

@bluss
Copy link
Member

bluss commented Apr 28, 2015

It's not my call but I'd like my bugfix to be picked into the 1.0 release because it's an embarrasing bug, and then it is better as a separate change. We can then add more features here.

@bluss
Copy link
Member

bluss commented Apr 28, 2015

This strategy almost works:

If by < 0, swap begin and end. Compute length as usual (using end.wrapping_sub(start) as usize). Divide by by.abs().

by.abs() fails if by == -MIN for a signed value, so that needs a further if case to cover that.

@critiqjo
Copy link
Author

I have fixed for most of those corner cases... How does it look?

@bluss
Copy link
Member

bluss commented Apr 28, 2015

I think it looks great. As long as we have lots of test for the corner cases, and those pass, we're fine. The $by == 1 case should optimize nicely and be efficient I think too.

@bors
Copy link
Contributor

bors commented Apr 29, 2015

☔ The latest upstream changes (presumably #24865) made this pull request unmergeable. Please resolve the merge conflicts.

@critiqjo
Copy link
Author

@alexcrichton if this PR seems to add unnecessary complexity to the code, since the StepBy itself might be replaced with a more generalized version in future, then please close this...

@alexcrichton
Copy link
Member

Thanks! This looks good to me, could you squash the commits together as well?

if *start <= *end {
// Note: We assume $t <= isize here
// Use .wrapping_sub and cast to usize to compute the
// difference that may not fit inside the range of isize.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you copy this comment and put it on the signed case code above?

@critiqjo
Copy link
Author

Done! Thank you, guys!

@alexcrichton
Copy link
Member

@bors: r+ 4863680

@bors
Copy link
Contributor

bors commented Apr 30, 2015

⌛ Testing commit 4863680 with merge be2f26b...

@bors
Copy link
Contributor

bors commented Apr 30, 2015

💔 Test failed - auto-mac-64-opt

Fixes `Step::steps_between` implementations by integer types
to correctly handle `by != 1`.
@critiqjo
Copy link
Author

critiqjo commented May 1, 2015

I'm extremely sorry about that... I didn't know about core vs. std (being my first contribution).
When I ran make check earlier, it showed me some unrelated errors (due to improper configuration of valgrind). So I ignored it. Ran it propertly this time and verified.
I've replaced all Step::steps_between tests with size_hint tests since Step trait is marked unstable... Everything should run fine now.

@alexcrichton
Copy link
Member

@bors: r+ 2a8fc9b

No worries! Thanks again!

@bors
Copy link
Contributor

bors commented May 1, 2015

⌛ Testing commit 2a8fc9b with merge 613109d...

bors added a commit that referenced this pull request May 1, 2015
`Iterator::size_hint` can be easily implemented for `StepBy`.
#23708
@bors bors merged commit 2a8fc9b into rust-lang:master May 1, 2015
@critiqjo critiqjo deleted the stepby-sizehint branch May 1, 2015 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants