-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Iterator could chain indetermined number of iterators now #6570
Iterator could chain indetermined number of iterators now #6570
Conversation
spec/std/iterator_spec.cr
Outdated
@@ -59,6 +59,55 @@ describe Iterator do | |||
iter.rewind | |||
iter.to_a.should eq([1, 2, 'a', 'b']) | |||
end | |||
describe "chain indeterminate number of iterators" do | |||
it "chains all together" do | |||
n = 10 + rand 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to have random size. Tests should be reproducible.
Anyway, it's probably better to just use a fixed test value instead of generating it on the fly. This leads to simpler code and more easily understandable spec.
Also, there doesn't need to be a lot of iterators and the iterator items can just be simple values.
arrays = [[0, 0], [1, 1], [2, 2]]
iterator = Iterator(Int32).chain arrays
3.times do |i|
iterator.next.should eq i
iterator.next.should eq i
end
Don't make test cases more complicated than they need to be. Keep it simple.
Ditto below.
spec/std/iterator_spec.cr
Outdated
} | ||
end | ||
it "chains empty" do | ||
arrs = [] of Array(Int32) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be [] of Iterator(Int32)
.
spec/std/iterator_spec.cr
Outdated
it "chains array of empty" do | ||
n = 10 + rand 10 | ||
arrs = [] of Array(Int32) | ||
n.times { |a| arrs << (a % 3 == 0 ? [] of Int32 : [a, a*a, a + 3]) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. Using a literal value is much easier to understand than this generator.
src/iterator.cr
Outdated
@@ -200,6 +200,60 @@ module Iterator(T) | |||
end | |||
end | |||
|
|||
# Like `#chain` but could chains multiple iterators. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entire doc comment needs linguistic improvements. Feel free to ask for help if you have trouble with the language.
Did you consider merging |
The root of the problem is simply that the iterator types try really hard to keep the iterator types as specific as possible, to enable LLVM to do inlining and other performance optimizations. This makes it really easy for iterator types to grow out of control (actually recursive) when iterator chains are constructed in a loop. This problem isn't specific to |
As @RX14 said, although they both implement the same function, the performance is different. Theoretically, Chain would be faster than ChainAll. However, in the original implementation, if one tries to concat And more, if de-specialized the chain type, recursive calling could occur. In case that the number of iterators to concat is indeterminate, a loop is necessary to implement the function. But actually, de-specialized If all iterators have the same type, If we really care performance, |
Yes, i'm fine with this implementation, I'm still very concerned that this problem isn't unique to |
Finally, I remain `#chain` in the same, for the purpose of performance, but added some comments to instruct this feature. Maybe I had tried my best to improve linguistics, but I'm sorry I'm not good at foreign languages. If it's still not good enough, helps is really really appreciated, thanks.
Finally, I remain |
src/iterator.cr
Outdated
@@ -160,6 +160,11 @@ module Iterator(T) | |||
|
|||
# Returns an iterator that returns elements from the original iterator until | |||
# it is exhausted and then returns the elements of the second iterator. | |||
# Compared to `.chain`, it has better performance when the quantity of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compared to
#chain(Iterator(Iter))
src/iterator.cr
Outdated
@@ -160,6 +160,11 @@ module Iterator(T) | |||
|
|||
# Returns an iterator that returns elements from the original iterator until | |||
# it is exhausted and then returns the elements of the second iterator. | |||
# Compared to `.chain`, it has better performance when the quantity of | |||
# iterators to chain is small (usually less than 4). | |||
# One could also face a problem trying to chain several iterators in loop(like `Enumerable#reduce`), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method also cannot chain iterators in a loop, for that see
#chain(Iterator(Iter))
.
@@ -200,6 +205,58 @@ module Iterator(T) | |||
end | |||
end | |||
|
|||
# The same as `#chain`, but have better performance when the quantity of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but has better performance
src/iterator.cr
Outdated
@@ -200,6 +205,58 @@ module Iterator(T) | |||
end | |||
end | |||
|
|||
# The same as `#chain`, but have better performance when the quantity of | |||
# iterators to chain is large(usually greater than 4) or undetermined. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
large (usually
src/iterator.cr
Outdated
# iterators to chain is large(usually greater than 4) or undetermined. | ||
# It uses lesser type specification for more freedom to avoid the failure (at a cost of performance), | ||
# and a loop rather than recursive calling to achieve better performance for numbers of iterators | ||
# (but when the quantity is small, it's not really efficiency). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't tend to discuss implementation details in docs, only user-visible effects, like performance and ability to chain in a loop.
spec/std/iterator_spec.cr
Outdated
@@ -59,6 +59,30 @@ describe Iterator do | |||
iter.rewind | |||
iter.to_a.should eq([1, 2, 'a', 'b']) | |||
end | |||
describe "chain indeterminate number of iterators" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing extra line above this line.
spec/std/iterator_spec.cr
Outdated
iter = Iterator(Int32).chain iters | ||
7.times { |i| iter.next.should eq i } | ||
end | ||
it "chains empty" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing extra line above this line.
spec/std/iterator_spec.cr
Outdated
iter = Iterator(Int32).chain arrs.map(&.each) | ||
iter.next.should be_a Iterator::Stop | ||
end | ||
it "chains array of empty" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing extra line above this line.
spec/std/iterator_spec.cr
Outdated
iter = Iterator(Int32).chain iters | ||
7.times { |i| iter.next.should eq i } | ||
end | ||
it "rewinds" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing extra line above this line.
Does this work when chaining iterators of different types? For example chaining all of |
Yes, it could and I pushed new commit with the spec. |
Please change the implementation of def self.chain(iters : Iterator(Iter)) forall Iter
ChainsAll(Iter, typeof(iters.first.first)).new iters
end That way you can do: Iterator.chain(iterators) instead of: Iterator(YouHaveToWriteTheTypeHereWhenTheCompilerCanFigureThisOut).chain(iterators) And please add that newline I requested :-) Thank you! |
Yeah, I realized the problem but didn't find the solution. This way is brilliant and I updated the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can merge this and if we later find that the performance of merging Chain
with ChainAll
is mostly the same we can unify their implementation while keeping the API interface the same.
…ng#6570) * Iterator could chain indetermined number of iterators now Fixes crystal-lang#6532 * Formatted the code * Simplified the spec & Improved comments. Finally, I remain `#chain` in the same, for the purpose of performance, but added some comments to instruct this feature. Maybe I had tried my best to improve linguistics, but I'm sorry I'm not good at foreign languages. If it's still not good enough, helps is really really appreciated, thanks. * Improved comment linguistic, agian... * Iterator could chain iterators of different type * Type of ChainAll could be inferred by compiler
Fixes #6532