Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify the difference between collect and bracket notation for generators #50619

Merged
merged 5 commits into from
Feb 14, 2024

Conversation

pierre-haessig
Copy link
Contributor

As a follow-up to my question on the forum Unclear documentation of collect? (similarity and difference with brackets), here is a tentative expansion of the docstring of Base.collect to clarify both the similarities and differences with the bracket notation for generators.

I'm pretty sure it needs so edits, but here are the motivating factors behind the changes, by decreasing order of importance:

  1. avoid the presence of bracket notation in the example without explanation (in the present docstring, I find it highly implicit)
  2. simplification of examples to focus on the topic (and not on range syntax or filtering in comprehensions)
  3. avoid wasting vertical space by using longer than needed iterators/generators

Feedback on my formulation is welcome!

@pierre-haessig
Copy link
Contributor Author

pierre-haessig commented Jul 21, 2023

CI testing reports an error in abstractarray test. Is this related to my change in the docstring?

Error During Test at /cache/build/default-amdci5-3/julialang/julia-master/julia-7c4eb41535/share/julia/test/abstractarray.jl:762
  Test threw exception
  Expression: #= /cache/build/default-amdci5-3/julialang/julia-master/julia-7c4eb41535/share/julia/test/abstractarray.jl:762 =# @inferred(cat3v(As)) == zeros(2, 2, 2)
  return type Array{Float64, 3} does not match inferred return type Array{_A, 3} where _A
  Stacktrace:
   [1] error(s::String)
     @ Base ./error.jl:35
   [2] macro expansion
     @ Main.Test91Main_abstractarray /cache/build/default-amdci5-3/julialang/julia-master/julia-7c4eb41535/share/julia/stdlib/v1.11/Test/src/Test.jl:669 [inlined]
   [3] test_cat(::Type{Main.Test91Main_abstractarray.TestAbstractArray})
     @ Main.Test91Main_abstractarray /cache/build/default-amdci5-3/julialang/julia-master/julia-7c4eb41535/share/julia/test/abstractarray.jl:762

@jishnub
Copy link
Contributor

jishnub commented Jul 26, 2023

That's an unrelated issue identified in #50550

@pierre-haessig
Copy link
Contributor Author

That's an unrelated issue identified in #50550

OK thanks for the reference. Anyway in between there are not more CI errors. I guess it got fixed in the mean time.

@brenhinkeller brenhinkeller added the docs This change adds or pertains to documentation label Aug 6, 2023
@pierre-haessig
Copy link
Contributor Author

Hello,
To ease reviewing, here is a before/after of the docstring:

Before

(as visible at https://docs.julialang.org/en/v1/base/collections/#Base.collect-Tuple{Any})

collect(collection)

Return an Array of all items in a collection or iterator. For dictionaries, returns
Vector{Pair{KeyType, ValType}}. If the argument is array-like or is an iterator with the
[HasShape](@ref IteratorSize) trait, the result will have the same shape
and number of dimensions as the argument.

Used by comprehensions to turn a generator into an Array.

Examples

julia> collect(1:2:13)
7-element Vector{Int64}:
  1
  3
  5
  7
  9
 11
 13

julia> [x^2 for x in 1:8 if isodd(x)]
4-element Vector{Int64}:
  1
  9
 25
 49

After:

(first paragraph unchanged)

collect(collection)

Return an Array of all items in a collection or iterator. For dictionaries, returns
Vector{Pair{KeyType, ValType}}. If the argument is array-like or is an iterator with the
[HasShape](@ref IteratorSize) trait, the result will have the same shape
and number of dimensions as the argument.

Used by comprehensions to turn a generator into an Array. Thus the bracket notation
of comprehensions can be used instead calling collect on a generator (see 2nd example).
However in the general case of collections, collect is not interchangeable
with the bracket notation (see 1st example).

Examples

julia> collect(1:3) # completely different from [1:3]
3-element Vector{Int64}:
 1
 2
 3

julia> collect(x^2 for x in 1:3) # same output as [x^2 for x in 1:3]
3-element Vector{Int64}:
 1
 4
 9

@pierre-haessig
Copy link
Contributor Author

Any update @jishnub or @brenhinkeller ?

@jishnub
Copy link
Contributor

jishnub commented Sep 30, 2023

I think the warning text should go elsewhere in the documentation, and not in the docstring of collect. The main difference here seems to be in the interpretation of [1:3] vs [j for j in 1:3], which isn't related to collect.

@pierre-haessig
Copy link
Contributor Author

Thanks @jishnub for the feedback. Here is an updated docstring:


Return an Array of all items in a collection or iterator. For dictionaries, returns
a Vector of key=>value [pairs](@ref Pair). If the argument is array-like or is an iterator
with the [HasShape](@ref IteratorSize) trait, the result will have the same shape
and number of dimensions as the argument.

Used by [comprehensions](@ref man-comprehensions) to turn a [generator expression](@ref man-generators)
into an Array. Thus, on generators, the bracket notation can be used instead of calling collect,
see 2nd example.

Examples

Collect items from a UnitRange{Int64} collection:

julia> collect(1:3)
3-element Vector{Int64}:
 1
 2
 3

Collect items from a generator (same output as [x^2 for x in 1:3]):

julia> collect(x^2 for x in 1:3)
3-element Vector{Int64}:
 1
 4
 9

Summary of the changes

compared to my first proposition (Sept 2023):

  • add more links to the manual: generators, comprehensions. Perhaps the link for Pairs is superfluous.
  • expand the description of the case of dictionaries "in plain text" for better readability (in my opinion) compared to the raw type (Vector{Pair{...}})
  • remove the warning about not using brackets on collections. I just kept the warning that brackets should be used only on generators

Also notice that I've interleaved the examples with a short explanatory text (instead of comments in the REPL code). I find it more readable, but this seems uncommon (compared to other docstrings). Is it nevertheless acceptable?

Copy link
Member

@fingolfin fingolfin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a nice improvement to me. Perhaps it could be "even better" but that shouldn't stop us from taking this already?

Or are there specific concerns with the changes I missed / am not aware of?

@jishnub
Copy link
Contributor

jishnub commented Feb 10, 2024

Other than minor comments, this looks good to me

Co-authored-by: Jishnu Bhattacharya <jishnub.github@gmail.com>
@fingolfin
Copy link
Member

@jishnub applied your suggestions and one other trivial fix

@fingolfin fingolfin added the merge me PR is reviewed. Merge when all tests are passing label Feb 13, 2024
@KristofferC KristofferC merged commit 4a13e9e into JuliaLang:master Feb 14, 2024
5 of 8 checks passed
@oscardssmith oscardssmith removed the merge me PR is reviewed. Merge when all tests are passing label Feb 14, 2024
@pierre-haessig
Copy link
Contributor Author

Thanks for your reviews and improvements and thanks for merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants