add pure kwarg to map #71

Merged
merged 45 commits into from
Sep 1, 2021

Member

@bkamins bkamins commented Aug 13, 2021

Fixes #63

I was wondering whether it is possible to make the pure=false branch faster, but it seems hard in general, as we cannot be sure about the types of the values returned by f.

@bkamins bkamins marked this pull request as ready for review August 13, 2021 19:22
@bkamins bkamins requested review from nalimilan and quinnj August 13, 2021 19:22
@nalimilan
Member

I was wondering whether it is possible to make the pure=false branch faster, but it seems hard in general, as we cannot be sure about the types of the values returned by f.

That's indeed more work, but I think it's worth the increased complexity. Making a temporary copy doesn't sound acceptable to me, especially given that the temporary vector will likely take much more memory than the final PooledArray. We should just copy the collect implementation from Base, which starts allocating an array with an element type corresponding to the first value, and widens the eltype later if needed.
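The widening strategy described in this comment can be sketched in plain Julia as follows. This is only an illustration, not the code in Base or in this PR: `map_widening` and `fill_widening!` are hypothetical names, and the sketch assumes a 1-based `Vector` as input.

```julia
# Hypothetical helper: map `f` over `xs`, starting with the element type of the
# first result and widening the destination only when a later value requires it.
function map_widening(f, xs::Vector)
    isempty(xs) && return [f(x) for x in xs]   # nothing to inspect, defer to Base
    v1 = f(xs[1])
    dest = Vector{typeof(v1)}(undef, length(xs))
    dest[1] = v1
    return fill_widening!(dest, f, xs, 2)
end

function fill_widening!(dest::Vector{T}, f, xs::Vector, start::Int) where {T}
    for i in start:length(xs)
        v = f(xs[i])
        if v isa T
            dest[i] = v
        else
            # The new value does not fit: allocate a wider vector, copy what
            # has been computed so far, and continue with the wider type.
            T2 = Base.promote_typejoin(T, typeof(v))
            wider = Vector{T2}(undef, length(xs))
            copyto!(wider, 1, dest, 1, i - 1)
            wider[i] = v
            return fill_widening!(wider, f, xs, i + 1)
        end
    end
    return dest
end

r = map_widening(x -> x == 2 ? missing : x, [1, 2, 3])
```

No temporary copy of the full result with a loose element type is needed; a reallocation happens only at the points where the element type actually changes.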

bkamins and others added 2 commits August 13, 2021 23:38
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Member Author

bkamins commented Aug 13, 2021

which starts allocating an array with an element type corresponding to the first value, and widens the eltype later if needed.

Yes, this is also what we do in DataFrames.jl. I will do it then.

Member

@quinnj quinnj left a comment

This looks pretty good to me; I suggested a couple of test fixes.

@bkamins
Member Author

bkamins commented Aug 25, 2021

@nalimilan - so how do you think we should go about this PR?

Member

@nalimilan nalimilan left a comment

Sorry, I forgot about it. Looks mostly good now.

@quinnj Are you fine with the strategy we use to choose the reference type?

@@ -139,6 +140,8 @@ _widen(::Type{Int32}) = Int64
Freshly allocate `PooledArray` using the given array as a source where each
element will be referenced as an integer of the given type.

`PooledArray` constructor is not type stable.
Member

How about explaining why that's the case, making the link with the next paragraph? And isn't the constructor type stable when reftype is specified?

BTW, maybe we should mark the constructor as @inline to ensure that when the default value of keyword arguments is used, the return type is inferred?

Member Author

I have added inline. I have rewritten the docstring. Indeed when reftype is passed the constructor is type stable.
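The point about `reftype` can be illustrated without PooledArrays. In the sketch below, `smallest_reftype` and `make_refs` are hypothetical stand-ins: when the reference type is picked from a runtime value, the return type cannot be inferred as a single concrete type, whereas passing the type as an argument makes the call inferrable.

```julia
using Test

# Hypothetical helper: the chosen type depends on the runtime value of `n`.
smallest_reftype(n::Integer) =
    n <= typemax(UInt8) ? UInt8 : n <= typemax(UInt16) ? UInt16 : UInt32

# The return type depends on a value, so it is not inferred as one concrete type.
make_refs(n::Integer) = zeros(smallest_reftype(n), n)

# The return type is determined by the argument types alone.
make_refs(n::Integer, ::Type{R}) where {R<:Integer} = zeros(R, n)

@test_throws ErrorException @inferred make_refs(10)
@test (@inferred make_refs(10, UInt8)) isa Vector{UInt8}
```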

@@ -500,3 +500,56 @@ end
pa2 = repeat(pa1, inner = (2, 1))
@test pa2 == [1 2; 1 2; 3 4; 3 4]
end

@testset "map pure tests" begin
Member

Can you add a few @inferred checks where applicable?

Member Author

I am adding the @inferred tests elsewhere, because here nothing is inferred to a single concrete type: inference now produces a Union of two concrete types. See:

julia> x = PooledArray(fill(1, 200), signed=true, compress=true);

julia> @inferred map(Int, x)
ERROR: return type PooledVector{Int64, Int8, Vector{Int8}} does not match inferred return type Union{PooledVector{Int64, Int64, Vector{Int64}}, PooledVector{Int64, Int8, Vector{Int8}}}

julia> @inferred map(Int, x, pure=true)
ERROR: return type PooledVector{Int64, Int8, Vector{Int8}} does not match inferred return type Union{PooledVector{Int64, Int64, Vector{Int64}}, PooledVector{Int64, Int8, Vector{Int8}}}

Member Author

I added @inferred with Union for map tests
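For readers unfamiliar with the two-argument form of `@inferred`, here is a minimal sketch using only the `Test` standard library. `int_or_float` is a hypothetical function whose inferred return type is a small Union, standing in for `map` on a `PooledArray`.

```julia
using Test

# Hypothetical function: inference produces Union{Float64, Int64} for an Int argument.
int_or_float(a) = a > 1 ? 1 : 1.0

# Plain `@inferred` fails, since the concrete result type (Int64) differs
# from the inferred Union.
@test_throws ErrorException @inferred int_or_float(2)

# Passing an allowed type relaxes the check, so a result that is a subtype
# of the given Union is accepted.
@test (@inferred Union{Int, Float64} int_or_float(2)) === 1
```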

bkamins and others added 2 commits August 29, 2021 10:58
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
if lbl != zero(I)
labels[i] = lbl
else
if nlabels == typemax(I) || Ti !== T
Member

Should the Ti !== T check here be Ti <: T to account for Ti == Int64 and T == Union{Int64, Missing}?

Member Author

right

else
if nlabels == typemax(I) || Ti !== T
I2 = nlabels == typemax(I) ? Int : I
T2 = Ti isa T ? T : Base.promote_typejoin(T, Ti)
Member

I don't think Ti isa T is correct here; it should be Ti <: T or vi isa T, right?

Member Author

Right, I switched to vi isa T everywhere. Thank you for spotting this.
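The difference between the three checks discussed in this thread can be verified directly. The variable names mirror those in the diff, with values chosen to match the `Int64` and `Union{Int64, Missing}` case raised by the reviewer.

```julia
T = Union{Int, Missing}   # current element type of the result
Ti = Int                  # type of a newly computed value
vi = 1                    # the value itself

@assert !(Ti isa T)   # the *type* Int is not an instance of Union{Int, Missing}
@assert Ti <: T       # but it is a subtype of it
@assert vi isa T      # and the value 1 is an instance of it
@assert Ti !== T      # an identity check alone reports a mismatch here
```

So `Ti isa T` asks whether a type object is an instance of `T`, which is false here, while `Ti <: T` and `vi isa T` both ask whether the new value fits in the existing element type.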

typeof(similar(x.refs, Int, ntuple(i -> 0, ndims(x.refs))))}} where {R, N, RA}
pure && return _map_pure(f, x)
length(x) == 0 && return PooledArray([f(v) for v in x])
v1 = f(x[1])
Member Author

@quinnj and @nalimilan - just to double check. Are we sure that PooledArray always uses 1-based indexing and even if it is multidimensional it can use a linear index?

I recall @nalimilan recently giving some comment that potentially a non-standard refs field could be used. On the other hand eachindex(IndexLinear(), ::PooledArray) falls back to the default implementation:

eachindex(::IndexLinear, A::AbstractArray) = (@inline; oneto(length(A)))
eachindex(::IndexLinear, A::AbstractVector) = (@inline; axes1(A))

I will have a look into it later if you do not have an immediate answer.

Member

I think that in theory any AbstractArray type can be used as refs, but in practice we probably haven't checked that things work for non-1-based indexing, and the fallbacks you show assume 1-based indexing, right? So I'd say it's OK to assume 1 for now and fix that later if somebody complains.

Member Author

Ok - then I have added an appropriate check in the inner constructor.
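A check of this kind in an inner constructor might look like the sketch below. It is not the code added in this PR: `RefHolder` and `ZeroBased` are hypothetical types used only for illustration, and `Base.require_one_based_indexing` is an unexported Base helper.

```julia
# Hypothetical wrapper that only accepts 1-based arrays of references.
struct RefHolder{R<:AbstractArray{<:Integer}}
    refs::R
    function RefHolder(refs::R) where {R<:AbstractArray{<:Integer}}
        # Throws an ArgumentError if any axis of `refs` does not start at 1.
        Base.require_one_based_indexing(refs)
        return new{R}(refs)
    end
end

# Hypothetical minimal vector type whose indices start at 0,
# used here only to exercise the check.
struct ZeroBased{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(z::ZeroBased) = size(z.data)
Base.axes(z::ZeroBased) = (0:length(z.data)-1,)
Base.getindex(z::ZeroBased, i::Int) = z.data[i + 1]

ok = RefHolder([1, 2, 3])            # accepted: a plain Vector is 1-based
rejected = try
    RefHolder(ZeroBased([1, 2]))     # rejected: indices start at 0
    false
catch err
    err isa ArgumentError
end
```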

test/runtests.jl Outdated
Comment on lines 133 to 135
for signed in (true, false), compress in (true, false)
@test_throws ErrorException @inferred PooledArray([1, 2, 3])
end
Member

Checking this is a bit extreme: would it be a problem if inference managed to get this right? :-D

Member Author

The objective is different: the moment inference gets this right, we will know it.
OTOH, advanced users reading the tests will know that we have a problem here (BTW, I forgot to pass the kwargs in the call; I will fix this).

test/runtests.jl Outdated
x = PooledArray(fill(1, len), signed=true, compress=true);
VERSION >= v"1.6" && @inferred PooledVector{Int, Int, Vector{Int}} map(identity, x)
end
VERSION >= v"1.6" && include("map_inference.jl")
Member

Can't you avoid this by using @static?

Member Author

OK - I will try
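As a rough sketch of the `@static` suggestion: `@static if` evaluates its condition when the macro is expanded and keeps only the selected branch, so version-specific test code can stay in the same file instead of being moved to a separately included one. `pick` below is a hypothetical function, not part of PooledArrays.

```julia
using Test

# Hypothetical function with a small-Union inferred return type.
pick(a) = a > 1 ? 1 : 1.0

@static if VERSION >= v"1.6"
    # Only this branch is kept on Julia 1.6 and later.
    checked = @inferred Union{Int, Float64} pick(2)
else
    # Older versions skip the inference check entirely.
    checked = pick(2)
end
```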

@nalimilan
Member

FWIW I've made a PR to make map inferrable with union types in Base: JuliaLang/julia#42046

bkamins and others added 2 commits August 31, 2021 18:25
Co-authored-by: Milan Bouchet-Valat <nalimilan@club.fr>
@bkamins
Member Author

bkamins commented Aug 31, 2021

suggestions applied

@bkamins
Member Author

bkamins commented Sep 1, 2021

Some benchmarks before merging.

This PR:

julia> f(x) = x
f (generic function with 1 method)

julia> x = collect(1:10^7);

julia> @time map(f, x);
  0.064935 seconds (47.70 k allocations: 79.048 MiB, 6.07% gc time, 50.26% compilation time)

julia> @time map(f, x);
  0.037640 seconds (3 allocations: 76.294 MiB, 42.38% gc time)

julia> y = PooledArray(x);

julia> @time map(f, y);
  2.367527 seconds (2.12 M allocations: 709.811 MiB, 4.17% gc time, 26.34% compilation time)

julia> @time map(f, y);
  1.752006 seconds (114 allocations: 580.986 MiB, 2.15% gc time)

julia> @time map(f, y, pure=true);
  0.955048 seconds (15.25 k allocations: 658.140 MiB, 9.46% gc time, 7.42% compilation time)

julia> @time map(f, y, pure=true);
  0.759987 seconds (46 allocations: 657.178 MiB)

julia> g(x) = 1
g (generic function with 1 method)

julia> @time map(g, y);
  0.317237 seconds (257.28 k allocations: 54.041 MiB, 42.06% gc time, 30.06% compilation time)

julia> @time map(g, y);
  0.091077 seconds (24 allocations: 38.148 MiB)

julia> @time map(g, y, pure=true);
 15.982458 seconds (70.00 M allocations: 2.581 GiB, 2.08% gc time, 0.28% compilation time)

julia> @time map(g, y, pure=true);
 15.923589 seconds (70.00 M allocations: 2.581 GiB, 2.34% gc time)

and before this PR:

julia> f(x) = x
f (generic function with 1 method)

julia> y = PooledArray(1:10^7);

julia> @time map(f, y);
  1.341482 seconds (1.30 M allocations: 810.270 MiB, 7.02% gc time, 35.56% compilation time)

julia> @time map(f, y);
  0.762432 seconds (35 allocations: 733.471 MiB)

julia> g(x) = 1
g (generic function with 1 method)

julia> @time map(g, y);
 16.764567 seconds (70.11 M allocations: 2.588 GiB, 2.46% gc time, 0.38% compilation time)

julia> @time map(g, y);
 16.064969 seconds (70.00 M allocations: 2.581 GiB, 2.14% gc time)

so all looks relatively good.

@bkamins bkamins merged commit f87b540 into main Sep 1, 2021
@bkamins bkamins deleted the bkamins-patch-2 branch September 1, 2021 06:20
@bkamins
Member Author

bkamins commented Sep 1, 2021

Thank you! I will now make a release

@nalimilan
Member

Thanks. Why is pure=true so slow (before and after the PR) with g(x) = 1? Looks like we have a type instability somewhere? Of course that's for an extreme example with only unique values.

@bkamins
Member Author

bkamins commented Sep 1, 2021

I think the reason is:

refarray = map(x->translate[x], x.refs)

that most likely leads to boxing. I have not checked (doing leftjoin! PR today 😄). If you have time can you please have a look (if not I can look into it later)

@bkamins
Member Author

bkamins commented Sep 1, 2021

I have checked. It seems to be type stable, but just this loop:

i = 1
for (k, k1) in zip(ks, ks1)
    if haskey(newinvpool, k1)
        translate[x.invpool[k]] = newinvpool[k1]
    else
        newinvpool[k1] = i
        translate[x.invpool[k]] = i
        i += 1
    end
end

is expensive (@quinnj + @nalimilan => of course double checking would be welcome).

@nalimilan
Member

I confirm the function appears to be type-stable, but there are nonetheless lots of allocations:

        - function _map_pure(f, x::PooledArray)
 80000080     ks = collect(keys(x.invpool))
 40000080     vs = collect(values(x.invpool))
        0     ks1 = map(f, ks)
       16     uks = Set(ks1)
        0     if length(uks) < length(ks1)
        -         # this means some keys have repeated
        0         newinvpool = Dict{eltype(ks1), eltype(vs)}()
        0         translate = Dict{eltype(vs), eltype(vs)}()
        -         i = 1
799991968         for (k, k1) in zip(ks, ks1)
        0             if haskey(newinvpool, k1)
159983616                 translate[x.invpool[k]] = newinvpool[k1]
        -             else
        0                 newinvpool[k1] = i
       16                 translate[x.invpool[k]] = i
1119999952                 i+=1
        -             end
        -         end
        0         refarray = map(x->translate[x], x.refs)
        -     else
        0         newinvpool = Dict(zip(ks1, vs))
        0         refarray = copy(x.refs)
        -     end
       16     return PooledArray(RefArray(refarray), newinvpool)
        - end

I assume this is due to filling the translate dict. I don't understand why we don't simply use a vector rather than a dict. FWIW this code has been introduced by 02dbbdd (Cc: @shashi).

Also, we call collect(keys(x.invpool)) and collect(values(x.invpool)), but that doesn't seem necessary, right?

@bkamins
Member Author

bkamins commented Sep 5, 2021

Yes - I confirm it allocates a lot. I have not analyzed the code. But looking at it now:

  1. collect does not seem necessary.
  2. indeed vector should be enough
  3. instead of creating uks it should be a bit more efficient to use allunique

If @shashi will not be able to work on it, I can make a PR fixing the issues.
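The three points above could be combined roughly as in the following sketch. It is not the eventual fix in PooledArrays: `map_pool_pure` is a hypothetical function operating on a plain pool vector and a vector of 1-based references rather than on a `PooledArray`, and it only illustrates using `allunique` for the collision check and a `Vector` for the old-to-new reference translation.

```julia
# Hypothetical stand-in for a pooled array: `pool[r]` is the value for reference `r`.
function map_pool_pure(f, pool::AbstractVector, refs::AbstractVector{<:Integer})
    newvals = map(f, pool)            # call `f` once per pool entry
    if allunique(newvals)
        # No two pool entries collapsed to the same value: references stay valid.
        return newvals, copy(refs)
    end
    newpool = similar(newvals, 0)
    index = Dict{eltype(newvals), Int}()           # new value => new reference
    translate = Vector{Int}(undef, length(pool))   # old reference => new reference
    for (oldref, v) in enumerate(newvals)
        translate[oldref] = get!(index, v) do
            push!(newpool, v)
            length(newpool)
        end
    end
    return newpool, [translate[r] for r in refs]
end

pool = ["a", "b", "c"]
refs = [1, 2, 3, 1]
newpool, newrefs = map_pool_pure(x -> x == "c" ? "a" : x, pool, refs)
```

Because old references are consecutive integers starting at 1 in this sketch, a vector lookup replaces the dictionary lookup used for `translate` in the loop shown earlier.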

Successfully merging this pull request may close these issues.

Purity assumption of map
3 participants