
Annotate a few types to improve type inference and thus allocation #127

Merged

Conversation

@abelsiqueira (Member Author)

@amontoison, can you check that it helps Krylov#151 with the allocation tests? I tried locally and it says that usymqr fails, but I checked and it appears to me that there is an unaccounted-for wₖ₋₂.
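For reference, an allocation check of the kind those tests perform looks roughly like this (a minimal sketch, not the actual Krylov.jl test code; the operator and sizes are made up):

using LinearOperators

A  = rand(10, 10)
v  = rand(10)
op = LinearOperator(A)

op * v                      # warm up so compilation is not measured
nbytes = @allocated op * v  # bytes allocated by the product itself
println(nbytes)             # the tests compare this to a fixed budget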

@coveralls

coveralls commented Nov 17, 2019


Coverage increased (+0.006%) to 96.308% when pulling 37aa757 on abelsiqueira:annotate-types into 49398fe on JuliaSmoothOptimizers:master.

@codecov

codecov bot commented Nov 17, 2019

Codecov Report

Merging #127 into master will increase coverage by <.01%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master    #127      +/-   ##
=========================================
+ Coverage    96.3%   96.3%   +<.01%     
=========================================
  Files           8       8              
  Lines         622     623       +1     
=========================================
+ Hits          599     600       +1     
  Misses         23      23
Impacted Files            Coverage Δ
src/LinearOperators.jl    94.4% <100%> (-0.05%) ⬇️
src/adjtrans.jl           98.71% <100%> (+0.05%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 49398fe...37aa757.

@amontoison (Member)

@abelsiqueira I ran some tests tonight with A.tprod and v0.6 of LinearOperators:

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 661840
CGS: 1313552
CRAIGMR: 1072256
CGNE: 659120
CRAIG: 926192
LSLQ: 681712
CGLS: 546704
LSQR: 678928
CRLS: 923680
LSMR: 810064
USYMQR: 1265888
BILQ: 1577664
MINRES-QLP: 1315728
QMR: 1840032
USYMLQ: 1367840

With A' used instead of A.tprod and v0.7 of LinearOperators:

SYMMLQ: 1133712
CG: 795760
MINRES: 1420432
DIOM: 5798080
CG_LANCZOS: 1081248
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5535632
CR: 1099296
CRMR: 679616
CGS: 1327952
CRAIGMR: 1234576
CGNE: 670000
CRAIG: 937440
LSLQ: 1002048
CGLS: 601792
LSQR: 1029456
CRLS: 939600
LSMR: 1265872
USYMQR: 2183536
BILQ: 1707264
MINRES-QLP: 1438912
QMR: 1927232
USYMLQ: 2737984

With A' used instead of A.tprod and the modifications of this pull request:

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 666704
CGS: 1313552
CRAIGMR: 1090368
CGNE: 662400
CRAIG: 937440
LSLQ: 694256
CGLS: 561744
LSQR: 696800
CRLS: 927152
LSMR: 827888
USYMQR: 1323264
BILQ: 1581456
MINRES-QLP: 1315728
QMR: 1844112
USYMLQ: 1437552

This pull request restores the original allocation counts for methods that require only A-products! It also greatly reduces allocations for methods that require both A-products and Aᵀ-products. But something else is still allocating with adjoint products...
Here is the script that produces these results.

test_alloc.zip
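For readers without the archive, a hypothetical skeleton of such a script (the solver list, sizes, and constructor call are placeholders, not the actual file):

using LinearAlgebra, Krylov, LinearOperators

M = rand(100, 100)
A = PreallocatedLinearOperator(M' * M + I)  # SPD, so every solver below applies
b = rand(100)

for (name, solver) in [("CG", cg), ("MINRES", minres), ("LSQR", lsqr)]
  solver(A, b)                                  # warm up: compile first
  println(name, ": ", @allocated solver(A, b))  # measure the second call
end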

@abelsiqueira (Member Author)

Calling A' itself allocates 16 bytes to create the adjoint wrapper. I've tested:

op = LinearOperator(rand(5, 5))
v = rand(5)
@allocated op * v  # second time: 128
@allocated op' * v # second time: 144
opt = op'
@allocated opt * v # second time: 128
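Those extra 16 bytes are just the adjoint wrapper object being constructed on every op'. An illustration with a toy wrapper (not the library's type):

struct AdjWrapper
  parent :: Any   # abstract field: the wrapper is heap-allocated
end
adj(op) = AdjWrapper(op)

adj(1.0)                      # warm up
println(@allocated adj(1.0))  # 16 bytes: the wrapper object itself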

@dpo (Member)

dpo commented Nov 17, 2019

What does @code_warntype say when you run crmr with the updated operator annotations?

@abelsiqueira (Member Author)

When I call @code_warntype with the Krylov methods, I don't get a full description:

julia> @code_warntype crmr(Au, b)
Variables
  #self#::Core.Compiler.Const(Krylov.crmr, false)
  A::PreallocatedLinearOperator{Float64}
  b::Array{Float64,1}

Body::Tuple{Array{Float64,1},Krylov.SimpleStats{Float64}}
1 ─ %1 = Krylov.opEye()::Core.Compiler.Const(Identity operator, false)
│   %2 = Krylov.zero($(Expr(:static_parameter, 1)))::Core.Compiler.Const(0.0, false)
│   %3 = Krylov.eps($(Expr(:static_parameter, 1)))::Core.Compiler.Const(2.220446049250313e-16, false)
│   %4 = √%3::Float64
│   %5 = Krylov.eps($(Expr(:static_parameter, 1)))::Core.Compiler.Const(2.220446049250313e-16, false)
│   %6 = √%5::Float64
│   %7 = Krylov.:(#crmr#33)(%1, %2, %4, %6, 0, false, #self#, A, b)::Tuple{Array{Float64,1},Krylov.SimpleStats{Float64}}
└──      return %7

The output above is from my modification with A' instead of A.tprod, together with this PR. However, it is the same as with v0.7.0 and A.tprod, so not very informative.

@abelsiqueira (Member Author)

The length(v) == size(p, 1) check inside *(Adjoint, v) allocates a lot. @allocated crmr(Au, b) returns 733920 bytes with the check and 716064 without. In v0.6.0 it was 698208.
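A self-contained illustration of why such a check allocates when the field it touches is abstractly typed (toy structs, not LinearOperators types):

struct Loose
  p :: Any               # abstractly typed, like op.parent::AbstractLinearOperator{T}
end
struct Tight
  p :: Vector{Float64}
end

check(w, v) = length(v) == size(w.p, 1)

v = rand(5); a = Loose(rand(5)); c = Tight(rand(5))
check(a, v); check(c, v)          # warm up
println(@allocated check(a, v))   # small but nonzero: dynamic dispatch boxes size's result
println(@allocated check(c, v))   # 0: fully inferred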

@dpo (Member)

dpo commented Nov 17, 2019

Is it size(p, 1) that allocates?

@abelsiqueira (Member Author)

Both allocate 16 bytes.

@amontoison (Member)

In all Krylov methods I compute Aᵀ = A' at the beginning, so those 16 bytes are only allocated once.
I reuse this function defined in variants.jl:

using LinearAlgebra   # mul!, transpose
using LinearOperators

function preallocated_LinearOperator(A :: AbstractMatrix, T)
  (n, m) = size(A)
  Ap  = zeros(T, n)   # storage reused by every A-product
  Atq = zeros(T, m)   # storage reused by every Aᵀ- and Aᴴ-product
  return LinearOperator(T, n, m, false, false, p -> mul!(Ap, A, p),
                        q -> mul!(Atq, transpose(A), q), q -> mul!(Atq, transpose(A), q))
end

A = rand(100, 100)
x  = rand(100)
op = preallocated_LinearOperator(A, Float64)
opt = op'
@code_warntype op * x
Variables
  #self#::Core.Compiler.Const(*, false)
  op::LinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  v::Array{Float64,1}

Body::Array{Float64,1}
1 ─ %1 = LinearOperators.size(v, 1)::Int64
│   %2 = LinearOperators.size(op, 2)::Int64
│   %3 = (%1 == %2)::Bool
└──      goto #3 if not %3
2 ─      goto #4
3 ─ %6 = LinearOperators.LinearOperatorException("shape mismatch")::LinearOperatorException
└──      LinearOperators.throw(%6)
4 ─ %8 = Base.getproperty(op, :prod)::var"#3#6"{Array{Float64,2},Array{Float64,1}}
│   %9 = (%8)(v)::Array{Float64,1}
└──      return %9

Nothing is flagged in red for op * x, but that's not the case for opt * x:

@code_warntype opt * x
Variables
  #self#::Core.Compiler.Const(*, false)
  op::AdjointLinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  v::Array{Float64,1}
  p::AbstractLinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  tprod::Any

Body::Any
1 ──       Core.NewvarNode(:(tprod))
│          (p = Base.getproperty(op, :parent))
│    %3  = LinearOperators.ishermitian(p)::Any
└───       goto #4 if not %3
2 ── %5  = (p * v)::Any
└───       return %5
3 ──       Core.Compiler.Const(:(goto %9), false)
4 ┄─       false
│    %9  = Base.getproperty(p, :ctprod)::Any
│    %10 = (%9 !== LinearOperators.nothing)::Bool
└───       goto #7 if not %10
5 ── %12 = Base.getproperty(p, :ctprod)::Any
│    %13 = (%12)(v)::Any
└───       return %13
6 ──       Core.Compiler.Const(:(goto %17), false)
7 ┄─       false
│          (tprod = Base.getproperty(p, :tprod))
│    %18 = Base.getproperty(p, :tprod)::Any
│    %19 = (%18 === LinearOperators.nothing)::Bool
└───       goto #11 if not %19
8 ── %21 = LinearOperators.issymmetric(p)::Any
└───       goto #10 if not %21
9 ──       (tprod = Base.getproperty(p, :prod))
└───       goto #11
10 ─ %25 = LinearOperators.LinearOperatorException("unable to infer conjugate transpose operator")::LinearOperatorException
└───       LinearOperators.throw(%25)
11 ┄ %27 = Base.broadcasted(LinearOperators.conj, v)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(conj),Tuple{Array{Float64,1}}}
│    %28 = Base.materialize(%27)::Array{Float64,1}
│    %29 = (tprod)(%28)::Any
│    %30 = Base.broadcasted(LinearOperators.conj, %29)::Base.Broadcast.Broadcasted{_A,Nothing,typeof(conj),_B} where _B<:Tuple where _A<:Union{Nothing, Base.Broadcast.BroadcastStyle}
│    %31 = Base.materialize(%30)::Any
└───       return %31

@abelsiqueira (Member Author)

My @code_warntype output doesn't look like yours. The return type, in particular, is not Any; this PR enforces it with a type assertion.

julia> @code_warntype Atu * b
Variables
  #self#::Core.Compiler.Const(*, false)
  op::AdjointLinearOperator{Float64}
  v::Array{Float64,1}
  p::AbstractLinearOperator{Float64}
  increment_tprod::Bool
  tprod::Any

Body::Array{Float64,1}
1 ──       Core.NewvarNode(:(increment_tprod))
│          Core.NewvarNode(:(tprod))
│    %3  = Base.getproperty(op, :parent)::AbstractLinearOperator{Float64}
│    %4  = Core.apply_type(LinearOperators.AbstractLinearOperator, $(Expr(:static_parameter, 1)))::Core.Compiler.Const(AbstractLinearOperator{Float64}, false)
│          (p = Core.typeassert(%3, %4))
│    %6  = LinearOperators.ishermitian(p)::Any
└───       goto #4 if not %6
2 ── %8  = (p * v)::Array{Float64,1}
└───       return %8
3 ──       Core.Compiler.Const(:(goto %12), false)
4 ┄─       false
│    %12 = Base.getproperty(p, :ctprod)::Any
│    %13 = (%12 !== LinearOperators.nothing)::Bool
└───       goto #6 if not %13
5 ──       LinearOperators.increase_nctprod(p)
│    %16 = Base.getproperty(p, :ctprod)::Any
│    %17 = (%16)(v)::Any
│    %18 = LinearOperators.promote_type($(Expr(:static_parameter, 1)), $(Expr(:static_parameter, 2)))::Core.Compiler.Const(Float64, false)
│    %19 = Core.apply_type(LinearOperators.Vector, %18)::Core.Compiler.Const(Array{Float64,1}, false)
│    %20 = Core.typeassert(%17, %19)::Array{Float64,1}
└───       return %20
6 ──       (tprod = Base.getproperty(p, :tprod))
│          (increment_tprod = true)
│    %24 = Base.getproperty(p, :tprod)::Any
│    %25 = (%24 === LinearOperators.nothing)::Bool
└───       goto #10 if not %25
7 ── %27 = LinearOperators.issymmetric(p)::Any
└───       goto #9 if not %27
8 ──       (increment_tprod = false)
│          (tprod = Base.getproperty(p, :prod))
└───       goto #10
9 ── %32 = LinearOperators.LinearOperatorException("unable to infer conjugate transpose operator")::LinearOperatorException
└───       LinearOperators.throw(%32)
10 ┄       goto #12 if not increment_tprod
11 ─       LinearOperators.increase_ntprod(p)
└───       goto #13
12 ─       LinearOperators.increase_nprod(p)
13 ┄ %38 = Base.broadcasted(LinearOperators.conj, v)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(conj),Tuple{Array{Float64,1}}}
│    %39 = Base.materialize(%38)::Array{Float64,1}
│    %40 = (tprod)(%39)::Any
│    %41 = Base.broadcasted(LinearOperators.conj, %40)::Base.Broadcast.Broadcasted{_A,Nothing,typeof(conj),_B} where _B where _A
│    %42 = Base.materialize(%41)::Any
│    %43 = LinearOperators.promote_type($(Expr(:static_parameter, 1)), $(Expr(:static_parameter, 2)))::Core.Compiler.Const(Float64, false)
│    %44 = Core.apply_type(LinearOperators.Vector, %43)::Core.Compiler.Const(Array{Float64,1}, false)
│    %45 = Core.typeassert(%42, %44)::Array{Float64,1}
└───       return %45
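For reference, a stand-alone sketch of the typeassert pattern visible at %20 and %45 above (toy types, not the PR's actual code):

struct Wrapper
  prod :: Any                                   # untyped field, like the operator's product
end

apply(w, v) = (w.prod(v)) :: Vector{Float64}    # the assertion pins the return type

w = Wrapper(x -> 2 .* x)
v = rand(5)
apply(w, v)
# @code_warntype apply(w, v) now reports Body::Array{Float64,1} instead of Any,
# even though the call through w.prod itself is still dynamic.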

@amontoison (Member)

Sorry, I get the same output now. I had downgraded LinearOperators to v0.6 and forgot to update it again afterwards.

@amontoison (Member)

Without the length(v) == size(p, 1) check inside *(Adjoint, v), the Krylov.jl tests pass again.

CRMR: 663472
CRAIGMR: 1078304
CGNE: 660224
CRAIG: 929952
LSLQ: 685904
CGLS: 551728
LSQR: 684896
CRLS: 924848
LSMR: 816016
USYMQR: 1285024
BILQ: 1577680
QMR: 1840048
USYMLQ: 1391088

@amontoison (Member)

amontoison commented Nov 18, 2019

I found the second culprit: increase_nctprod(p) is also allocating!
With p = op.parent::Union{LinearOperator{T}, PreallocatedLinearOperator{T}} used instead of p = op.parent::AbstractLinearOperator{T}, increase_nctprod(p) and length(v) == size(p, 1) no longer allocate, and Krylov's allocation tests return the previous good results.

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 661856
CGS: 1313552
CRAIGMR: 1072272
CGNE: 659136
CRAIG: 926208
LSLQ: 681728
CGLS: 546720
LSQR: 678944
CRLS: 923696
LSMR: 810080
USYMQR: 1265904
BILQ: 1577680
MINRES-QLP: 1315728
QMR: 1840048
USYMLQ: 1367856

@dpo (Member)

dpo commented Nov 18, 2019

That's quite promising. But could you show what you're annotating exactly?

src/adjtrans.jl Outdated
length(v) == size(op.parent, 1) || throw(LinearOperatorException("shape mismatch"))
p = op.parent
function *(op :: AdjointLinearOperator{T}, v :: AbstractVector{S}) where {T,S}
p = op.parent::AbstractLinearOperator{T}
Member

It seems that AbstractLinearOperator{T} is too generic for the compiler.
I just replaced it with:

p = op.parent::Union{LinearOperator{T}, PreallocatedLinearOperator{T}}

and increase_nctprod(p) and length(v) == size(p, 1) stop allocating 16 bytes at each call.

Member

That's too specific though. Maybe we need to parametrize AdjointLinearOperator with the type of the parent?
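Something along these lines, perhaps (a simplified stand-in, not the actual LinearOperators.jl definition):

abstract type MyOperator{T} end

struct MyOp{T} <: MyOperator{T}
  A :: Matrix{T}
end

struct MyAdjoint{T, S <: MyOperator{T}} <: MyOperator{T}
  parent :: S          # the parent's concrete type travels with the wrapper
end

Base.size(op :: MyOp) = size(op.A)
Base.size(op :: MyAdjoint) = reverse(size(op.parent))  # inferred without any annotation

opt = MyAdjoint(MyOp(rand(3, 4)))
size(opt)   # (4, 3)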

Member Author

I've included a new commit with the parametrized version.

Member

Thanks. Does it resolve the allocation issue?

Member Author

It looks like it does, but I'll wait for Alexis' tests.

Member

It works with my Krylov tests. 🤘 Thank you Abel! You can merge it and add a new tag.

Member

Before merging, would you mind checking that this doesn't recreate the hvcat issues from #97?
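(For context: #97 involved block constructions of operators roughly like the sketch below, which lops.jl presumably exercises.)

using LinearOperators

A = LinearOperator(rand(2, 2))
M = [A A; A A]    # hvcat of operators: a 4×4 block operator
M * rand(4)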

Member

N = 16

@time include("lops.jl")
  0.208034 seconds (621.17 k allocations: 31.138 MiB, 5.06% gc time)
  0.574005 seconds (1.23 M allocations: 64.762 MiB, 1.83% gc time)
5-element Array{Float64,1}:
 -10.027547809205767 
   6.387858545582767 
   4.945048065166462 
   9.668653838837873 
   3.2258245551624194

N = 25

@time include("lops.jl")
  0.212824 seconds (621.21 k allocations: 31.152 MiB, 4.82% gc time)
  0.561557 seconds (1.23 M allocations: 64.921 MiB, 1.83% gc time)
5-element Array{Float64,1}:
 10.628747752647548 
 -2.02582122181704  
 -4.198864112230334 
  3.0656881247622376
 10.586622803189744 

N = 50

@time include("lops.jl")
  0.221327 seconds (621.31 k allocations: 31.201 MiB, 4.93% gc time)
  0.592618 seconds (1.23 M allocations: 64.986 MiB, 1.84% gc time)
5-element Array{Float64,1}:
 -24.72968159374163  
 -17.87085267785233  
  21.676648685374687 
   8.68226076917717  
   2.8795559248799285

N = 100

@time include("lops.jl")
  0.213366 seconds (621.51 k allocations: 31.370 MiB, 4.85% gc time)
  0.586298 seconds (1.24 M allocations: 65.185 MiB, 1.76% gc time)
5-element Array{Float64,1}:
  -6.961226546652531 
  27.177113229951946 
  -3.5866280698379254
  23.01856931520228  
 -23.290794085006873 

Member

Thank you!

@abelsiqueira abelsiqueira merged commit 0155304 into JuliaSmoothOptimizers:master Nov 19, 2019
@abelsiqueira abelsiqueira deleted the annotate-types branch November 19, 2019 03:21