
Annotate a few types to improve type inference and thus allocation #127

Merged

Conversation

@abelsiqueira (Member Author)

@amontoison, can you check that it helps Krylov#151 with the allocation tests? I tried locally and it says that usymqr fails, but I checked and it appears to me that there is an unaccounted-for wₖ₋₂.
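For reference, an allocation check of the kind those tests perform looks roughly like this (a minimal sketch, not the actual Krylov.jl test code; the operator and sizes are made up):

using LinearOperators

A  = rand(10, 10)
v  = rand(10)
op = LinearOperator(A)

op * v                      # warm up so compilation is not measured
nbytes = @allocated op * v  # bytes allocated by the product itself
println(nbytes)             # the tests compare this to a fixed budget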

@coveralls

coveralls commented Nov 17, 2019


Coverage increased (+0.006%) to 96.308% when pulling 37aa757 on abelsiqueira:annotate-types into 49398fe on JuliaSmoothOptimizers:master.

@codecov

codecov bot commented Nov 17, 2019

Codecov Report

Merging #127 into master will increase coverage by <.01%.
The diff coverage is 100%.


@@            Coverage Diff            @@
##           master    #127      +/-   ##
=========================================
+ Coverage    96.3%   96.3%   +<.01%     
=========================================
  Files           8       8              
  Lines         622     623       +1     
=========================================
+ Hits          599     600       +1     
  Misses         23      23
Impacted Files            Coverage Δ
src/LinearOperators.jl    94.4% <100%> (-0.05%) ⬇️
src/adjtrans.jl           98.71% <100%> (+0.05%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 49398fe...37aa757.

@amontoison (Member)

@abelsiqueira I ran some tests tonight with A.tprod and v0.6 of LinearOperators:

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 661840
CGS: 1313552
CRAIGMR: 1072256
CGNE: 659120
CRAIG: 926192
LSLQ: 681712
CGLS: 546704
LSQR: 678928
CRLS: 923680
LSMR: 810064
USYMQR: 1265888
BILQ: 1577664
MINRES-QLP: 1315728
QMR: 1840032
USYMLQ: 1367840

With A' used instead of A.tprod and v0.7 of LinearOperators:

SYMMLQ: 1133712
CG: 795760
MINRES: 1420432
DIOM: 5798080
CG_LANCZOS: 1081248
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5535632
CR: 1099296
CRMR: 679616
CGS: 1327952
CRAIGMR: 1234576
CGNE: 670000
CRAIG: 937440
LSLQ: 1002048
CGLS: 601792
LSQR: 1029456
CRLS: 939600
LSMR: 1265872
USYMQR: 2183536
BILQ: 1707264
MINRES-QLP: 1438912
QMR: 1927232
USYMLQ: 2737984

With A' used instead of A.tprod and the modifications of this pull request:

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 666704
CGS: 1313552
CRAIGMR: 1090368
CGNE: 662400
CRAIG: 937440
LSLQ: 694256
CGLS: 561744
LSQR: 696800
CRLS: 927152
LSMR: 827888
USYMQR: 1323264
BILQ: 1581456
MINRES-QLP: 1315728
QMR: 1844112
USYMLQ: 1437552

This pull request restores the original allocation counts for methods that require only A-products! It also greatly reduces allocations for methods that require both A-products and Aᵀ-products. But something else is still allocating with adjoint products...
Here is the script that produces these results.

test_alloc.zip
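For readers without the archive, a hypothetical skeleton of such a script (the solver list, sizes, and constructor call are placeholders, not the actual file):

using LinearAlgebra, Krylov, LinearOperators

M = rand(100, 100)
A = PreallocatedLinearOperator(M' * M + I)  # SPD, so every solver below applies
b = rand(100)

for (name, solver) in [("CG", cg), ("MINRES", minres), ("LSQR", lsqr)]
  solver(A, b)                                  # warm up: compile first
  println(name, ": ", @allocated solver(A, b))  # measure the second call
end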

@abelsiqueira (Member Author)

Calling A' itself allocates 16 bytes to create the adjoint wrapper. I've tested:

op = LinearOperator(rand(5, 5))
v = rand(5)
@allocated op * v  # second time: 128
@allocated op' * v # second time: 144
opt = op'
@allocated opt * v # second time: 128
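Those extra 16 bytes are just the adjoint wrapper object being constructed on every op'. An illustration with a toy wrapper (not the library's type):

struct AdjWrapper
  parent :: Any   # abstract field: the wrapper is heap-allocated
end
adj(op) = AdjWrapper(op)

adj(1.0)                      # warm up
println(@allocated adj(1.0))  # 16 bytes: the wrapper object itself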

@dpo (Member)

dpo commented Nov 17, 2019

What does @code_warntype say when you run crmr with the updated operator annotations?

@abelsiqueira (Member Author)

When I call @code_warntype with the Krylov methods, I don't get a full description:

julia> @code_warntype crmr(Au, b)
Variables
  #self#::Core.Compiler.Const(Krylov.crmr, false)
  A::PreallocatedLinearOperator{Float64}
  b::Array{Float64,1}

Body::Tuple{Array{Float64,1},Krylov.SimpleStats{Float64}}
1 ─ %1 = Krylov.opEye()::Core.Compiler.Const(Identity operator, false)
│   %2 = Krylov.zero($(Expr(:static_parameter, 1)))::Core.Compiler.Const(0.0, false)
│   %3 = Krylov.eps($(Expr(:static_parameter, 1)))::Core.Compiler.Const(2.220446049250313e-16, false)
│   %4 = √%3::Float64
│   %5 = Krylov.eps($(Expr(:static_parameter, 1)))::Core.Compiler.Const(2.220446049250313e-16, false)
│   %6 = √%5::Float64
│   %7 = Krylov.:(#crmr#33)(%1, %2, %4, %6, 0, false, #self#, A, b)::Tuple{Array{Float64,1},Krylov.SimpleStats{Float64}}
└──      return %7

The output above is from my modification with A' instead of A.tprod, together with this PR. However, it is the same as with v0.7.0 and A.tprod, so not very informative.

@abelsiqueira (Member Author)

The length(v) == size(p, 1) check inside *(Adjoint, v) allocates a lot. @allocated crmr(Au, b) returns 733920 bytes with the check and 716064 without. In v0.6.0 it was 698208.
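A self-contained illustration of why such a check allocates when the field it touches is abstractly typed (toy structs, not LinearOperators types):

struct Loose
  p :: Any               # abstractly typed, like op.parent::AbstractLinearOperator{T}
end
struct Tight
  p :: Vector{Float64}
end

check(w, v) = length(v) == size(w.p, 1)

v = rand(5); a = Loose(rand(5)); c = Tight(rand(5))
check(a, v); check(c, v)          # warm up
println(@allocated check(a, v))   # small but nonzero: dynamic dispatch boxes size's result
println(@allocated check(c, v))   # 0: fully inferred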

@dpo (Member)

dpo commented Nov 17, 2019

Is it size(p, 1) that allocates?

@abelsiqueira (Member Author)

Both allocate 16 bytes.

@amontoison (Member)

In all Krylov methods I compute Aᵀ = A' at the beginning, so those 16 bytes are only allocated once.
I reuse this function defined in variants.jl:

using LinearAlgebra   # mul!, transpose
using LinearOperators

function preallocated_LinearOperator(A :: AbstractMatrix, T)
  (n, m) = size(A)
  Ap  = zeros(T, n)   # storage reused by every A-product
  Atq = zeros(T, m)   # storage reused by every Aᵀ- and Aᴴ-product
  return LinearOperator(T, n, m, false, false, p -> mul!(Ap, A, p),
                        q -> mul!(Atq, transpose(A), q), q -> mul!(Atq, transpose(A), q))
end

A = rand(100, 100)
x  = rand(100)
op = preallocated_LinearOperator(A, Float64)
opt = op'
@code_warntype op * x
Variables
  #self#::Core.Compiler.Const(*, false)
  op::LinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  v::Array{Float64,1}

Body::Array{Float64,1}
1 ─ %1 = LinearOperators.size(v, 1)::Int64
│   %2 = LinearOperators.size(op, 2)::Int64
│   %3 = (%1 == %2)::Bool
└──      goto #3 if not %3
2 ─      goto #4
3 ─ %6 = LinearOperators.LinearOperatorException("shape mismatch")::LinearOperatorException
└──      LinearOperators.throw(%6)
4 ─ %8 = Base.getproperty(op, :prod)::var"#3#6"{Array{Float64,2},Array{Float64,1}}
│   %9 = (%8)(v)::Array{Float64,1}
└──      return %9

Nothing is flagged in red for op * x, but that's not the case for opt * x:

@code_warntype opt * x
Variables
  #self#::Core.Compiler.Const(*, false)
  op::AdjointLinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  v::Array{Float64,1}
  p::AbstractLinearOperator{Float64,var"#3#6"{Array{Float64,2},Array{Float64,1}},var"#4#7"{Array{Float64,2},Array{Float64,1}},var"#5#8"{Array{Float64,2},Array{Float64,1}}}
  tprod::Any

Body::Any
1 ──       Core.NewvarNode(:(tprod))
│          (p = Base.getproperty(op, :parent))
│    %3  = LinearOperators.ishermitian(p)::Any
└───       goto #4 if not %3
2 ── %5  = (p * v)::Any
└───       return %5
3 ──       Core.Compiler.Const(:(goto %9), false)
4 ┄─       false
│    %9  = Base.getproperty(p, :ctprod)::Any
│    %10 = (%9 !== LinearOperators.nothing)::Bool
└───       goto #7 if not %10
5 ── %12 = Base.getproperty(p, :ctprod)::Any
│    %13 = (%12)(v)::Any
└───       return %13
6 ──       Core.Compiler.Const(:(goto %17), false)
7 ┄─       false
│          (tprod = Base.getproperty(p, :tprod))
│    %18 = Base.getproperty(p, :tprod)::Any
│    %19 = (%18 === LinearOperators.nothing)::Bool
└───       goto #11 if not %19
8 ── %21 = LinearOperators.issymmetric(p)::Any
└───       goto #10 if not %21
9 ──       (tprod = Base.getproperty(p, :prod))
└───       goto #11
10 ─ %25 = LinearOperators.LinearOperatorException("unable to infer conjugate transpose operator")::LinearOperatorException
└───       LinearOperators.throw(%25)
11 ┄ %27 = Base.broadcasted(LinearOperators.conj, v)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(conj),Tuple{Array{Float64,1}}}
│    %28 = Base.materialize(%27)::Array{Float64,1}
│    %29 = (tprod)(%28)::Any
│    %30 = Base.broadcasted(LinearOperators.conj, %29)::Base.Broadcast.Broadcasted{_A,Nothing,typeof(conj),_B} where _B<:Tuple where _A<:Union{Nothing, Base.Broadcast.BroadcastStyle}
│    %31 = Base.materialize(%30)::Any
└───       return %31

@abelsiqueira (Member Author)

My @code_warntype output doesn't look like yours. The return type, in particular, is not Any; this PR enforces it with a type assertion.

julia> @code_warntype Atu * b
Variables
  #self#::Core.Compiler.Const(*, false)
  op::AdjointLinearOperator{Float64}
  v::Array{Float64,1}
  p::AbstractLinearOperator{Float64}
  increment_tprod::Bool
  tprod::Any

Body::Array{Float64,1}
1 ──       Core.NewvarNode(:(increment_tprod))
│          Core.NewvarNode(:(tprod))
│    %3  = Base.getproperty(op, :parent)::AbstractLinearOperator{Float64}
│    %4  = Core.apply_type(LinearOperators.AbstractLinearOperator, $(Expr(:static_parameter, 1)))::Core.Compiler.Const(AbstractLinearOperator{Float64}, false)
│          (p = Core.typeassert(%3, %4))
│    %6  = LinearOperators.ishermitian(p)::Any
└───       goto #4 if not %6
2 ── %8  = (p * v)::Array{Float64,1}
└───       return %8
3 ──       Core.Compiler.Const(:(goto %12), false)
4 ┄─       false
│    %12 = Base.getproperty(p, :ctprod)::Any
│    %13 = (%12 !== LinearOperators.nothing)::Bool
└───       goto #6 if not %13
5 ──       LinearOperators.increase_nctprod(p)
│    %16 = Base.getproperty(p, :ctprod)::Any
│    %17 = (%16)(v)::Any
│    %18 = LinearOperators.promote_type($(Expr(:static_parameter, 1)), $(Expr(:static_parameter, 2)))::Core.Compiler.Const(Float64, false)
│    %19 = Core.apply_type(LinearOperators.Vector, %18)::Core.Compiler.Const(Array{Float64,1}, false)
│    %20 = Core.typeassert(%17, %19)::Array{Float64,1}
└───       return %20
6 ──       (tprod = Base.getproperty(p, :tprod))
│          (increment_tprod = true)
│    %24 = Base.getproperty(p, :tprod)::Any
│    %25 = (%24 === LinearOperators.nothing)::Bool
└───       goto #10 if not %25
7 ── %27 = LinearOperators.issymmetric(p)::Any
└───       goto #9 if not %27
8 ──       (increment_tprod = false)
│          (tprod = Base.getproperty(p, :prod))
└───       goto #10
9 ── %32 = LinearOperators.LinearOperatorException("unable to infer conjugate transpose operator")::LinearOperatorException
└───       LinearOperators.throw(%32)
10 ┄       goto #12 if not increment_tprod
11 ─       LinearOperators.increase_ntprod(p)
└───       goto #13
12 ─       LinearOperators.increase_nprod(p)
13 ┄ %38 = Base.broadcasted(LinearOperators.conj, v)::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(conj),Tuple{Array{Float64,1}}}
│    %39 = Base.materialize(%38)::Array{Float64,1}
│    %40 = (tprod)(%39)::Any
│    %41 = Base.broadcasted(LinearOperators.conj, %40)::Base.Broadcast.Broadcasted{_A,Nothing,typeof(conj),_B} where _B where _A
│    %42 = Base.materialize(%41)::Any
│    %43 = LinearOperators.promote_type($(Expr(:static_parameter, 1)), $(Expr(:static_parameter, 2)))::Core.Compiler.Const(Float64, false)
│    %44 = Core.apply_type(LinearOperators.Vector, %43)::Core.Compiler.Const(Array{Float64,1}, false)
│    %45 = Core.typeassert(%42, %44)::Array{Float64,1}
└───       return %45
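For reference, a stand-alone sketch of the typeassert pattern visible at %20 and %45 above (toy types, not the PR's actual code):

struct Wrapper
  prod :: Any                                   # untyped field, like the operator's product
end

apply(w, v) = (w.prod(v)) :: Vector{Float64}    # the assertion pins the return type

w = Wrapper(x -> 2 .* x)
v = rand(5)
apply(w, v)
# @code_warntype apply(w, v) now reports Body::Array{Float64,1} instead of Any,
# even though the call through w.prod itself is still dynamic.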

@amontoison (Member)

Sorry, I get the same output now. I had downgraded LinearOperators to v0.6 and forgot to update it again afterwards.

@amontoison (Member)

Without the length(v) == size(p, 1) check inside *(Adjoint, v), the Krylov.jl tests pass again.

CRMR: 663472
CRAIGMR: 1078304
CGNE: 660224
CRAIG: 929952
LSLQ: 685904
CGLS: 551728
LSQR: 684896
CRLS: 924848
LSMR: 816016
USYMQR: 1285024
BILQ: 1577680
QMR: 1840048
USYMLQ: 1391088

@amontoison (Member)

amontoison commented Nov 18, 2019

I found the second culprit: increase_nctprod(p) is also allocating!
With p = op.parent::Union{LinearOperator{T}, PreallocatedLinearOperator{T}} used instead of p = op.parent::AbstractLinearOperator{T}, increase_nctprod(p) and length(v) == size(p, 1) no longer allocate, and Krylov's allocation tests return the previous good results.

SYMMLQ: 1053952
CG: 789104
MINRES: 1315872
DIOM: 5772176
CG_LANCZOS: 1051264
CG_LANCZOS_SHIFT_SEQ: 3345184
DQGMRES: 5509984
CR: 1053536
CRMR: 661856
CGS: 1313552
CRAIGMR: 1072272
CGNE: 659136
CRAIG: 926208
LSLQ: 681728
CGLS: 546720
LSQR: 678944
CRLS: 923696
LSMR: 810080
USYMQR: 1265904
BILQ: 1577680
MINRES-QLP: 1315728
QMR: 1840048
USYMLQ: 1367856

@dpo (Member)

dpo commented Nov 18, 2019

That's quite promising. But could you show what you're annotating exactly?

src/adjtrans.jl Outdated
length(v) == size(op.parent, 1) || throw(LinearOperatorException("shape mismatch"))
p = op.parent
function *(op :: AdjointLinearOperator{T}, v :: AbstractVector{S}) where {T,S}
p = op.parent::AbstractLinearOperator{T}
Member

It seems that AbstractLinearOperator{T} is too generic for the compiler.
I just replaced it with:

p = op.parent::Union{LinearOperator{T}, PreallocatedLinearOperator{T}}

and increase_nctprod(p) and length(v) == size(p, 1) stop allocating 16 bytes at each call.

Member

That's too specific though. Maybe we need to parametrize AdjointLinearOperator with the type of the parent?
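Something along these lines, perhaps (a simplified stand-in, not the actual LinearOperators.jl definition):

abstract type MyOperator{T} end

struct MyOp{T} <: MyOperator{T}
  A :: Matrix{T}
end

struct MyAdjoint{T, S <: MyOperator{T}} <: MyOperator{T}
  parent :: S          # the parent's concrete type travels with the wrapper
end

Base.size(op :: MyOp) = size(op.A)
Base.size(op :: MyAdjoint) = reverse(size(op.parent))  # inferred without any annotation

opt = MyAdjoint(MyOp(rand(3, 4)))
size(opt)   # (4, 3)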

Member Author

I've included a new commit with the parametrized version.

Member

Thanks. Does it resolve the allocation issue?

Member Author

It looks like it does, but I'll wait for Alexis' tests.

Member

It works with my Krylov tests. 🤘 Thank you Abel! You can merge it and add a new tag.

Member

Before merging, would you mind checking that this doesn't recreate the hvcat issues from #97?
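(For context: #97 involved block constructions of operators roughly like the sketch below, which lops.jl presumably exercises.)

using LinearOperators

A = LinearOperator(rand(2, 2))
M = [A A; A A]    # hvcat of operators: a 4×4 block operator
M * rand(4)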

Member

N = 16

@time include("lops.jl")
  0.208034 seconds (621.17 k allocations: 31.138 MiB, 5.06% gc time)
  0.574005 seconds (1.23 M allocations: 64.762 MiB, 1.83% gc time)
5-element Array{Float64,1}:
 -10.027547809205767 
   6.387858545582767 
   4.945048065166462 
   9.668653838837873 
   3.2258245551624194

N = 25

@time include("lops.jl")
  0.212824 seconds (621.21 k allocations: 31.152 MiB, 4.82% gc time)
  0.561557 seconds (1.23 M allocations: 64.921 MiB, 1.83% gc time)
5-element Array{Float64,1}:
 10.628747752647548 
 -2.02582122181704  
 -4.198864112230334 
  3.0656881247622376
 10.586622803189744 

N = 50

@time include("lops.jl")
  0.221327 seconds (621.31 k allocations: 31.201 MiB, 4.93% gc time)
  0.592618 seconds (1.23 M allocations: 64.986 MiB, 1.84% gc time)
5-element Array{Float64,1}:
 -24.72968159374163  
 -17.87085267785233  
  21.676648685374687 
   8.68226076917717  
   2.8795559248799285

N = 100

@time include("lops.jl")
  0.213366 seconds (621.51 k allocations: 31.370 MiB, 4.85% gc time)
  0.586298 seconds (1.24 M allocations: 65.185 MiB, 1.76% gc time)
5-element Array{Float64,1}:
  -6.961226546652531 
  27.177113229951946 
  -3.5866280698379254
  23.01856931520228  
 -23.290794085006873 

Member

Thank you!

@abelsiqueira abelsiqueira merged commit 0155304 into JuliaSmoothOptimizers:master Nov 19, 2019
@abelsiqueira abelsiqueira deleted the annotate-types branch November 19, 2019 03:21