
Optimizing model with vector-valued observations produces error when data are a LinearAlgebra.Adjoint #1481

Open
ElOceanografo opened this issue Dec 2, 2020 · 2 comments


@ElOceanografo
Contributor

This confused me for a while. The following MWE, distilled from a more elaborate model I've been working on, samples fine with NUTS:

using Turing, Optim

@model function test(x)
    n = length(x)
    σ ~ Exponential()             # prior on the scale
    x ~ MvNormal(zeros(n), σ)     # isotropic likelihood for the observations
end

x = randn(1, 10)
m = test(x')                      # x' is a LinearAlgebra.Adjoint, not a Vector
c = sample(m, NUTS(), 100)        # sampling works fine

...but optimizing it throws the following error:

julia> opt = optimize(m, MLE())
ERROR: MethodError: no method matching +(::ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}, ::ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}})
For element-wise addition, use broadcasting with dot syntax: scalar .+ array
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:538
  +(::ChainRulesCore.DoesNotExist, ::Any) at C:\Users\sam.urmy\.julia\packages\ChainRulesCore\l6bZW\src\differential_arithmetic.jl:23
  +(::ChainRulesCore.One, ::Any) at C:\Users\sam.urmy\.julia\packages\ChainRulesCore\l6bZW\src\differential_arithmetic.jl:94
  ...
Stacktrace:
 [1] acclogp!(::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}}}},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},Array{Base.RefValue{ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},1}}, ::ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}) at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\threadsafe.jl:19
 [2] tilde_observe(::Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}, ::DynamicPPL.SampleFromPrior, ::DistributionsAD.TuringScalMvNormal{ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},ReverseDiff.TrackedReal{Float64,Float64,Nothing}}, ::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}, ::DynamicPPL.VarName{:x,Tuple{}}, ::Tuple{}, ::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}}}},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},Array{Base.RefValue{ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},1}}) at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\context_implementations.jl:90
 [3] (::var"#51#52")(::Random._GLOBAL_RNG, ::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::DynamicPPL.ThreadSafeVarInfo{DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}}}},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},Array{Base.RefValue{ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}},1}}, ::DynamicPPL.SampleFromPrior, ::Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}, ::LinearAlgebra.Adjoint{Float64,Array{Float64,2}}) at .\none:9
 [4] macro expansion at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\model.jl:0 [inlined]
 [5] _evaluate at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\model.jl:154 [inlined]
 [6] evaluate_threadsafe(::Random._GLOBAL_RNG, ::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}},Array{Set{DynamicPPL.Selector},1}}}},ReverseDiff.TrackedReal{Float64,Float64,ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}}, ::DynamicPPL.SampleFromPrior, ::Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}) at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\model.jl:144
 [7] Model at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\model.jl:94 [inlined]
 [8] Model at C:\Users\sam.urmy\.julia\packages\DynamicPPL\wcVCr\src\model.jl:98 [inlined]
 [9] (::Turing.Core.var"#f#24"{DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},DynamicPPL.SampleFromPrior,Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}})(::ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\core\compat\reversediff.jl:30
 [10] ReverseDiff.GradientTape(::Turing.Core.var"#f#24"{DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64},DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},DynamicPPL.SampleFromPrior,Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}}, ::Array{Float64,1}, ::ReverseDiff.GradientConfig{ReverseDiff.TrackedArray{Float64,Float64,1,Array{Float64,1},Array{Float64,1}}}) at C:\Users\sam.urmy\.julia\packages\ReverseDiff\NoIPU\src\api\tape.jl:199
 [11] GradientTape at C:\Users\sam.urmy\.julia\packages\ReverseDiff\NoIPU\src\api\tape.jl:198 [inlined]
 [12] tape at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\core\compat\reversediff.jl:41 [inlined]
 [13] taperesult at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\core\compat\reversediff.jl:43 [inlined]
 [14] gradient_logp(::Turing.Core.ReverseDiffAD{false}, ::Array{Float64,1}, ::DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::DynamicPPL.SampleFromPrior, ::Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\core\compat\reversediff.jl:33
 [15] gradient_logp(::Array{Float64,1}, ::DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}, ::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::DynamicPPL.SampleFromPrior, ::Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\core\ad.jl:83
 [16] (::Turing.OptimLogDensity{DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}},DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}})(::Float64, ::Array{Float64,1}, ::Nothing, ::Array{Float64,1}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:160
 [17] (::NLSolversBase.var"#71#72"{NLSolversBase.InplaceObjective{Nothing,Nothing,Turing.OptimLogDensity{DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}},DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}},Nothing,Nothing},Float64})(::Array{Float64,1}, ::Array{Float64,1}) at C:\Users\sam.urmy\.julia\packages\NLSolversBase\QPnui\src\objective_types\incomplete.jl:55
 [18] value_gradient!!(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at C:\Users\sam.urmy\.julia\packages\NLSolversBase\QPnui\src\interface.jl:82
 [19] initial_state(::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#15#17"}, ::Optim.Options{Float64,Nothing}, ::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}) at C:\Users\sam.urmy\.julia\packages\Optim\auGGa\src\multivariate\solvers\first_order\l_bfgs.jl:165
 [20] optimize(::OnceDifferentiable{Float64,Array{Float64,1},Array{Float64,1}}, ::Array{Float64,1}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#15#17"}, ::Optim.Options{Float64,Nothing}) at C:\Users\sam.urmy\.julia\packages\Optim\auGGa\src\multivariate\optimize\optimize.jl:33
 [21] #optimize#85 at C:\Users\sam.urmy\.julia\packages\Optim\auGGa\src\multivariate\optimize\interface.jl:142 [inlined]
 [22] optimize at C:\Users\sam.urmy\.julia\packages\Optim\auGGa\src\multivariate\optimize\interface.jl:141 [inlined]
 [23] _optimize(::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::Turing.OptimLogDensity{DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}},DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}, ::Array{Float64,1}, ::LBFGS{Nothing,LineSearches.InitialStatic{Float64},LineSearches.HagerZhang{Float64,Base.RefValue{Bool}},Optim.var"#15#17"}, ::Optim.Options{Float64,Nothing}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:420
 [24] _optimize at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:411 [inlined] (repeats 2 times)
 [25] #_optimize#35 at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:387 [inlined]
 [26] _optimize(::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::Turing.OptimLogDensity{DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}},Turing.OptimizationContext{DynamicPPL.LikelihoodContext{Nothing}},DynamicPPL.VarInfo{NamedTuple{(:σ, :μ),Tuple{DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:σ,Tuple{}},Int64},Array{Exponential{Float64},1},Array{DynamicPPL.VarName{:σ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}},DynamicPPL.Metadata{Dict{DynamicPPL.VarName{:μ,Tuple{}},Int64},Array{DistributionsAD.TuringScalMvNormal{Array{Float64,1},Float64},1},Array{DynamicPPL.VarName{:μ,Tuple{}},1},Array{Float64,1},Array{Set{DynamicPPL.Selector},1}}}},Float64}}, ::Optim.Options{Float64,Nothing}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:387
 [27] _mle_optimize(::DynamicPPL.Model{var"#51#52",(:x,),(),(),Tuple{LinearAlgebra.Adjoint{Float64,Array{Float64,2}}},Tuple{}}, ::Optim.Options{Float64,Nothing}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:316
 [28] _mle_optimize at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:315 [inlined]
 [29] #optimize#24 at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:295 [inlined]
 [30] optimize at C:\Users\sam.urmy\.julia\packages\Turing\B6rlT\src\modes\ModeEstimation.jl:295 [inlined] (repeats 2 times)
 [31] top-level scope at none:1

After much puzzlement, I figured out that instantiating the model with x' (i.e., a LinearAlgebra.Adjoint) was causing the problem. Changing the instantiation line m = test(x') to

m = test(vec(x))

allows the optimization to work as expected. The error doesn't depend on the AD backend or the optimization method; gradient-free optimizers fail as well. It seems that optimization fails whenever x is not a Vector, since test(collect(x')) produces the same error. The type distinctions are sketched below.
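
For reference, a quick sketch of the types involved, assuming the same x = randn(1, 10) as in the MWE:

using LinearAlgebra

x = randn(1, 10)      # 1×10 Matrix{Float64}
x'                    # 10×1 LinearAlgebra.Adjoint (a matrix) -- optimization fails
collect(x')           # 10×1 Matrix{Float64} (still a matrix) -- also fails
vec(x)                # 10-element Vector{Float64} -- works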

@cpfiffer
Member

cpfiffer commented Dec 2, 2020

I dunno if this is a Turing error per se; does anyone with more experience on the AD side agree?

@devmotion
Member

I would guess (without having checked the details) that the problem is in some of the tilde dispatches in src/modes or DynamicPPL. Both x' and collect(x') are matrices, so x ~ MvNormal(...) in the model is interpreted as multiple multivariate observations. The log pdf computation then returns a vector of values, which can't be accumulated into varinfo.logp.
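
A minimal sketch of that interpretation, using the same MvNormal constructor as the MWE and Distributions.jl's column-per-observation convention for matrix arguments (the 0.0 standing in for varinfo.logp is illustrative, not Turing's actual code):

using Distributions

x = randn(1, 10)
d = MvNormal(zeros(10), 1.0)

logpdf(d, vec(x))    # Float64: a single 10-dimensional observation
logpdf(d, x')        # 1-element Vector{Float64}: each column treated as an observation

0.0 + logpdf(d, x')  # MethodError: no method matching +(::Float64, ::Array{Float64,1})

That last line reproduces the same scalar-plus-array MethodError (with the same broadcasting hint) that appears at the top of the stack trace.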
