-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error in ForwardDiff tagging #594
Comments
Hi, thanks for reporting this! It's hard to debut without an MWE but my best guess here is that Optimization.jl might be slightly abusing DI's preparation mechanism. |
I am definitely using views in my problem, so I can see how that would be entering into the call/evaluation of the prepared differentiation operator. I haven't fully familiarized myself with all the layers/abstractions of For my own knowledge, pinning to Thanks again! |
I don't think there is such an explicit constraint, but it could be a more subtle bug. For instance, when preparing some differentiation operator, Optimization.jl could make a julia> x = view(ones(10), 1:2)
2-element view(::Vector{Float64}, 1:2) with eltype Float64:
1.0
1.0
julia> copy(x)
2-element Vector{Float64}:
1.0
1.0 We would need @Vaibhavdixit02's help and/or an MWE from you to seek the root cause.
More precisely, because I handled tagging incorrectly before #571, accidentally omitting custom tags provided by the backend.
Not really, it's mostly useful to untangle derivatives in higher-order autodiff when you have dual numbers containing dual numbers. See this section of the docs for more information. Normally, users should never have to set tags manually or even be aware of their existence. The only reason you're seeing this is because of a hacky fix I put together for ForwardDiff's |
Thanks for all the details! I can work on pulling out a representative example of what I am seeing. @Vaibhavdixit02 any hints on where to focus digging would definitely be appreciated. On the tagging side, given the example you have above, is it possible to tag on a higher-level/more abstract type than |
I've also run into an issue with ForwardDiff tagging in MarginalLogDensities.jl. Here's an example extracted from my test suite that produces the error: using Distributions
using MarginalLogDensities
using ComponentArrays
using LinearAlgebra
using Optimization, OptimizationOptimJL
N = 3
μ = ones(N)
σ = 1.5
d = MvNormal(μ, σ^2 * I)
ld(u, p) = logpdf(d, u)
iw = [1, 3]
iv = [2]
dmarginal = Normal(1.0, σ)
u = randn(N)
v = u[iv]
w = u[iw]
u_component = ComponentArray(v = v, w = w)
x = 1.0:3.0
x_component = ComponentVector(v = x[iv], w = x[iw])
mld = MarginalLogDensity(ld, u, iw, (), LaplaceApprox())
mldc = MarginalLogDensity(ld, u_component, [:w], (), LaplaceApprox())
mld(x[iv], ())
mldc(x_component[[:v]], ()) Error messageERROR: Invalid Tag object:
Expected ForwardDiff.Tag{DifferentiationInterface.FixTail{MarginalLogDensities.var"#f#4"{typeof(ld), ComponentVector{Float64, Vector{Float64}, Tuple{Axis{(v = 1:1, w = 2:3)}}}, Vector{Symbol}, Vector{Symbol}}, Tuple{@NamedTuple{p::Tuple{}, v::Vector{Float64}}}}, ForwardDiff.Dual{ForwardDiff.Tag{DifferentiationInterfaceForwardDiffExt.ForwardDiffOverSomethingHVPWrapper{MarginalLogDensities.var"#f#4"{typeof(ld), ComponentVector{Float64, Vector{Float64}, Tuple{Axis{(v = 1:1, w = 2:3)}}}, Vector{Symbol}, Vector{Symbol}}}, Float64}, Float64, 1}},
Observed ForwardDiff.Tag{DifferentiationInterface.FixTail{MarginalLogDensities.var"#f#4"{typeof(ld), ComponentVector{Float64, Vector{Float64}, Tuple{Axis{(v = 1:1, w = 2:3)}}}, Vector{Symbol}, Vector{Symbol}}, Tuple{@NamedTuple{p::Tuple{}, v::ComponentVector{Float64, Vector{Float64}, Tuple{Axis{(v = 1:1,)}}}}}}, ForwardDiff.Dual{ForwardDiff.Tag{DifferentiationInterfaceForwardDiffExt.ForwardDiffOverSomethingHVPWrapper{MarginalLogDensities.var"#f#4"{typeof(ld), ComponentVector{Float64, Vector{Float64}, Tuple{Axis{(v = 1:1, w = 2:3)}}}, Vector{Symbol}, Vector{Symbol}}}, Float64}, Float64, 1}}.
Stacktrace:
[1] checktag(::Type{…}, f::DifferentiationInterface.FixTail{…}, x::ComponentVector{…})
@ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/config.jl:34
[2] gradient(f::DifferentiationInterface.FixTail{…}, x::ComponentVector{…}, cfg::ForwardDiff.GradientConfig{…}, ::Val{…})
@ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/gradient.jl:17
[3] gradient(f::DifferentiationInterface.FixTail{…}, x::ComponentVector{…}, cfg::ForwardDiff.GradientConfig{…})
@ ForwardDiff ~/.julia/packages/ForwardDiff/PcZ48/src/gradient.jl:17
[4] gradient(f::MarginalLogDensities.var"#f#4"{…}, prep::DifferentiationInterfaceForwardDiffExt.ForwardDiffGradientPrep{…}, ::AutoForwardDiff{…}, x::ComponentVector{…}, contexts::Constant{…})
@ DifferentiationInterfaceForwardDiffExt ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl:337
[5] (::DifferentiationInterfaceForwardDiffExt.var"#inner_gradient#9"{…})(x::ComponentVector{…}, unannotated_contexts::@NamedTuple{…})
@ DifferentiationInterfaceForwardDiffExt ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceForwardDiffExt/secondorder.jl:38
[6] compute_ydual_onearg(f::DifferentiationInterfaceForwardDiffExt.var"#inner_gradient#9"{…}, prep::DifferentiationInterfaceForwardDiffExt.ForwardDiffOneArgPushforwardPrep{…}, x::ComponentVector{…}, tx::Tuple{…}, contexts::Constant{…})
@ DifferentiationInterfaceForwardDiffExt ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl:98
[7] pushforward!(f::DifferentiationInterfaceForwardDiffExt.var"#inner_gradient#9"{…}, ty::Tuple{…}, prep::DifferentiationInterfaceForwardDiffExt.ForwardDiffOneArgPushforwardPrep{…}, ::AutoForwardDiff{…}, x::ComponentVector{…}, tx::Tuple{…}, contexts::Constant{…})
@ DifferentiationInterfaceForwardDiffExt ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceForwardDiffExt/onearg.jl:153
[8] hvp!(f::OptimizationFunction{…}, tg::Tuple{…}, prep::DifferentiationInterfaceForwardDiffExt.ForwardDiffOverSomethingHVPPrep{…}, ::SecondOrder{…}, x::ComponentVector{…}, tx::Tuple{…}, contexts::Constant{…})
@ DifferentiationInterfaceForwardDiffExt ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceForwardDiffExt/secondorder.jl:72
[9] hessian!
@ ~/.julia/packages/DifferentiationInterface/tBGja/ext/DifferentiationInterfaceSparseMatrixColoringsExt/hessian.jl:97 [inlined]
[10] modal_hessian!
@ ~/.julia/dev/MarginalLogDensities/src/MarginalLogDensities.jl:288 [inlined]
[11] _marginalize(mld::MarginalLogDensity{…}, v::ComponentVector{…}, data::Tuple{}, method::LaplaceApprox{…}, verbose::Bool)
@ MarginalLogDensities ~/.julia/dev/MarginalLogDensities/src/MarginalLogDensities.jl:297
[12] #_#5
@ ~/.julia/dev/MarginalLogDensities/src/MarginalLogDensities.jl:226 [inlined]
[13] (::MarginalLogDensity{…})(v::ComponentVector{…}, data::Tuple{})
@ MarginalLogDensities ~/.julia/dev/MarginalLogDensities/src/MarginalLogDensities.jl:225
[14] top-level scope
@ Untitled-1:24
Some type information was truncated. Use `show(err)` to see complete types. I have recently been making changes to support I will try to simplify this and make a more minimal example, but hopefully it's a starting point... |
Thanks for the MWE! The rules of DI preparation are: you can only reuse In both cases, either your code or Optimization.jl must be breaking these rules due to subtle array conversions, which imply that the preparation result is no longer valid. Not sure what happens for @aml5600, but at least I figured it out for @ElOceanografo. You're accidentally converting a julia> v = ComponentVector(a=[1, 2])
ComponentVector{Int64}(a = [1, 2])
julia> collect(v)
2-element Vector{Int64}:
1
2 Your As for the question of @aml5600, more details below
ForwardDiff's preparation system is probably the most sensitive backend to the "same type" rule because the tag type contains the type of the function FixTail(f, contexts...) = x -> f(x, contexts...) This function wrapper So the answer is that, at least with the current implementation the tag will contain concrete types for all the context arguments. We could make a case for relaxing this constraint, but then this would prevent us from catching bugs such as @ElOceanografo's implicit conversion. |
Hey, sorry for being MIA here was traveling etc. This is happening because Ipopt sends a |
Is it the only solver for which this happens? Maybe you could instantiate the Lagrange multiplier once and for all then copy the view into it at every iteration? |
Yeah, among the frequently used optimizers that seems to be only one. Though just to be clear it's a well-motivated use of
Yeah this is a good idea, I'll do this. |
Also note that ForwardDiff over ReverseDiff is very slow once you add constants because preparation becomes impossible (I can't allow the constants to be absorbed in the tape cause their value might change afterwards) |
@aml5600 @ElOceanografo did you manage to fix your issues? |
Hi @gdalle apologies, things have gotten hectic on my end and haven't had the time to dig into this yet =/ @Vaibhavdixit02 to your comment above, if you give me a rough idea of where that approach would fit in I can try to take a look (probably next week...). Thanks again to you both! |
Yes, ElOceanografo/MarginalLogDensities.jl#33 fixed it for me! |
I unpinned DI from 0.6.10, which updated to 0.6.22. It seems that in the other packages that updated (most importantly, ModelingToolkit/Symbolics) have helped to implicitly solve this issue! |
Okay I'll close this issue for now, feel free to open more specific ones in the future! |
Thanks again for the help! |
I have an NLP that I am solving through
Optimization.jl
using eitherIpopt
orMadNLP
as the solver. Both have previously worked. It seems now that I get an error from ForwardDiff about mismatched tags.I noticed that custom tagging was introduced in
0.6.11
. If I pin to0.6.10
, I do not see the issue. I do not have the full stacktrace on this machine, but can type out the top line where the tagged function and args differ. Also, my underlying functions are a mix of custom transcription over top of MTK models, so the type information is incredibly long. I try to group it into...
.The error is here, where
but the provided function
f::F
iswhere we can see the differing
Vector
vsSubArray
in the Tuples.The tagged input type, and provided to check_tag, are identical, and begin like:
I apologize for the limited information here. If the issue is obvious, wonderful! If not, I can try to work on extracting a MWE as this issue pops up in a much larger process. For now I will pin to
0.6.10
. Thanks!The text was updated successfully, but these errors were encountered: