-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enzyme hessian with closure: "You may be using a constant variable as temporary storage for active memory" #307
Comments
Hi! julia> h(x) = sum((x .* x) .^ 2)
h (generic function with 1 method)
julia> hessian(h, AutoEnzyme(), rand(10))
10×10 Matrix{Float64}:
7.93735 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0
0.0 2.85219 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 6.47397 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 4.03136 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 4.34344 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 9.0601 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 8.02508 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 6.5375 |
I suggest you open the same issue on the Enzyme side |
Are you sure? I don't see why this should be type unstable, and JET doesn't report any type-instability. Also, the error message "You may be using a constant variable as temporary storage for active memory" rather suggests that a |
Oh you're right, the I think #309 might fix it, can you try out the PR branch with your toy problem or your actual problem and see? |
Oops it breaks more than it fixes 😱 |
@wsmoses any clues here? |
I believe the problem here is the closure. What if you use enzyme autodiff rather than DI (and actually pass multiple arguments, y marked const) |
Given that the backtrace comes from DI.inner_gradient_closure, my hypothesis is indeed that the closure contains both active memory and constant memory -- and that the store of constant memory into the active closure here is the problem described in the linked FAQ. -> recommending you just call autodiff rather than DI, as DI doesn't support multiple arguments [and thus requires the closure] |
Yes, sure, I know how to do it with Enzyme directly, but I hoped I could simplify my code with DifferentiationInterface. |
Yeah this is a known issue for users of DI, where it presently doesn't support effective usage of Enzyme -- specifically multi arg / multi activity. I think it would be prudent for DI to add that, but until then DI will add unnecessary limitations on what Enzyme can presently correctly and efficiently autodiff. I wrote some comments about this tpapp/LogDensityProblemsAD.jl#29 (review) / tpapp/LogDensityProblemsAD.jl#29 (comment) Unfortunately when I commented "Closures were one of the reasons imo that AD.jl didn't take off, so long term I think DI.jl would be well served by both not introducing them itself (my understanding is that it does this successfully), but also not forcing users to create the same closures user-side (which would create similar issues)." but the response was "DI is unlikely to support either multiple arguments or activity specification anytime soon" @gdalle given the number of users who need this, do you think it is possible to higher prioritize? |
@jbrea if you'd like I can also add some nice syntax sugar like Enzyme.hessian (with multiple arguments). Would that be useful to you? |
Oh, that's kind, thanks. If it is a super quick thing to do, why not. But it doesn't bother me much. ForwardDiff is usually fast enough for the Hessians I need. Failed type analysis with Enzyme bothers me more (e.g. EnzymeAD/Enzyme.jl#1410, but I also ran into a failed type analysis error for a gradient calculation, which I couldn't condense to a MWE yet.) |
It would certainly be useful to me! Right now DI is the only simple way to compute many operators with Enzyme (most notably the hessian), it would be awesome to see some of those operators taken care of. Feel free to ping me on the PR for review if you end up doing it! |
I'm not sure only way is accurate. We describe how to make hessians in our basics guide: https://enzyme.mit.edu/index.fcgi/julia/stable/generated/autodiff/ |
Which is why I said only simple way ;) |
Ah missed that, my b |
No worries! Enzyme is amazing, its only flaw is a rather steep entrance cost, but that's why I'm working on lowering it :) I opened an issue on the Enzyme side to discuss: |
Also I stand corrected, I forgot https://github.com/SciML/OptimizationBase.jl/blob/b2251ec6831c3631fe6db74e8f1b7bdefeb9c0b8/ext/OptimizationEnzymeExt.jl#L38 served as an easy way to use Enzyme for hessians as well. |
The issue with multiple arguments or activities is not adding it, it's testing it properly in all possible cases. The reason DI is so reliable is because we have thousands of LOCs to test every operator in every situation. Adding activities multiplies the number of possible cases by at least 2, so I shudder to think of the effort necessary |
I have opened #311 to discuss this specifically |
Thanks to #489, this now works on the main branch (soon to be released): julia> using DifferentiationInterface
julia> import Enzyme
julia> g(x, y) = sum((x .* y).^2)
g (generic function with 1 method)
julia> x, y = rand(10), rand(10);
julia> hessian(g, AutoEnzyme(), x, Constant(y))
freeing without malloc %arrayref37.3_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %43, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.2_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %47, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.1_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %51, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %55, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.epil_cache.i.0.i = phi double* [ undef, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i ], [ %105, %L177.i.i.loopexit ]
freeing without malloc %arrayref37.3_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %43, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.2_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %47, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.1_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %51, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37_cache.i.0.i = phi double* [ undef, %L131.lr.ph.i.i ], [ %55, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i.loopexit ]
freeing without malloc %arrayref37.epil_cache.i.0.i = phi double* [ undef, %L128.L177.loopexit_crit_edge.unr-lcssa.i.i ], [ %105, %L177.i.i.loopexit ]
10×10 Matrix{Float64}:
0.350807 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.353255 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.519147 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.23933 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.50778 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0331721 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.179644 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.409008 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.130913 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0760922 |
The text was updated successfully, but these errors were encountered: