-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use DifferentiationInterface for autodiff, allow ADTypes #153
base: master
Are you sure you want to change the base?
Conversation
Looks like I never updated CI here for Github Actions :) I like the simplifications so far |
function j!(_j, _x) | ||
DI.jacobian!(c!, ccache, _j, jac_prep, backend, _x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe these could be made callable structs to avoid closing over c!
, ccache
, jac_prep
and backend
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were already closures in the existing code, I thought I'd go for the "minimum diff" changes and then we could improve later on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw that. It seemed you're closing over more variables in the DI version though, so it might be even more valuable to change the design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to recall that closures are not an issue if the variable we close over does not get assigned to more than once? So we might be good here?
In any case, to hunt down this kind of type inference barriers we would also need to improve annotation of functions in the various structs of the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. Even if it's currently the case, personally I wouldn't rely on it since in my experience the exact behaviour of the compiler is inherently unstable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See for instance the explanations in https://discourse.julialang.org/t/can-someone-explain-closures-to-me/105605
Do we need complex number support? Because it is not available in every AD backend so at the moment DI is only tested on real numbers. |
I ran the Optim.jl test suite with the version of NLSolversBase from this PR and they seem to pass (https://github.com/gdalle/Optim.jl/actions/runs/12166910179/job/33934234795), except on Julia 1.6 because the recent versions of DI no longer support it. |
CI PR is merged |
Should I do another PR which adds code coverage to the test action? |
@devmotion added codecov. Should be set up now |
Alright, closing and reopening to get coverage (hopefully) |
Any idea why we didn't get a Codecov comment? The report is available here but perhaps it doesn't have the right permissions to post on the PR? |
Might be codecov/codecov-action#1662? A bit unclear what the problem actually is, the same config works fine in other repos. I wonder if the codecov app has to be installed for the comments to appear. |
What's also weird is that the Codecov report (on the website) tells me "No file covered by tests were changed", which is obviously wrong |
https://app.codecov.io/github/JuliaNLSolvers/NLSolversBase.jl/pull/153 states that this PR changes coverage |
Fixed the missed line. Should we add some docs? Or keep it on the down low until downstream tests are running smoothly too? |
To be honest, I think adding docs now is better. The risk (that often ends up happening) is that we forget later. It will only be on the dev docs anyway so there's no harm. |
Done |
lc::AbstractVector, uc::AbstractVector, | ||
autodiff::Symbol = :central, | ||
chunk::ForwardDiff.Chunk = checked_chunk(lx)) | ||
# TODO: is con_jac! still useful? we ignore it here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should we do about this? The new version of the code directly computes the Hessian of the sum of constraints
DF = copy(DF) | ||
|
||
x_f, x_df = x_of_nans(x_seed), x_of_nans(x_seed) | ||
f_calls, j_calls = [0,], [0,] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we try to preserve this call count? I don't think it makes a lot of sense for autodiff in general, because some backends will not go through the actual code to compute the gradient (unlike ForwardDiff and FiniteDiff).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, I don't think this call count was present for every operator or every autodiff method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to look. It's done this way because people complained that it didn't match with what they printed from their objective, but I agree that people will have to calculate the number themselves in the special case of finite diff
Well! We do support complex numbers in Optim. I'm unsure about the current testing situation and if it tests AD with complex |
The thing is that it should work out of the box because DI forwards arguments to the relevant backend without constraining their element types. However, DI does not have specific tests on complex inputs, and thus there is nothing to prevent accidental breakage at the moment |
Moreover, not all backends support complex numbers. For instance, ForwardDiff will fail |
Complex outputs are supported: https://github.com/JuliaDiff/ForwardDiff.jl/blob/37c1d50d0f8a68fc410484057581262e1ec6d67d/test/DerivativeTest.jl#L113 |
Sure but I'm assuming we want gradients of real-valued functions, for the purposes of optimization? If those functions have complex inputs, it will fail julia> using ForwardDiff
julia> norm(x) = sum(abs2, x)
norm (generic function with 1 method)
julia> ForwardDiff.gradient(norm, [0.0 + im])
ERROR: ArgumentError: Cannot create a dual over scalar type ComplexF64. If the type behaves as a scalar, define ForwardDiff.can_dual(::Type{ComplexF64}) = true. |
And that's also why I'm reluctant to promise complex support in DI: every backend handles it differently, sometimes with different conventions, sometimes one operator works ( |
I think we require user-written input in the complex case.. @antoine-levitt might remember but it's a long time ago :D I can also check later |
If autodiff is only used for real numbers that will make our life easier. Otherwise we can also add select complex tests to DI |
I don't remember, but it's reasonable to treat complex numbers just like any user defined struct. If the AD backend has support for that (ie it can vjp a CustomType->Real function and give a CustomType) then great, otherwise just ask the user to convert into a vector of real -> Real function |
So what do we do about this PR? At the moment it passes all the tests here and all the ones in Optim.jl. Should we wait until complex numbers have more reliable handling in DI (for instance, erroring when a hessian is requested)? |
Let me have a look over the holiday break. I have to go back and look, because I cannot remember what we actually support here, and if we do support something we cannot just remove it in a feature release. |
Hi @pkofod, gentle bump on this one :) Did you have time to figure out what kind of complex number support you need? |
I actually did not return to this after the holiday break yet. I will have a look again. |
No stress, take as much time as you need! I'm just excited for Optim to finally have reverse mode autodiff but it can definitely wait 😅 |
Warning
This PR bumps the Julia compat to 1.10 instead of 1.5.
This PR introduces the use of ADTypes for autodiff package selection + DifferentiationInterface for autodiff calls (fixes #132).
oncedifferentiable.jl
,twicedifferentiable.jl
andconstraints.jl
autodiff=ADTypes.AutoSomething()
Tests pass locally but I might be misinterpreting the interface somehow so please double-check. I think we can handle performance improvements in a follow-up PR.