Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use DifferentiationInterface for autodiff, allow ADTypes #153

Draft
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

gdalle
Copy link
Contributor

@gdalle gdalle commented Dec 4, 2024

Warning

This PR bumps the Julia compat to 1.10 instead of 1.5.

This PR introduces the use of ADTypes for autodiff package selection + DifferentiationInterface for autodiff calls (fixes #132).

  • bump package version to 7.9.0
  • bump Julia compat to 1.10
  • add ADTypes and DifferentiationInterface to dependencies
  • modify oncedifferentiable.jl, twicedifferentiable.jl and constraints.jl
  • add docs for passing autodiff=ADTypes.AutoSomething()
  • adjust count of function/gradient calls?

Tests pass locally but I might be misinterpreting the interface somehow so please double-check. I think we can handle performance improvements in a follow-up PR.

@gdalle gdalle marked this pull request as draft December 4, 2024 12:02
@pkofod
Copy link
Member

pkofod commented Dec 4, 2024

Looks like I never updated CI here for Github Actions :) I like the simplifications so far

Comment on lines +144 to +145
function j!(_j, _x)
DI.jacobian!(c!, ccache, _j, jac_prep, backend, _x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe these could be made callable structs to avoid closing over c!, ccache, jac_prep and backend?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were already closures in the existing code, I thought I'd go for the "minimum diff" changes and then we could improve later on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I saw that. It seemed you're closing over more variables in the DI version though, so it might be even more valuable to change the design.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to recall that closures are not an issue if the variable we close over does not get assigned to more than once? So we might be good here?
In any case, to hunt down this kind of type inference barriers we would also need to improve annotation of functions in the various structs of the package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. Even if it's currently the case, personally I wouldn't rely on it since in my experience the exact behaviour of the compiler is inherently unstable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gdalle
Copy link
Contributor Author

gdalle commented Dec 4, 2024

Do we need complex number support? Because it is not available in every AD backend so at the moment DI is only tested on real numbers.

@gdalle
Copy link
Contributor Author

gdalle commented Dec 4, 2024

I ran the Optim.jl test suite with the version of NLSolversBase from this PR and they seem to pass (https://github.com/gdalle/Optim.jl/actions/runs/12166910179/job/33934234795), except on Julia 1.6 because the recent versions of DI no longer support it.

@pkofod
Copy link
Member

pkofod commented Dec 6, 2024

CI PR is merged

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Should I do another PR which adds code coverage to the test action?

@gdalle gdalle mentioned this pull request Dec 6, 2024
@pkofod
Copy link
Member

pkofod commented Dec 6, 2024

@devmotion added codecov. Should be set up now

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Alright, closing and reopening to get coverage (hopefully)

@gdalle gdalle closed this Dec 6, 2024
@gdalle gdalle reopened this Dec 6, 2024
@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Any idea why we didn't get a Codecov comment? The report is available here but perhaps it doesn't have the right permissions to post on the PR?

@devmotion
Copy link
Contributor

Might be codecov/codecov-action#1662? A bit unclear what the problem actually is, the same config works fine in other repos. I wonder if the codecov app has to be installed for the comments to appear.

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

What's also weird is that the Codecov report (on the website) tells me "No file covered by tests were changed", which is obviously wrong

@devmotion
Copy link
Contributor

https://app.codecov.io/github/JuliaNLSolvers/NLSolversBase.jl/pull/153 states that this PR changes coverage

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Fixed the missed line. Should we add some docs? Or keep it on the down low until downstream tests are running smoothly too?

@gdalle gdalle changed the title Start DI integration Use DifferentiationInterface for autodiff, allow ADTypes Dec 6, 2024
@pkofod
Copy link
Member

pkofod commented Dec 6, 2024

Fixed the missed line. Should we add some docs? Or keep it on the down low until downstream tests are running smoothly too?

To be honest, I think adding docs now is better. The risk (that often ends up happening) is that we forget later. It will only be on the dev docs anyway so there's no harm.

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Done

lc::AbstractVector, uc::AbstractVector,
autodiff::Symbol = :central,
chunk::ForwardDiff.Chunk = checked_chunk(lx))
# TODO: is con_jac! still useful? we ignore it here
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should we do about this? The new version of the code directly computes the Hessian of the sum of constraints

DF = copy(DF)

x_f, x_df = x_of_nans(x_seed), x_of_nans(x_seed)
f_calls, j_calls = [0,], [0,]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we try to preserve this call count? I don't think it makes a lot of sense for autodiff in general, because some backends will not go through the actual code to compute the gradient (unlike ForwardDiff and FiniteDiff).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, I don't think this call count was present for every operator or every autodiff method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to look. It's done this way because people complained that it didn't match with what they printed from their objective, but I agree that people will have to calculate the number themselves in the special case of finite diff

@pkofod
Copy link
Member

pkofod commented Dec 6, 2024

Do we need complex number support? Because it is not available in every AD backend so at the moment DI is only tested on real numbers.

Well! We do support complex numbers in Optim. I'm unsure about the current testing situation and if it tests AD with complex

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Well! We do support complex numbers in Optim. I'm unsure about the current testing situation and if it tests AD with complex

The thing is that it should work out of the box because DI forwards arguments to the relevant backend without constraining their element types. However, DI does not have specific tests on complex inputs, and thus there is nothing to prevent accidental breakage at the moment

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Moreover, not all backends support complex numbers. For instance, ForwardDiff will fail

@devmotion
Copy link
Contributor

Moreover, not all backends support complex numbers. For instance, ForwardDiff will fail

Complex outputs are supported: https://github.com/JuliaDiff/ForwardDiff.jl/blob/37c1d50d0f8a68fc410484057581262e1ec6d67d/test/DerivativeTest.jl#L113

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

Sure but I'm assuming we want gradients of real-valued functions, for the purposes of optimization? If those functions have complex inputs, it will fail

julia> using ForwardDiff

julia> norm(x) = sum(abs2, x)
norm (generic function with 1 method)

julia> ForwardDiff.gradient(norm, [0.0 + im])
ERROR: ArgumentError: Cannot create a dual over scalar type ComplexF64. If the type behaves as a scalar, define ForwardDiff.can_dual(::Type{ComplexF64}) = true.

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

And that's also why I'm reluctant to promise complex support in DI: every backend handles it differently, sometimes with different conventions, sometimes one operator works (ForwardDiff.derivative) and one doesn't (ForwardDiff.gradient), etc. All of this makes it really hard to obtain homogenous behavior

@pkofod
Copy link
Member

pkofod commented Dec 6, 2024

I think we require user-written input in the complex case.. @antoine-levitt might remember but it's a long time ago :D I can also check later

@gdalle
Copy link
Contributor Author

gdalle commented Dec 6, 2024

If autodiff is only used for real numbers that will make our life easier. Otherwise we can also add select complex tests to DI

@antoine-levitt
Copy link
Contributor

I don't remember, but it's reasonable to treat complex numbers just like any user defined struct. If the AD backend has support for that (ie it can vjp a CustomType->Real function and give a CustomType) then great, otherwise just ask the user to convert into a vector of real -> Real function

@gdalle
Copy link
Contributor Author

gdalle commented Dec 13, 2024

So what do we do about this PR? At the moment it passes all the tests here and all the ones in Optim.jl. Should we wait until complex numbers have more reliable handling in DI (for instance, erroring when a hessian is requested)?

@pkofod
Copy link
Member

pkofod commented Dec 21, 2024

So what do we do about this PR? At the moment it passes all the tests here and all the ones in Optim.jl. Should we wait until complex numbers have more reliable handling in DI (for instance, erroring when a hessian is requested)?

Let me have a look over the holiday break. I have to go back and look, because I cannot remember what we actually support here, and if we do support something we cannot just remove it in a feature release.

@gdalle
Copy link
Contributor Author

gdalle commented Jan 23, 2025

Hi @pkofod, gentle bump on this one :) Did you have time to figure out what kind of complex number support you need?

@pkofod
Copy link
Member

pkofod commented Jan 23, 2025

Hi @pkofod, gentle bump on this one :) Did you have time to figure out what kind of complex number support you need?

I actually did not return to this after the holiday break yet. I will have a look again.

@gdalle
Copy link
Contributor Author

gdalle commented Jan 23, 2025

No stress, take as much time as you need! I'm just excited for Optim to finally have reverse mode autodiff but it can definitely wait 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extending autodiff compatibilities
4 participants