Support for complex numbers #157
This is definitely possible using ForwardDiff's …
What happens if you just have …
What's the point of doing that? There are many cases where subtyping to …
Are there any updates on plans to extend functionality to complex numbers?
There seems to be a simple way forward here: complex differentiation, in the sense that matters here, is just real differentiation up to a unitary transformation. I am primarily interested in applications like computing a Jacobian for a system of ODEs. Typically when setting up differential equations we aren't concerned so much with complex analysis and analytic functions as with a convenient complex-variable shorthand for pairs of independent real variables. So (for this case at least) this is done via: (1) apply the unitary operation to map to real variables; (2) differentiate using real routines. Or am I missing the point entirely?
Yeah, that's equivalent to differentiating along the …
I think DualNumbers.jl has supported complex (first order) differentiation ever since JuliaDiff/DualNumbers.jl#29 |
There are probably edge cases like branch cuts that might not be precisely implemented, but time stepping near a branch cut seems risky for a system of ODEs anyway. |
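For readers unfamiliar with the mechanism being referenced: first-order complex dual numbers can be sketched quite compactly. This is an illustrative Python toy, not the DualNumbers.jl implementation (all names here are made up), showing how a dual component propagates df/dz through analytic primitives:

```python
# Toy complex dual numbers: pairs (val, eps) with eps^2 = 0, both complex.
# For analytic primitives this propagates the complex derivative df/dz.
import cmath

class CDual:
    def __init__(self, val, eps=0.0):
        self.val = complex(val)
        self.eps = complex(eps)
    def __add__(self, o):
        o = o if isinstance(o, CDual) else CDual(o)
        return CDual(self.val + o.val, self.eps + o.eps)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, CDual) else CDual(o)
        # product rule on the dual part
        return CDual(self.val * o.val, self.val * o.eps + self.eps * o.val)
    __rmul__ = __mul__

def cexp(d):
    # analytic primitive: d(exp z)/dz = exp z
    return CDual(cmath.exp(d.val), cmath.exp(d.val) * d.eps)

def derivative(f, z0):
    # seed the dual part with 1 and read off df/dz
    return f(CDual(z0, 1.0)).eps

# d/dz [z * exp(z)] = (1 + z) * exp(z)
z0 = 0.3 + 0.7j
d = derivative(lambda z: z * cexp(z), z0)
```

Note that nothing here guards against non-analytic primitives; that is exactly the safety question debated below.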
But ForwardDiff doesn't use DualNumbers.jl, and instead uses its own internal …
Yes, I am pretty sure you'll have all sorts of other troubles in this case. The differentiation is the least of the worries here.
IMO, the bulk of the work for this specific issue does not involve changing …
I'm confused by the discussion here. For a map from complex to complex, differentiation does not map onto nice structures, i.e. there is no "Jacobian" of a map from C^n to C^n unless the map is holomorphic, which would probably limit the range of applicability. E.g. to solve an equation f(x) = 0 from C^n to C^n by Newton's method (which is presumably what ODE solvers need?), the only way is to set up an R^2n-by-R^2n system (unless I'm mistaken), which would presumably need a different API. What would make much more sense is automatic differentiation for functions from C^n to R, where a gradient in C^n is well-defined (although one would probably want to use ReverseDiff for that). This is what's needed for optimization, and is conceptually easy (treat C as R^2). I currently need this very badly for one project, so if somebody could point me in the right direction for the implementation, and if it's feasible for a beginner in Julia, I'd be glad to help.
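The "treat C as R^2" gradient described above can be made concrete with a small sketch. In this Python illustration (the names and the finite-difference stand-in for AD are mine, not any package's API), the real gradient of a C^n -> R function is repacked as a complex vector g satisfying f(z + dz) ≈ f(z) + Re(conj(g) · dz):

```python
# Gradient of a real-valued function of complex inputs, computed by
# identifying C with R^2 and perturbing real and imaginary parts separately.
import numpy as np

def complex_gradient(f, z, h=1e-7):
    """Gradient of f: C^n -> R, returned as a complex vector."""
    z = np.asarray(z, dtype=complex)
    g = np.empty_like(z)
    for k in range(z.size):
        e = np.zeros_like(z); e[k] = h
        d_re = (f(z + e) - f(z - e)) / (2 * h)            # d/dx_k
        d_im = (f(z + 1j * e) - f(z - 1j * e)) / (2 * h)  # d/dy_k
        g[k] = d_re + 1j * d_im
    return g

# In this convention, f(z) = sum_k |z_k|^2 has gradient 2*z.
z0 = np.array([1.0 + 2.0j, -0.5j])
g = complex_gradient(lambda z: float(np.sum(np.abs(z) ** 2)), z0)
```

This is well-defined for any smooth f, analytic or not, which is why it suits the optimization use case.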
@antoine-levitt that's pretty much what @AshtonSBradley was saying. Sometimes complex numbers are used as a nice way to treat equations on R^2 and don't end up with analytic functions. In that case, though, the derivative along the line …
That seems tricky to me. There are two use cases here: functions from C to C (solving z^2 = 1, say), and from C to R (minimizing |z|^2 + 1). The API and implementation for the two have to be different, because the first has to forbid non-analytic functions (which doesn't seem to be the case now in DualNumbers, e.g. JuliaDiff/DualNumbers.jl#38), while the second requires them. My point is that the first use case is a bit artificial and limited in applicability (why allow solving z^2 = 1 and not z^2 = |z|(1+i)?), but the second is well-defined, and should probably be what's meant when talking about a "complex derivative" in autodiff. The implementation of the second cannot use "complex dual numbers", which will be unable to differentiate things like real(conj(z)).
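The non-analytic pitfall is easy to demonstrate numerically: for f(z) = conj(z), the difference quotient depends on the direction of approach, so no single complex derivative exists, and any scheme that silently picks one direction gives a wrong answer. A minimal Python illustration:

```python
# Directional difference quotients for a non-analytic function disagree:
# conj(z) has "derivative" 1 along the real axis but -1 along the imaginary
# axis, so there is no complex df/dz for it.
def directional_derivative(f, z, direction, h=1e-8):
    step = h * direction
    return (f(z + step) - f(z)) / step

f = lambda z: z.conjugate()
z0 = 1.0 + 1.0j
along_real = directional_derivative(f, z0, 1.0)
along_imag = directional_derivative(f, z0, 1.0j)
```

For an analytic function the two quotients would agree (that is the Cauchy-Riemann condition in disguise).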
I think the idea behind …
Except for the fact that it's not artificial and is extensively used in scientific models like QM and circuits, and essentially most functions people write down which are C -> C are complex analytic (otherwise the derivative isn't well defined anyway)? There are entire disciplines devoted to this, to the point that everyone is required to learn about C -> C functions. I think it's insane to write that off as artificial when it's easily the most common case for using a complex number.
I think, as with JuliaMath/Calculus.jl#115, the second one can be split from C^n to R^2n and differentiated that way, with a keyword argument used to enable this mode. It would require double the function calls though, and most cases wouldn't need the slow path, though it would be necessary for some of the cases @AshtonSBradley has mentioned.
Thinking about this more: probably default to safe, but a keyword arg to opt into performance (…).
This discussion spans several threads, so let's focus on it here. I'll try to gather my thoughts clearly in this post, sorry for the length!

For the sake of simplicity, let's discuss two prototypical applications: one, finding zeros of exp(cos(z)) = 1.2 with Newton; second, minimizing |z|^2 + Re(z) using a gradient method. If I understand correctly, the first is what you're interested in, the second is what I'm interested in. (Let me just point out that this is far from being an academic example: a lot of problems in quantum mechanics are about minimizing C^n -> R functions.) I don't know who would win a popularity contest between solving nonlinear systems and minimizing functions, but I'm guessing it would be close. What we are interested in is finding out what sense, if any, there should be in defining derivatives of a function f with complex inputs. There are two sensible definitions here.

These two definitions do not match for a C^n to R function. I'm not sure what other languages do, but e.g. Theano implements the second: http://deeplearning.net/software/theano/proposals/complex_gradient.html The main issue I have with the first option is that not all functions are complex-differentiable. It's true that many functions are, but then again many are not. Moreover, even if the total function is complex-differentiable but has intermediate computations that are not, then the final result will simply be wrong. For instance, if derivatives are defined this way at each stage of an autodiff algorithm, I'm pretty sure that differentiating f(z) = real(z) + im*imag(z) will not complain but will give wrong results. This is the worst kind of bug, and I think it's a really, really bad idea. So, if we want the first definition, that leaves us with only two consistent choices:
The implementation of the first choice is the simplest, but it's very restrictive, and in particular is useless for the second definition (because of Cauchy-Riemann, any complex-analytic C->R function is constant). The second seems more tricky (I'm not very familiar with autodiff), but is the more general. If that's done, then exposing the first and second definitions is just a matter of API: the code would return a double-size Jacobian, from which one can select the relevant entries. It is also the correct way to implement real-to-real functions that happen to have intermediate complex quantities (e.g. t -> abs((1+im) + t*(3-2im))). I think this is the way to proceed.
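The real-to-real example with complex intermediates can be checked directly. Below is a small Python sketch (finite differences stand in for AD) comparing straight-through differentiation of t -> abs((1+im) + t*(3-2im)) against the closed form obtained by expanding into real components; the "double-size Jacobian" approach would make the two agree by construction:

```python
# Real-to-real function with complex intermediate quantities.
def f(t):
    return abs((1 + 1j) + t * (3 - 2j))

def fprime_closed(t):
    # f(t) = |(1+3t) + i(1-2t)|, so
    # df/dt = ((1+3t)*3 + (1-2t)*(-2)) / f(t)
    return (3 * (1 + 3 * t) - 2 * (1 - 2 * t)) / f(t)

t0 = 0.4
fd = (f(t0 + 1e-7) - f(t0 - 1e-7)) / 2e-7  # central difference through the complex path
```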
That's not a good example to have as the only example. Of course the example which focuses on minimizing functions uses the choice which is good for your application (minimizing functions). But then if you point to something which is focused on mathematical modeling like Mathematica, you'll see it uses the first definition.
You pretty much just showed why that's not "the way to proceed": it's much more computationally expensive yet you only need it in these C->R cases, or with intermediate non-analytic functions. That makes it pretty clear that neither way is "correct", and both are necessary if we want to do things in a manner which is sane and efficient. I think we need to focus on how to expose both definitions in a clear manner, and find out how to implement the two of them in a way that doesn't clash. |
As I said, I only see two choices that make sense: undefine all non-analytic functions on complex dual numbers, and the "expensive" option (it should be at worst a factor of two slower, in the bandwidth-bound case). I believe the "expensive" option is the right one (it is the most general one), but since I'm not willing to implement it myself I will not fight against anyone who does the first one, as long as it does not wrongly differentiate non-analytic functions. |
Then I think the first short-term goal would be to use the DualNumbers.jl implementation, but undefine all non-analytic functions on complex dual numbers so that it's safe. In the meantime, defining both versions using finite differencing in Calculus.jl is not difficult, and they can be switched using a keyword argument. For autodifferentiation though, we have to start thinking about how Cassette.jl will deal with this. My guess is that things can be tagged somehow by the user passing in a …
There might be a way to have the best of both worlds (a general approach that is as fast as the simple one for analytic functions), by having two types of dual numbers: AnalyticDual that has only df/dz, and GeneralDual that has df/dz and df/dz*. The second type would "propagate", in the sense that f(z) is an AnalyticDual if f is analytic and z an analytic dual, but every other case would give a general dual. All this is known at compile-time, so this could in theory work. No idea how hard it is to implement. |
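A rough Python mock-up of this two-type idea (all names are hypothetical; Julia would do the promotion via dispatch at compile time, which the isinstance checks here only mimic): AnalyticDual carries only df/dz, GeneralDual carries the Wirtinger pair df/dz and df/dz*, and any non-analytic primitive promotes its result so that GeneralDual "infects" everything downstream:

```python
# Two dual types: analytic (df/dz only) and general (Wirtinger pair).
class AnalyticDual:
    def __init__(self, val, dz):
        self.val, self.dz = val, dz
    def general(self):
        return GeneralDual(self.val, self.dz, 0.0)

class GeneralDual:
    def __init__(self, val, dz, dzbar):
        self.val, self.dz, self.dzbar = val, dz, dzbar

def as_general(d):
    return d.general() if isinstance(d, AnalyticDual) else d

def mul(a, b):
    # product rule; the result stays analytic only if both inputs are
    if isinstance(a, AnalyticDual) and isinstance(b, AnalyticDual):
        return AnalyticDual(a.val * b.val, a.dz * b.val + a.val * b.dz)
    a, b = as_general(a), as_general(b)
    return GeneralDual(a.val * b.val,
                       a.dz * b.val + a.val * b.dz,
                       a.dzbar * b.val + a.val * b.dzbar)

def conj(d):
    # non-analytic primitive: always promotes, and swaps the Wirtinger pair
    d = as_general(d)
    return GeneralDual(d.val.conjugate(),
                       d.dzbar.conjugate(), d.dz.conjugate())

z0 = 1.0 + 2.0j
z = AnalyticDual(z0, 1.0)   # seed dz/dz = 1
w1 = mul(z, z)              # z^2: stays analytic, dz = 2*z0
w2 = mul(z, conj(z))        # |z|^2: general, dz = conj(z0), dzbar = z0
```

Since the type of each intermediate is determined by the types of its inputs, a compiler that specializes on types can in principle eliminate the general path entirely for fully analytic code.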
Cassette.jl could probably do that because it would build the whole computation graph, so it would be easy to just check if there's a function in the "not allowed" list. |
Note that Cassette won't provide an AD implementation, but the plan is to build new AD implementations on top of it.
For a forward-mode AD implementation, you don't want to build/store the graph. You could still do the check as part of normal call interception, however. Should be pretty simple to leverage dispatch to accomplish this in a way that the compiler can elide for non-…
This is why, IMO, if there is any degree of function-checking required for complex differentiation correctness, it shouldn't be exposed as a togglable option - it should be automatically enforced whenever ForwardDiff sees that it's working with …
So instead of being toggle-able, have a set of allowed analytic functions, and during call interception decide if …
Well, AFAICT, doing that dynamically would require some tricky runtime re-indexing that would probably be best to avoid. I was thinking the AD tool would do the safe slow thing by default, and users can toggle a flag which gives the AD tool license to optimize if it can prove that all intermediate functions are analytic (by checking a trait like you mentioned). So I guess I am fine with a toggle, as long as the tool is able to throw loud, descriptive errors when the toggle is misused 😛 |
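One hypothetical shape for the "loud, descriptive errors" behavior (none of these names are real ForwardDiff or Cassette API): in the opted-in analytic mode, each intercepted primitive is checked against an allow-list, and anything else raises a descriptive error instead of silently returning a wrong derivative:

```python
# Sketch of call interception with an analytic allow-list. The allow-list,
# error type, and intercept() wrapper are all invented for illustration.
import cmath

ANALYTIC_PRIMITIVES = {cmath.exp, cmath.sin, cmath.cos}

class NonAnalyticError(TypeError):
    pass

def intercept(prim, analytic_mode):
    def call(*args):
        if analytic_mode and prim not in ANALYTIC_PRIMITIVES:
            raise NonAnalyticError(
                f"{getattr(prim, '__name__', prim)} is not on the analytic "
                "allow-list; rerun without the analytic toggle for the safe path")
        return prim(*args)
    return call

ok = intercept(cmath.exp, analytic_mode=True)(0.0)  # allowed primitive
try:
    intercept(abs, analytic_mode=True)(1 + 1j)      # abs is not analytic
    raised = False
except NonAnalyticError:
    raised = True
```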
The more I think about it, the more it seems like the right way to proceed would be to first get real-to-real functions with complex intermediate computations working. This means treating complex numbers simply as structs of two reals. Probably that is not that hard to do in the current setup? Then AD would compute the full derivative (2n x 2n if from C^n to C^n), and it's just a matter of API to select the right subarray to expose for the special cases of analytic C^n to C^n maps (as a bonus, you can explicitly check whether the Cauchy-Riemann equations hold, and hence whether your function is analytic) and complex-to-real functions. That gets us reasonable code that people can use (within a factor of 2 of optimal), and then we can refine the algorithm for special cases. Anything else (specialized either to analytic C^n to C^n maps or to C^n to R functions) feels like premature optimization.
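This proposal can be sketched end-to-end in a few lines (Python, with finite differences standing in for AD; the function names are illustrative): compute the full 2n x 2n real Jacobian, test the Cauchy-Riemann block structure explicitly, and only then expose the n x n complex Jacobian.

```python
# Full real Jacobian of a C^n -> C^n map, plus a Cauchy-Riemann check.
import numpy as np

def full_real_jacobian(f, z, h=1e-7):
    n = z.size
    def g(v):  # map R^2n -> R^2n: (x, y) -> (Re f, Im f)
        w = f(v[:n] + 1j * v[n:])
        return np.concatenate([w.real, w.imag])
    v0 = np.concatenate([z.real, z.imag])
    J = np.empty((2 * n, 2 * n))
    for j in range(2 * n):
        e = np.zeros(2 * n); e[j] = h
        J[:, j] = (g(v0 + e) - g(v0 - e)) / (2 * h)
    return J

def complex_jacobian(f, z, tol=1e-5):
    n = z.size
    J = full_real_jacobian(f, z)
    A, B = J[:n, :n], J[:n, n:]   # d(Re f)/dx, d(Re f)/dy
    C, D = J[n:, :n], J[n:, n:]   # d(Im f)/dx, d(Im f)/dy
    # Cauchy-Riemann: A == D and B == -C; otherwise f is not analytic here
    if not (np.allclose(A, D, atol=tol) and np.allclose(B, -C, atol=tol)):
        raise ValueError("Cauchy-Riemann equations violated: f is not analytic here")
    return A + 1j * C

z0 = np.array([0.5 + 0.5j, 1.0 - 0.2j])
Jc = complex_jacobian(lambda z: z ** 2, z0)  # analytic: Jc = diag(2*z0)
```

Passing a non-analytic map such as z -> conj(z) would trip the Cauchy-Riemann check instead of returning a bogus complex Jacobian, which is the "bonus" mentioned above.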
Handled by JuliaLang/julia#36030 |
It would be nice if the package would support complex numbers. Probably it is difficult to check when the derivative exists, and what path to take on the complex plane to take the derivative. However, if you make a default derivative path (for example along the real axis) it should be good enough (I think) for DifferentialEquations.jl, where derivatives are used to calculate the Jacobian in implicit ODE integrators. (@ChrisRackauckas is using derivative! and jacobian! in DifferentialEquations.jl.) If it would be possible to indicate the curve on the complex plane where to take the derivative, that would also be awesome. Thanks.
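What a default derivative path "along the real axis" could mean concretely (an illustrative sketch, not the ForwardDiff API): perturb each input by a small real step and form the usual difference quotient, which recovers the complex Jacobian whenever f is analytic:

```python
# Complex Jacobian via real-axis perturbations, as an implicit ODE
# integrator might use it. Valid for analytic right-hand sides.
import numpy as np

def jacobian_real_axis(f, z, h=1e-7):
    n = z.size
    J = np.empty((n, n), dtype=complex)
    for j in range(n):
        e = np.zeros(n, dtype=complex); e[j] = h  # real step along axis j
        J[:, j] = (f(z + e) - f(z - e)) / (2 * h)
    return J

# Linear test system dz/dt = A z, whose Jacobian is just A.
A = np.array([[0.0 + 1.0j, 0.2], [0.0, -0.5 + 2.0j]])
J = jacobian_real_axis(lambda z: A @ z, np.array([1.0 + 0.0j, 2.0 - 1.0j]))
```

For a non-analytic right-hand side this would silently return the derivative along one particular direction, which is exactly the safety concern raised earlier in the thread.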