-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion for non-allocating Jacobian #120
Comments
Wow 0.018 ns is faster than lightning ! As Admiral Grace Hopper was fond of saying In one nanosecond light travels about 0.3 meters . The only thing I can guess that is that fast is CPU registry memory ? Maybe that is why the check for " if L > 32 " works ? But my CPU L1 memory cache is much wider than 32 , so hmmm ? And about "I had to use a generated function to hcat .. which apparently always allocates. Let me know if you know a better solution." Suggest Take a look at >> my example using RuntimeGeneratedFunction at this link >>
|
When julia> @btime finite_difference_jacobian(f, Ref($x)[])
12.866 ns (0 allocations: 0 bytes)
3×2 SMatrix{3, 2, Float64, 6} with indices SOneTo(3)×SOneTo(2):
0.0 1.01871
0.298806 0.0
1.0 -3.0 |
Thanks, benchmark updated! 0.018 ns did seem a little too good to be true, although I guess it isn't a bad sign per se when the compiler is able to optimize away the whole thing. I set the upper bound to L = 32 because during my initial prototyping the function seemed to start allocating when passing more than 32 arguments to hcat. But when testing the current implementation the largest non-allocating value seems to be L = 34, go figure. For L > 34 this implementation is still faster and allocates less than ForwardDiff's, but I guess you gotta stop somewhere. I don't know how many versions of the generated function one should allow. In fact, I'm not sure I understand how generated functions work at all. I was expecting it to compile a new method for each L, such that, for example, after calling the function on julia> methods(finite_difference_jacobian)
# 4 methods for generic function "finite_difference_jacobian":
[1] finite_difference_jacobian(f, x::SArray{Tuple{2},T,1,2} where T, fx) in Main at <filename>:15
[2] finite_difference_jacobian(f, x::SArray{Tuple{3},T,1,3} where T, fx) in Main at <filename>:15
[3] finite_difference_jacobian(f, x::SArray{Tuple{7},T,1,7} where T, fx) in Main at <filename>:15
[4] finite_difference_jacobian(f, x) in Main at <filename>:11 But that's not what happens, you only ever see this: julia> methods(finite_difference_jacobian)
# 2 methods for generic function "finite_difference_jacobian":
[1] finite_difference_jacobian(f, x::SArray{Tuple{L},T,1,L} where T, fx) where L in Main at <filename>:15
[2] finite_difference_jacobian(f, x) in Main at <filename>:11 |
It only shows you that, but it's still specializing on each one. This is a good idea and we should include such a specialization. |
This is great! Do you mind to make PR and add some tests? |
I'll get to it eventually, but it doesn't look like it's gonna be a straight copy&paste and it's gonna be a couple of days before I'm able to sit down and familiarize myself with the existing code. If anyone else wants to take a stab in the meantime, feel free---that's why I wanted to share the example right away. |
Hi Daniel @danielwe I'll see if I get a chance to take a stab at it. |
I'm aware of #104 which was closed by #113, but Jacobians still allocate when using StaticArrays, even with a preallocated cache. Here's a barebones solution for a cache-free non-allocating Jacobian:
Output:
Could a specialization along these lines for small SVector problems be relevant for ForwardDiff? Unfortunately, I don't have the time to dig into the codebase and make a PR right now.
I had to use a generated function to
hcat
an arbitrary number of columns without splatting a generator, which apparently always allocates. Let me know if you know a better solution.Edit: Included an
ABSSTEP
.Edit 2: Fixed benchmark that was optimized away.
The text was updated successfully, but these errors were encountered: