-
Notifications
You must be signed in to change notification settings - Fork 849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increased performance of the discrete adjoint solver by using Templates for Linear Solvers #653
Increased performance of the discrete adjoint solver by using Templates for Linear Solvers #653
Conversation
…of CJacobiTransposedPreconditioner (not implemented)
@pcarruscag Thanks for taking a lead on that and picking that up! I wanted to do that for a very long time, but never had enough time 👍 I will have a look into it soon. |
This has been sitting here for way too long; my apologies, @pcarruscag . Thanks for taking the lead and making the effort. The changes look good to me, but I'm no expert in templating myself, so I'd rather have somebody else with more experience in the topic approving it. Any comments, @economon @talbring ? However, I think this should go in soon, so if there are no further comments I'll be merging in the changes by next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice work @pcarruscag - looks to be a clean and logical introduction of templating for the linear solvers. Code is still pretty easy to follow in the end.
I have made a few comments in the review about places where we will need to sync up this and PR #652.
Also, it may be time to consider some additional regression tests for the linear solvers in the different modes. Perhaps even some unit tests at some point. I do not expect that we have full coverage yet of all permutations of the linear solvers, preconditioners, smoothers, etc in both base an AD mode.
I will leave it to @talbring to comment if there are any more considerations relative to the initial work for templates in these classes.
Thanks guys. |
Looks like the commented code was added here (b5db893) but never activated. The Matrix* routines are not being used anywhere at the moment. Do you see some value in testing it out? Otherwise, might be best to remove so we aren't carrying around dead code. |
I see, I do not know what is the quickest way to invert a 5x5 matrix, most robust would probably be LU with pivoting (for which we have the code in the RBF class). Since that relates to how we handle small dense matrices I would say it relates to #643 so it would be good if the community got to a conclusion there.
|
Sounds good to me. Then, I suggest we leave the comments for now, and you can come back to it when considering #643 further (or when some performance issues are considered) in a later PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@economon I fixed the conflicts without templating CGeometry, the MPI buffers stay in su2double and the translation is done by CSysMatrix in InitiateComms and CompleteComms. Please see if everything still looks ok, also see my comment below.
Thanks,
Pedro
…pe instead of CSysMatrix template parameter
W.r.t. the translation in IntiateComms() and CompleteComms(): appears to be a pretty straightforward typecasting to take care of the templating. LGTM. |
Thank you, I am putting this to rest now then. |
Proposed Changes
This is the continuation to #650, so please compare these changes to the ones therein.
The linear systems solved in the discrete adjoint are now in passivedouble which speeds the discrete adjoint by a factor of about 1.5 (less for low CFL, more for stiff Jacobians like those in structural FEM). This is possible because we only carry the derivatives of the residual.
I tried to keep the design simple. CSysSolve and all related classes (vector, matrix, preconditioner, and matrix-vector product) have a single template parameter, the data type (which can be passivedouble or su2double). There are no provisions for "mixed" arithmetic, except in CSysSolve where through ::Solve (and only through ::Solve) one can ask for the solution of a system with passive Jacobian and active RHS and solution (this is possible at the expense of temporaries that are allocated only once). Passive-Passive and Active-Active work without temporaries and Active-Passive is not supported as it does not make sense (see end of previous paragraph).
This is to keep the need for template specialization to a minimum. Wherever mixing types was necessary small helper methods were defined to provide the compatibility instead of specializing larger methods, I think this keeps the code readable.
The place where passive and active types mix the most is CSysMatrix. This happens because the blocks are prepared by the numerics in **su2double and are then "Set", "Add", or "Subtract", on a CSysMatrix. The solution was to inline those routines and template them also for the data type of the block (or diagonal).
I only tested on one fluid adjoint and one FSI adjoint case (fingers cross not to fail too many tests).
Related Work
#594 Does not help with memory much but helps with speed.
#648 Makes it easier to interface with an external solver and still work with the discrete adjoint.
#650 Builds on top of what is proposed there.
#543 These MKL optimizations will now be possible for the discrete adjoint but I have not made them available yet.
Branch feature_template_linear_solver
PR Checklist