Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increased performance of the discrete adjoint solver by using Templates for Linear Solvers #653

Merged
merged 23 commits into from
May 1, 2019

Conversation

pcarruscag
Copy link
Member

Proposed Changes

This is the continuation to #650, so please compare these changes to the ones therein.

The linear systems solved in the discrete adjoint are now in passivedouble which speeds the discrete adjoint by a factor of about 1.5 (less for low CFL, more for stiff Jacobians like those in structural FEM). This is possible because we only carry the derivatives of the residual.

I tried to keep the design simple. CSysSolve and all related classes (vector, matrix, preconditioner, and matrix-vector product) have a single template parameter, the data type (which can be passivedouble or su2double). There are no provisions for "mixed" arithmetic, except in CSysSolve where through ::Solve (and only through ::Solve) one can ask for the solution of a system with passive Jacobian and active RHS and solution (this is possible at the expense of temporaries that are allocated only once). Passive-Passive and Active-Active work without temporaries and Active-Passive is not supported as it does not make sense (see end of previous paragraph).

This is to keep the need for template specialization to a minimum. Wherever mixing types was necessary small helper methods were defined to provide the compatibility instead of specializing larger methods, I think this keeps the code readable.

The place where passive and active types mix the most is CSysMatrix. This happens because the blocks are prepared by the numerics in **su2double and are then "Set", "Add", or "Subtract", on a CSysMatrix. The solution was to inline those routines and template them also for the data type of the block (or diagonal).

I only tested on one fluid adjoint and one FSI adjoint case (fingers cross not to fail too many tests).

Related Work

#594 Does not help with memory much but helps with speed.
#648 Makes it easier to interface with an external solver and still work with the discrete adjoint.
#650 Builds on top of what is proposed there.
#543 These MKL optimizations will now be possible for the discrete adjoint but I have not made them available yet.
Branch feature_template_linear_solver

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.

@talbring
Copy link
Member

@pcarruscag Thanks for taking a lead on that and picking that up! I wanted to do that for a very long time, but never had enough time 👍 I will have a look into it soon.

@pcarruscag pcarruscag mentioned this pull request Apr 4, 2019
4 tasks
@rsanfer rsanfer requested review from economon and talbring and removed request for economon April 16, 2019 06:59
@rsanfer
Copy link
Contributor

rsanfer commented Apr 16, 2019

This has been sitting here for way too long; my apologies, @pcarruscag . Thanks for taking the lead and making the effort. The changes look good to me, but I'm no expert in templating myself, so I'd rather have somebody else with more experience in the topic approving it. Any comments, @economon @talbring ? However, I think this should go in soon, so if there are no further comments I'll be merging in the changes by next week.

Copy link
Member

@economon economon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice work @pcarruscag - looks to be a clean and logical introduction of templating for the linear solvers. Code is still pretty easy to follow in the end.

I have made a few comments in the review about places where we will need to sync up this and PR #652.

Also, it may be time to consider some additional regression tests for the linear solvers in the different modes. Perhaps even some unit tests at some point. I do not expect that we have full coverage yet of all permutations of the linear solvers, preconditioners, smoothers, etc in both base an AD mode.

I will leave it to @talbring to comment if there are any more considerations relative to the initial work for templates in these classes.

Common/src/linear_solvers_structure.cpp Outdated Show resolved Hide resolved
Common/src/linear_solvers_structure.cpp Outdated Show resolved Hide resolved
Common/src/matrix_structure.cpp Show resolved Hide resolved
Common/src/matrix_structure.cpp Show resolved Hide resolved
Common/src/matrix_structure.cpp Show resolved Hide resolved
@pcarruscag
Copy link
Member Author

Thanks guys.
@economon, regarding the comments, I am happy to remove them if that is ok with who put them there.
Regarding the conflicts I noticed them when first reading through 652, the first two you mention are easy to solve, the one about moving the comms will probably require templating the associated method, I can help with the merge.
Cheers,
Pedro

@economon
Copy link
Member

Looks like the commented code was added here (b5db893) but never activated. The Matrix* routines are not being used anywhere at the moment. Do you see some value in testing it out? Otherwise, might be best to remove so we aren't carrying around dead code.

@pcarruscag
Copy link
Member Author

I see, I do not know what is the quickest way to invert a 5x5 matrix, most robust would probably be LU with pivoting (for which we have the code in the RBF class). Since that relates to how we handle small dense matrices I would say it relates to #643 so it would be good if the community got to a conclusion there.
In any case I want this PR to be only about templating, I can do that kind of cleanup when I:

@economon
Copy link
Member

Sounds good to me. Then, I suggest we leave the comments for now, and you can come back to it when considering #643 further (or when some performance issues are considered) in a later PR.

Copy link
Member Author

@pcarruscag pcarruscag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@economon I fixed the conflicts without templating CGeometry, the MPI buffers stay in su2double and the translation is done by CSysMatrix in InitiateComms and CompleteComms. Please see if everything still looks ok, also see my comment below.
Thanks,
Pedro

Common/src/linear_solvers_structure.cpp Outdated Show resolved Hide resolved
@economon
Copy link
Member

W.r.t. the translation in IntiateComms() and CompleteComms(): appears to be a pretty straightforward typecasting to take care of the templating. LGTM.

@pcarruscag
Copy link
Member Author

Thank you, I am putting this to rest now then.

@pcarruscag pcarruscag merged commit 28e634a into su2code:develop May 1, 2019
@talbring talbring changed the title Templated Linear Solvers Templated Linear Solvers to increase performance of discrete adjoint solver Nov 8, 2019
@pr-triage pr-triage bot added the PR: merged label Nov 8, 2019
@talbring talbring changed the title Templated Linear Solvers to increase performance of discrete adjoint solver Increase performance of discrete adjoint solver by using Templates for Linear Solvers Nov 8, 2019
@talbring talbring changed the title Increase performance of discrete adjoint solver by using Templates for Linear Solvers Increased performance of the discrete adjoint solver by using Templates for Linear Solvers Nov 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants