Hybrid Parallel AD (Part 1/?) #1214

jblueh · 2021-03-01T17:38:25Z

Proposed Changes

The goal of this PR is to establish AD support for the OpenMP features of SU2.

AD support in SU2 is provided by CoDiPack, which is coupled with MeDiPack for the differentiation of MPI. OpenMP support is added by additionally coupling CoDiPack with OpDiLib, so that, all in all, hybrid parallel AD is achieved.

This involves at least the following steps.

Incorporate OpDiLib into the AD workflow.
Establish thread-safety and parallelization of the discrete adjoint code.
Testing of hybrid parallel adjoints.
Performance optimizations.
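
To illustrate why thread safety needs separate attention in the adjoint code: a value that several threads only read during the forward computation receives adjoint increments from all of them during the reverse sweep, so shared reads turn into shared writes. The following is a minimal hand-written sketch of that effect, not how CoDiPack or OpDiLib implement it; the function name and the example computation are hypothetical. Without OpenMP enabled, the pragmas are ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Forward computation (not shown): y[i] = a * x[i], every thread reads a.
// Reverse sweep: aBar += x[i] * yBar[i], every thread writes to aBar.
// Hypothetical helper for illustration only.
inline double ReverseSweepSharedInput(const std::vector<double>& x,
                                      const std::vector<double>& yBar) {
  assert(x.size() == yBar.size());
  double aBar = 0.0;
  const long n = static_cast<long>(x.size());

  #pragma omp parallel for
  for (long i = 0; i < n; ++i) {
    const double contribution = x[i] * yBar[i];
    // Without the atomic update, concurrent increments of aBar would race.
    #pragma omp atomic
    aBar += contribution;
  }
  return aBar;
}
```

Calling `ReverseSweepSharedInput({1, 2, 3, 4}, {1, 1, 1, 1})` accumulates the adjoint of the shared input from all loop iterations. Atomic updates are only one way to resolve such races; reductions or thread-local adjoint buffers are alternatives with different performance trade-offs.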

Related Work

OpenMP features introduced in #789, #824, #830, #834, #843, #861, #895, #975, #1178, possibly others

PR Checklist

I am submitting my contribution to the develop branch.
My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags, or simply --warnlevel=2 when using meson).
My contribution is commented and consistent with SU2 style.
I have added a test case that demonstrates my contribution, if necessary.
I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp), if necessary.

jblueh · 2021-03-01T18:15:50Z

The commits so far address parts of OpDiLib's incorporation into the AD workflow and sort out some additional issues along the lines of the disc_adj_fsi/Airfoild2d testcase. They provide a testing environment that is stable with respect to OpDiLib and can already be used for figuring out some of the more SU2 specific issues. However, they require that SU2 is built with a compiler with OMPT support, for example an up-to-date version of clang++, hence the failing CI builds.

pcarruscag · 2021-03-01T19:34:12Z

It looks like the builds are failing because of swig (python wrapper) and a missing definition of size_t.
What does OMPT stand for? OpenMP threads? (Google was not helpful)

jblueh · 2021-03-01T20:11:27Z

True that, I built it without the python wrapper locally. This build command worked for me.

./meson.py omptestenv --buildtype=debug --warnlevel=2 -Denable-autodiff=true -Denable-directdiff=true -Dwith-omp=true

The "T" in OMPT stands for "tools", see Chapter 4 of the OpenMP 5.1 specification. We also explain it with a focus on AD in our preprint.

pcarruscag · 2021-03-01T20:19:35Z

I see, then it should be possible to detect the version of the standard supported by the compiler and only enable the feature in that case. We do that for simd directives for compatibility (hopefully) with the MS compilers.
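
A compile-time check along these lines could look like the sketch below. It relies on the standard `_OPENMP` macro, which OpenMP-enabled compilers define as the release date (yyyymm) of the specification they support; 201811 corresponds to OpenMP 5.0, the version that introduced OMPT. The macro `HYPOTHETICAL_HAVE_OMPT` and the helper function are made up for illustration and are not the names used in SU2. Note that the macro only reflects the specification level the compiler reports, which does not guarantee that the runtime actually implements OMPT, so a build-system test that compiles and runs a small OMPT program may be more robust.

```cpp
#include <string>

// Specification level reported by the compiler, or 0 without OpenMP.
#if defined(_OPENMP)
constexpr long kOpenMPVersion = _OPENMP;
#else
constexpr long kOpenMPVersion = 0;
#endif

// Hypothetical feature switch (not an actual SU2 macro):
// enable OMPT-dependent code only from OpenMP 5.0 (201811) onwards.
#if defined(_OPENMP) && (_OPENMP >= 201811)
#define HYPOTHETICAL_HAVE_OMPT 1
#else
#define HYPOTHETICAL_HAVE_OMPT 0
#endif

// Human-readable summary of what was detected at compile time.
inline std::string DescribeOmptSupport() {
  if (kOpenMPVersion == 0) return "no OpenMP";
  return HYPOTHETICAL_HAVE_OMPT ? "OpenMP 5.0 or newer (OMPT expected)"
                                : "OpenMP older than 5.0 (no OMPT)";
}
```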

pcarruscag

I'll merge this soon if there are no objections.

pcarruscag · 2021-04-06T11:13:06Z

SU2_CFD/src/output/filewriter/CParallelDataSorter.cpp

-  /*--- Note, passiveDoubleBuffer and doubleBuffer point to the same address.
-   * This is the reason why we have to do the following copy/reordering in two steps. ---*/
+  /*--- Reorder the data in the buffer. ---*/

-  /*--- Step 1: Extract the underlying double value --- */
+  vector<passivedouble> tmpBuffer(nPoint_Recv[size]);

-  if (!std::is_same<su2double, passivedouble>::value){
-    for (int jj = 0; jj < VARS_PER_POINT*nPoint_Recv[size]; jj++){
-      const passivedouble tmpVal = SU2_TYPE::GetValue(doubleBuffer[jj]);
-      passiveDoubleBuffer[jj] = tmpVal;
-      /*--- For some AD datatypes a call of the destructor is
-       *  necessary to properly delete the AD type ---*/
-      doubleBuffer[jj].~su2double();
-    }
-  }
-
-  /*--- Step 2: Reorder the data in the buffer --- */


These last changes fix issues that arise when RealReverseIndex is used.
We used to have a char buffer that was shared by passive and active doubles, which required these explicit destructor calls, but I guess that did not work so well with MPI and all.
Now there is only passivedouble, which means that AD information is lost once something goes into the data sorter.
This would matter if some computation were performed with time-averaged values, which does not seem to be the case at the moment...
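
For context on the removed destructor calls, here is a small sketch of why objects constructed in a raw buffer need explicit destruction when the type manages a resource. `ToyIndexedActive` is a hypothetical stand-in that counts live "indices"; it is not the CoDiPack type, and unlike the removed code the sketch copies into a separate passive buffer instead of reusing the same memory.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Toy stand-in for an index-managed AD type (hypothetical, not CoDiPack):
// every live object holds one index, which its destructor releases.
struct ToyIndexedActive {
  static int liveIndices;
  double value;
  explicit ToyIndexedActive(double v) : value(v) { ++liveIndices; }
  ~ToyIndexedActive() { --liveIndices; }
};
int ToyIndexedActive::liveIndices = 0;

// Construct active objects in raw storage, copy their values into a
// separate passive buffer, and destroy each active object explicitly.
inline std::vector<double> ExtractPassive(const std::vector<double>& input) {
  const std::size_t n = input.size();
  void* raw = ::operator new(n * sizeof(ToyIndexedActive));
  auto* active = static_cast<ToyIndexedActive*>(raw);
  for (std::size_t i = 0; i < n; ++i) new (active + i) ToyIndexedActive(input[i]);

  std::vector<double> passive(n);
  for (std::size_t i = 0; i < n; ++i) {
    passive[i] = active[i].value;
    // Raw storage never runs destructors on its own; skipping this call
    // would leak the index held by the object.
    active[i].~ToyIndexedActive();
  }
  ::operator delete(raw);
  return passive;
}
```

After `ExtractPassive` returns, `ToyIndexedActive::liveIndices` is back at zero. Storing only passive values from the start, as this PR does, avoids the manual lifetime management altogether.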

pcarruscag · 2021-04-06T11:14:11Z

SU2_CFD/include/output/filewriter/CParallelDataSorter.hpp

  void SetUnsorted_Data(unsigned long iPoint, unsigned short iField, su2double data){
-    connSend[Index[iPoint] + iField] = data;
+    connSend[Index[iPoint] + iField] = SU2_TYPE::GetValue(data);
  }

-  su2double GetUnsorted_Data(unsigned long iPoint, unsigned short iField) const {
+  passivedouble GetUnsorted_Data(unsigned long iPoint, unsigned short iField) const {
    return connSend[Index[iPoint] + iField];
  }


This is where the active-passive conversion takes place.
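
The effect of such a conversion can be sketched with a toy forward-mode type. Everything below is hypothetical (`ToyActive`, `ToyPassiveSorter` and the free function `GetValue` only mimic the roles of `su2double`, the data sorter and `SU2_TYPE::GetValue`): the value survives the round trip through the sorter, while the derivative information does not.

```cpp
#include <cstddef>
#include <vector>

// Toy forward-mode active type (hypothetical): a value plus one derivative.
struct ToyActive {
  double value;
  double derivative;
};

// Product rule for the toy type.
inline ToyActive operator*(const ToyActive& a, const ToyActive& b) {
  return {a.value * b.value, a.derivative * b.value + a.value * b.derivative};
}

// Strips the derivative, similar in spirit to an active-to-passive conversion.
inline double GetValue(const ToyActive& a) { return a.value; }

// Hypothetical sorter that, like the changed interface, stores passive values only.
class ToyPassiveSorter {
 public:
  explicit ToyPassiveSorter(std::size_t n) : data_(n, 0.0) {}
  void SetUnsortedData(std::size_t i, const ToyActive& v) { data_[i] = GetValue(v); }
  double GetUnsortedData(std::size_t i) const { return data_[i]; }

 private:
  std::vector<double> data_;
};
```

For example, with `ToyActive x{3.0, 1.0}` the product `x * x` carries the derivative 6, but what comes back from `GetUnsortedData` is just the plain value 9.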

pcarruscag · 2021-04-06T11:15:03Z

SU2_CFD/include/output/filewriter/CSurfaceFEMDataSorter.hpp

-  CFEMDataSorter* volumeSorter;                  //!< Pointer to the volume sorter instance
+  const CFEMDataSorter* volumeSorter;            //!< Pointer to the volume sorter instance


also the usual cleanups

lgtm-com · 2021-04-06T12:11:34Z

This pull request fixes 8 alerts when merging 45cc9a5 into e823be7 - view on LGTM.com

fixed alerts:

5 for Comparison of narrow type with wide type in loop condition
3 for Resource not released in destructor

TobiKattmann · 2021-04-07T15:08:33Z

SU2_CFD/src/iteration/CDiscAdjFluidIteration.cpp

-  if (solver[iZone][iInst][MESH_0][ADJFEA_SOL]) {
-    solver[iZone][iInst][MESH_0][ADJFEA_SOL]->SetRecording(geometry[iZone][iInst][MESH_0], config[iZone]);
-  }


I thought I had only dreamed of removing this bit of code when I scrolled through #1257 and it was gone (because it had already been deleted here)... That conditional evaluated to true, by the way, and it does not crash for non-FEA cases. Just C++ things.

jblueh added 10 commits February 28, 2021 20:02

Add OpDiLib submodule.

5fd72ca

Update meson script.

679e979

Update to thread-safe version of CoDiPack.

b4650ba

Add parallel AD type.

caa1542

Add OpDiLib bindings.

d153a00

Update AD interface.

d9ce155

Linear algebra updates.

c9ac197

Zero-initialize memory.

5074ee3

Fix CDiscAdjFEAIteration dependencies.

33437ce

Disable preaccumulation for OpenMP.

5735c0e

jblueh added 2 commits March 2, 2021 19:04

Fix python wrapper builds.

4a820f7

Fix missing definition of size_t.

a26e2be

pcarruscag added the changelog:feature label Mar 2, 2021

jblueh added 2 commits March 3, 2021 11:56

Check OMPT support.

94ac52e

Merge branch 'develop' into hybrid_parallel_ad

ce4a3bc

maxaehle mentioned this pull request Mar 4, 2021

Removed CSolver::Convective_Residual #1222

Merged

5 tasks

CoDiPack update.

7bbb9cd

pcarruscag mentioned this pull request Mar 11, 2021

Linear solver changes to support hybrid parallel AD #1228

Merged

5 tasks

jblueh added 7 commits March 11, 2021 22:21

OpDiLib update.

cfb7285

CoDiPack update.

8fc0941

Enable OpDiLib macro backend.

e04f931

Update SU2_OMP macros and introduce END macros.

1351c79

Update specialized macros.

6bf97a2

Update macros throughout the code.

aeaf251

Introduce END macros throughout the code.

5cea386

su2code deleted a comment from lgtm-com bot Mar 28, 2021

su2code deleted a comment from jblueh Mar 28, 2021

su2code deleted a comment from lgtm-com bot Mar 28, 2021

pcarruscag added 7 commits March 29, 2021 19:33

missing destruction in CSysVector

e5e3ebc

no type punning in COutput...

6483a3f

missing include

083f0b7

fix unused warning

c3a62d3

double free

92406ed

why is everything a pointer ffs...

73a575b

enough testing for now, revert RealReverseIndex to RealReverse

3870382

su2code deleted a comment from lgtm-com bot Apr 6, 2021

pcarruscag reviewed Apr 6, 2021

Merge branch 'develop' into hybrid_parallel_ad

45cc9a5

pcarruscag approved these changes Apr 6, 2021

su2code deleted a comment from lgtm-com bot Apr 6, 2021

pcarruscag merged commit 3b20982 into develop Apr 7, 2021

pcarruscag deleted the hybrid_parallel_ad branch April 7, 2021 12:57

pr-triage bot added PR: merged and removed PR: unreviewed labels Apr 7, 2021

TobiKattmann reviewed Apr 7, 2021

pr-triage bot added PR: unreviewed and removed PR: merged labels Apr 7, 2021

pcarruscag mentioned this pull request May 9, 2021

Hybrid Parallel AD (Part 2/?) #1284

Merged

5 tasks

jblueh mentioned this pull request May 22, 2023

Hybrid Parallel AD Performance Improvements #2039

Merged

5 tasks

jblueh mentioned this pull request Feb 13, 2024

Add further parallel regions #2208

Merged

6 tasks
