Hybrid Parallel AD (Part 1/?) #1214

Merged
merged 66 commits into from
Apr 7, 2021
Changes from 65 commits
5fd72ca
Add OpDiLib submodule.
jblueh Feb 28, 2021
679e979
Update meson script.
jblueh Feb 28, 2021
b4650ba
Update to thread-safe version of CoDiPack.
jblueh Feb 28, 2021
caa1542
Add parallel AD type.
jblueh Feb 28, 2021
d153a00
Add OpDiLib bindings.
jblueh Feb 28, 2021
d9ce155
Update AD interface.
jblueh Mar 1, 2021
c9ac197
Linear algebra updates.
jblueh Mar 1, 2021
5074ee3
Zero-initialize memory.
jblueh Mar 1, 2021
33437ce
Fix CDiscAdjFEAIteration dependencies.
jblueh Mar 1, 2021
5735c0e
Disable preaccumulation for OpenMP.
jblueh Mar 1, 2021
4a820f7
Fix python wrapper builds.
jblueh Mar 2, 2021
a26e2be
Fix missing definition of size_t.
jblueh Mar 2, 2021
94ac52e
Check OMPT support.
jblueh Mar 3, 2021
ce4a3bc
Merge branch 'develop' into hybrid_parallel_ad
jblueh Mar 3, 2021
7bbb9cd
CoDiPack update.
jblueh Mar 8, 2021
cfb7285
OpDiLib update.
jblueh Mar 11, 2021
8fc0941
CoDiPack update.
jblueh Mar 11, 2021
e04f931
Enable OpDiLib macro backend.
jblueh Mar 11, 2021
1351c79
Update SU2_OMP macros and introduce END macros.
jblueh Mar 11, 2021
6bf97a2
Update specialized macros.
jblueh Mar 11, 2021
aeaf251
Update macros throughout the code.
jblueh Mar 11, 2021
5cea386
Introduce END macros throughout the code.
jblueh Mar 11, 2021
a440085
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_ad
pcarruscag Mar 15, 2021
223c10d
Recover CoDiPack version.
jblueh Mar 17, 2021
6775b29
OpDiLib update.
jblueh Mar 17, 2021
6aaebca
Add syntax file.
jblueh Mar 17, 2021
ce44cac
Fix missing END macros.
jblueh Mar 17, 2021
ef7ad26
Merge branch 'linsol_fixes' into hybrid_parallel_ad
pcarruscag Mar 17, 2021
f093b35
move MASTER out of ExtFunc functions, parallel copy in CSysSolve_b
pcarruscag Mar 17, 2021
e174bac
move master into some solver methods
pcarruscag Mar 17, 2021
2182622
try to have less "end master"
pcarruscag Mar 17, 2021
f71b9ec
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_ad
pcarruscag Mar 17, 2021
94dafb4
omp directives in DiscAdjSolver
pcarruscag Mar 19, 2021
8eb3094
fixes
pcarruscag Mar 19, 2021
66d51df
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_ad
pcarruscag Mar 19, 2021
a7fbcd6
more year updates
pcarruscag Mar 19, 2021
2776775
dead code
pcarruscag Mar 19, 2021
74f20c4
more cleanup
pcarruscag Mar 19, 2021
63003ee
val_ never made anything better, parallel dependencies, fix adjoint r…
pcarruscag Mar 19, 2021
b329cb6
more parallel, fix SensGeo output
pcarruscag Mar 19, 2021
02c9c8e
mesh solver, plus some cleanup
pcarruscag Mar 19, 2021
ac5c581
no include of cpp
pcarruscag Mar 20, 2021
3527c28
fix bug from nested parallel region
pcarruscag Mar 22, 2021
b7d3a8e
Merge branch 'develop' into hybrid_parallel_ad
pcarruscag Mar 22, 2021
9efa995
less boilerplate, more boilerplate, fix merge, try to fix failed regr…
pcarruscag Mar 23, 2021
83b032b
prepare CDiscAdjFEASolver
pcarruscag Mar 23, 2021
7465871
simplify
pcarruscag Mar 23, 2021
ecb64d0
Allow OpDiLib backend choice.
jblueh Mar 23, 2021
8b4a89c
OpDiLib update.
jblueh Mar 25, 2021
165a52b
Add AD build tests.
jblueh Mar 25, 2021
f63286b
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_ad
pcarruscag Mar 25, 2021
60792dc
Disable normal builds in AD builds tests.
jblueh Mar 25, 2021
3b0854b
add syntax check to meson for OpenMP+AD builds
pcarruscag Mar 25, 2021
67bdd31
Merge branch 'hybrid_parallel_ad' of https://github.com/su2code/SU2 i…
pcarruscag Mar 25, 2021
4738e29
OpDiLib update.
jblueh Mar 25, 2021
7e0bc67
Fix include.
jblueh Mar 26, 2021
3e82662
explicit construction and destruction of non trivial types in C2DCont…
pcarruscag Mar 26, 2021
c82f3c7
test preaccumulation with RealReverseIndex
pcarruscag Mar 28, 2021
e5e3ebc
missing destruction in CSysVector
pcarruscag Mar 29, 2021
6483a3f
no type punning in COutput...
pcarruscag Mar 29, 2021
083f0b7
missing include
pcarruscag Mar 29, 2021
c3a62d3
fix unused warning
pcarruscag Mar 29, 2021
92406ed
double free
pcarruscag Mar 29, 2021
73a575b
why is everything a pointer ffs...
pcarruscag Mar 30, 2021
3870382
enough testing for now, revert RealReverseIndex to RealReverse
pcarruscag Mar 30, 2021
45cc9a5
Merge branch 'develop' into hybrid_parallel_ad
pcarruscag Apr 6, 2021
6 changes: 5 additions & 1 deletion .github/workflows/regression.yml
@@ -16,7 +16,7 @@ jobs:
strategy:
fail-fast: false
matrix:
config_set: [BaseMPI, ReverseMPI, ForwardMPI, BaseNoMPI, ReverseNoMPI, ForwardNoMPI, BaseOMP]
config_set: [BaseMPI, ReverseMPI, ForwardMPI, BaseNoMPI, ReverseNoMPI, ForwardNoMPI, BaseOMP, ReverseOMP, ForwardOMP]
include:
- config_set: BaseMPI
flags: '-Denable-pywrapper=true -Denable-tests=true --warnlevel=3 --werror'
@@ -32,6 +32,10 @@ jobs:
flags: '-Denable-directdiff=true -Denable-normal=false -Dwith-mpi=disabled -Denable-tests=true --warnlevel=3 --werror'
- config_set: BaseOMP
flags: '-Dwith-omp=true -Denable-mixedprec=true -Denable-tecio=false --warnlevel=3 --werror'
- config_set: ReverseOMP
flags: '-Denable-autodiff=true -Denable-normal=false -Dwith-omp=true -Denable-mixedprec=true -Denable-tecio=false --warnlevel=3 --werror'
- config_set: ForwardOMP
flags: '-Denable-directdiff=true -Denable-normal=false -Dwith-omp=true -Denable-mixedprec=true -Denable-tecio=false --warnlevel=3 --werror'
runs-on: ubuntu-latest
steps:
- name: Cache Object Files
3 changes: 3 additions & 0 deletions .gitmodules
@@ -15,3 +15,6 @@
[submodule "subprojects/Mutationpp"]
path = subprojects/Mutationpp
url = https://github.com/mutationpp/Mutationpp.git
[submodule "externals/opdi"]
path = externals/opdi
url = https://github.com/SciCompKL/OpDiLib
104 changes: 66 additions & 38 deletions Common/include/basic_types/ad_structure.hpp
@@ -1,7 +1,7 @@
/*!
* \file ad_structure.hpp
* \brief Main routines for the algorithmic differentiation (AD) structure.
* \author T. Albring
* \author T. Albring, J. Blühdorn
* \version 7.1.1 "Blackbird"
*
* SU2 Project Website: https://su2code.github.io
@@ -27,7 +27,8 @@

#pragma once

#include "datatype_structure.hpp"
#include "../code_config.hpp"
#include "../parallelization/omp_structure.hpp"

/*!
* \namespace AD
@@ -278,62 +279,92 @@ namespace AD{

extern int adjointVectorPosition;

/*--- Reference to the tape ---*/

extern su2double::TapeType& globalTape;

extern bool Status;

extern bool PreaccActive;

extern bool PreaccEnabled;

extern su2double::TapeType::Position StartPosition, EndPosition;
#ifdef HAVE_OPDI
using CoDiTapePosition = su2double::TapeType::Position;
using OpDiState = void*;
using TapePosition = std::pair<CoDiTapePosition, OpDiState>;
#else
using TapePosition = su2double::TapeType::Position;
#endif
Comment on lines +288 to +294
Contributor Author

Positions in the AD recording are now identified by the CoDiPack tape position together with the corresponding OpDiLib state.
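
A minimal sketch of this pairing, using stand-in types rather than the real CoDiPack and OpDiLib ones. `SketchCoDiPosition`, `SketchOpDiState`, and `SketchRecorder` are hypothetical names introduced only for illustration; the real position type is `su2double::TapeType::Position` and the real state is obtained from `opdi::logic->exportState()`. The point is that every stored position carries an exported state that has to be freed again on reset.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the CoDiPack position and the opaque OpDiLib state.
using SketchCoDiPosition = int;
using SketchOpDiState = void*;
using SketchTapePosition = std::pair<SketchCoDiPosition, SketchOpDiState>;

// Hypothetical recorder that mimics Push_TapePosition() and Reset() from the diff.
struct SketchRecorder {
  int tapeCursor = 0;   // stands in for the current tape position
  int liveStates = 0;   // number of exported states not yet freed
  std::vector<SketchTapePosition> positions;

  // Exporting a state allocates something the caller must free later.
  SketchOpDiState exportState() {
    ++liveStates;
    return new int(tapeCursor);
  }

  void freeState(SketchOpDiState state) {
    delete static_cast<int*>(state);
    --liveStates;
  }

  void record(int numStatements) { tapeCursor += numStatements; }

  // A position is the tape position together with the exported state.
  void pushPosition() { positions.push_back({tapeCursor, exportState()}); }

  // Free every exported state before dropping the stored positions.
  void reset() {
    for (SketchTapePosition& pos : positions) freeState(pos.second);
    positions.clear();
    tapeCursor = 0;
  }
};
```

Calling `pushPosition()` twice and then `reset()` should leave `liveStates` at zero, which is the property the `freeState` loop in `AD::Reset()` appears to be there to preserve.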


extern TapePosition StartPosition, EndPosition;

extern std::vector<su2double::TapeType::Position> TapePositions;
extern std::vector<TapePosition> TapePositions;

extern std::vector<su2double::GradientData> localInputValues;

extern std::vector<su2double*> localOutputValues;

extern codi::PreaccumulationHelper<su2double> PreaccHelper;

/*--- Reference to the tape. ---*/

FORCEINLINE su2double::TapeType& getGlobalTape() {
return su2double::getGlobalTape();
}

Comment on lines +306 to +311
Contributor Author

Tapes may change during runtime and different threads use different tapes. Hence, each reference to the tape must be resolved dynamically via this call.
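
To illustrate why a cached global reference no longer works, here is a small sketch that assumes one `thread_local` tape per thread. `SketchTape`, `getSketchTape`, and `recordOnThreads` are hypothetical names, and the way CoDiPack and OpDiLib actually manage per-thread tapes may differ; the sketch only shows that an accessor function resolves to a different object on each thread.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical tape; stands in for su2double::TapeType.
struct SketchTape {
  bool active = false;
  int numStatements = 0;
};

// Resolved on every call, so each thread reaches its own tape.
// A reference captured once at startup would always point at a single tape.
inline SketchTape& getSketchTape() {
  thread_local SketchTape tape;
  return tape;
}

// Thread t records t+1 statements into its own tape and reports the count.
inline std::vector<int> recordOnThreads(int numThreads) {
  std::vector<int> counts(numThreads, 0);
  std::vector<std::thread> workers;
  for (int t = 0; t < numThreads; ++t) {
    workers.emplace_back([t, &counts] {
      getSketchTape().active = true;
      for (int i = 0; i <= t; ++i) ++getSketchTape().numStatements;
      counts[t] = getSketchTape().numStatements;  // each thread writes its own slot
    });
  }
  for (auto& w : workers) w.join();
  return counts;
}
```

After `recordOnThreads(3)` returns, the calling thread's own tape is still untouched, since none of the workers ever saw it.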

FORCEINLINE void RegisterInput(su2double &data, bool push_index = true) {
AD::globalTape.registerInput(data);
AD::getGlobalTape().registerInput(data);
if (push_index) {
inputValues.push_back(data.getGradientData());
}
}

FORCEINLINE void RegisterOutput(su2double& data) {AD::globalTape.registerOutput(data);}
FORCEINLINE void RegisterOutput(su2double& data) {AD::getGlobalTape().registerOutput(data);}

FORCEINLINE void ResetInput(su2double &data) {data.getGradientData() = su2double::GradientData();}

FORCEINLINE void StartRecording() {AD::globalTape.setActive();}
FORCEINLINE void StartRecording() {AD::getGlobalTape().setActive();}

FORCEINLINE void StopRecording() {AD::globalTape.setPassive();}
FORCEINLINE void StopRecording() {AD::getGlobalTape().setPassive();}

FORCEINLINE bool TapeActive() { return AD::globalTape.isActive(); }
FORCEINLINE bool TapeActive() { return AD::getGlobalTape().isActive(); }

FORCEINLINE void PrintStatistics() {AD::globalTape.printStatistics();}
FORCEINLINE void PrintStatistics() {AD::getGlobalTape().printStatistics();}

FORCEINLINE void ClearAdjoints() {AD::globalTape.clearAdjoints(); }
FORCEINLINE void ClearAdjoints() {AD::getGlobalTape().clearAdjoints(); }

FORCEINLINE void ComputeAdjoint() {AD::globalTape.evaluate(); adjointVectorPosition = 0;}
FORCEINLINE void ComputeAdjoint() {
#if defined(HAVE_OPDI)
opdi::logic->prepareEvaluate();
#endif
AD::getGlobalTape().evaluate();
adjointVectorPosition = 0;
}

FORCEINLINE void ComputeAdjoint(unsigned short enter, unsigned short leave) {
AD::globalTape.evaluate(TapePositions[enter], TapePositions[leave]);
#if defined(HAVE_OPDI)
opdi::logic->recoverState(TapePositions[enter].second);
opdi::logic->prepareEvaluate();
AD::getGlobalTape().evaluate(TapePositions[enter].first, TapePositions[leave].first);
#else
AD::getGlobalTape().evaluate(TapePositions[enter], TapePositions[leave]);
#endif
Comment on lines +342 to +348
Contributor Author

The AD workflow is extended by OpDiLib calls.
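
The call order in the partial evaluation above can be sketched with mock objects that simply log what is called. `FakeLogic`, `FakeTape`, `FakePosition`, and `computeAdjointRange` are hypothetical names; only the ordering (recover the state stored with the starting position, prepare the evaluation, then evaluate the tape range) is taken from the diff.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct CallLog {
  std::vector<std::string> calls;
};

// Hypothetical stand-in for the object behind opdi::logic.
struct FakeLogic {
  CallLog& log;
  void recoverState(void* /*state*/) { log.calls.push_back("recoverState"); }
  void prepareEvaluate() { log.calls.push_back("prepareEvaluate"); }
};

// Hypothetical stand-in for the tape returned by AD::getGlobalTape().
struct FakeTape {
  CallLog& log;
  void evaluate(int from, int to) {
    log.calls.push_back("evaluate " + std::to_string(from) + "->" + std::to_string(to));
  }
};

// Tape position paired with an opaque state, as in the HAVE_OPDI branch.
using FakePosition = std::pair<int, void*>;

// Mirrors the order used in ComputeAdjoint(enter, leave) with HAVE_OPDI defined.
inline void computeAdjointRange(FakeLogic& logic, FakeTape& tape,
                                const FakePosition& enter, const FakePosition& leave) {
  logic.recoverState(enter.second);         // state saved with the starting position
  logic.prepareEvaluate();
  tape.evaluate(enter.first, leave.first);  // only the tape halves go to the tape
}
```

Note that only the state attached to `enter` is recovered; the state attached to `leave` is not used during the evaluation in the diff either.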

if (leave == 0)
adjointVectorPosition = 0;
}

FORCEINLINE void Reset() {
globalTape.reset();
AD::getGlobalTape().reset();
#if defined(HAVE_OPDI)
opdi::logic->reset();
#endif
if (inputValues.size() != 0) {
adjointVectorPosition = 0;
inputValues.clear();
}
if (TapePositions.size() != 0) {
#if defined(HAVE_OPDI)
for (TapePosition& pos : TapePositions) {
opdi::logic->freeState(pos.second);
}
#endif
TapePositions.clear();
}
}
@@ -343,11 +374,11 @@ namespace AD{
}

FORCEINLINE void SetDerivative(int index, const double val) {
AD::globalTape.setGradient(index, val);
AD::getGlobalTape().setGradient(index, val);
}

FORCEINLINE double GetDerivative(int index) {
return AD::globalTape.getGradient(index);
return AD::getGlobalTape().getGradient(index);
}

/*--- Base case for parameter pack expansion. ---*/
@@ -361,6 +392,11 @@ namespace AD{
SetPreaccIn(moreData...);
}

template<class T, class... Ts, su2enable_if<std::is_same<T,su2double>::value> = 0>
FORCEINLINE void SetPreaccIn(T&& data, Ts&&... moreData) {
static_assert(!std::is_same<T,su2double>::value, "rvalues cannot be registered");
}

template<class T>
FORCEINLINE void SetPreaccIn(const T& data, const int size) {
if (PreaccActive) {
@@ -384,20 +420,8 @@ namespace AD{
}
}

template<class T>
FORCEINLINE void SetPreaccIn(const T& data, const int size_x, const int size_y, const int size_z) {
if (!PreaccActive) return;
for (int i = 0; i < size_x; i++) {
for (int j = 0; j < size_y; j++) {
for (int k = 0; k < size_z; k++) {
if (data[i][j][k].isActive()) PreaccHelper.addInput(data[i][j][k]);
}
}
}
}

FORCEINLINE void StartPreacc() {
if (globalTape.isActive() && PreaccEnabled) {
if (AD::getGlobalTape().isActive() && PreaccEnabled) {
PreaccHelper.start();
PreaccActive = true;
}
@@ -438,7 +462,11 @@ namespace AD{
}

FORCEINLINE void Push_TapePosition() {
TapePositions.push_back(AD::globalTape.getPosition());
#if defined(HAVE_OPDI)
TapePositions.push_back({AD::getGlobalTape().getPosition(), opdi::logic->exportState()});
#else
TapePositions.push_back(AD::getGlobalTape().getPosition());
#endif
}

FORCEINLINE void EndPreacc(){
@@ -478,15 +506,15 @@ namespace AD{
}

FORCEINLINE void SetExtFuncOut(su2double& data) {
if (globalTape.isActive()) {
if (AD::getGlobalTape().isActive()) {
FuncHelper->addOutput(data);
}
}

template<class T>
FORCEINLINE void SetExtFuncOut(T&& data, const int size) {
for (int i = 0; i < size; i++) {
if (globalTape.isActive()) {
if (AD::getGlobalTape().isActive()) {
FuncHelper->addOutput(data[i]);
}
}
@@ -496,7 +524,7 @@ namespace AD{
FORCEINLINE void SetExtFuncOut(T&& data, const int size_x, const int size_y) {
for (int i = 0; i < size_x; i++) {
for (int j = 0; j < size_y; j++) {
if (globalTape.isActive()) {
if (AD::getGlobalTape().isActive()) {
FuncHelper->addOutput(data[i][j]);
}
}
@@ -511,7 +539,7 @@ namespace AD{
FORCEINLINE void EndExtFunc() { delete FuncHelper; }

FORCEINLINE bool BeginPassive() {
if(AD::globalTape.isActive()) {
if(AD::getGlobalTape().isActive()) {
StopRecording();
return true;
}
83 changes: 3 additions & 80 deletions Common/include/basic_types/datatype_structure.hpp
@@ -30,87 +30,10 @@
#include <iostream>
#include <complex>
#include <cstdio>
#include <type_traits>

#if defined(_MSC_VER)
#define FORCEINLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER)
#define FORCEINLINE inline __attribute__((always_inline))
#else
#define FORCEINLINE inline
#endif

#if defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER)
#define NEVERINLINE inline __attribute__((noinline))
#else
#define NEVERINLINE inline
#endif

#if defined(__INTEL_COMPILER)
/*--- Disable warnings related to inline attributes. ---*/
#pragma warning disable 2196
#pragma warning disable 3415
/*--- Disable warnings related to overloaded virtual. ---*/
#pragma warning disable 654
#pragma warning disable 1125
#if defined(CODI_FORWARD_TYPE) || defined(CODI_REVERSE_TYPE)
#pragma warning disable 1875
#endif
#endif

/*--- Convenience SFINAE typedef to conditionally
* enable/disable function template overloads. ---*/
template<bool condition>
using su2enable_if = typename std::enable_if<condition,bool>::type;

/*--- Depending on the datatype defined during the configuration,
* include the correct definition, and create the main typedef. ---*/

#if defined(CODI_REVERSE_TYPE) // reverse mode AD
#include "codi.hpp"
#include "codi/tools/dataStore.hpp"

#ifndef CODI_INDEX_TAPE
#define CODI_INDEX_TAPE 0
#endif
#ifndef CODI_PRIMAL_TAPE
#define CODI_PRIMAL_TAPE 0
#endif
#ifndef CODI_PRIMAL_INDEX_TAPE
#define CODI_PRIMAL_INDEX_TAPE 0
#endif

#if CODI_INDEX_TAPE
using su2double = codi::RealReverseIndex;
#elif CODI_PRIMAL_TAPE
using su2double = codi::RealReversePrimal;
#elif CODI_PRIMAL_INDEX_TAPE
using su2double = codi::RealReversePrimalIndex;
#else
using su2double = codi::RealReverse;
#endif

#elif defined(CODI_FORWARD_TYPE) // forward mode AD
#include "codi.hpp"
using su2double = codi::RealForward;

#else // primal / direct / no AD
using su2double = double;
#endif

#include "../code_config.hpp"
#include "ad_structure.hpp"

/*--- This type can be used for (rare) compatiblity cases or for
* computations that are intended to be (always) passive. ---*/
using passivedouble = double;

/*--- Define a type for potentially lower precision operations. ---*/
#ifdef USE_MIXED_PRECISION
using su2mixedfloat = float;
#else
using su2mixedfloat = passivedouble;
#endif

/*!
* \namespace SU2_TYPE
* \brief Namespace for defining the datatype wrapper routines, this acts as a base
@@ -174,11 +97,11 @@ namespace SU2_TYPE {

#ifdef CODI_REVERSE_TYPE
FORCEINLINE passivedouble GetSecondary(const su2double& data) {
return AD::globalTape.getGradient(AD::inputValues[AD::adjointVectorPosition++]);
return AD::getGlobalTape().getGradient(AD::inputValues[AD::adjointVectorPosition++]);
}

FORCEINLINE passivedouble GetDerivative(const su2double& data) {
return AD::globalTape.getGradient(AD::inputValues[AD::adjointVectorPosition++]);
return AD::getGlobalTape().getGradient(AD::inputValues[AD::adjointVectorPosition++]);
}
#else // forward
FORCEINLINE passivedouble GetSecondary(const su2double& data) {return data.getGradient();}