Vectorized (SIMD) Numerical Schemes #1022

Merged
merged 112 commits into from
Sep 30, 2020
Changes from 12 commits
82b9edf
add basic simd type
pcarruscag Jun 10, 2020
058a5b8
begin prototype of simd numerics
pcarruscag Jun 14, 2020
2d2961a
Merge branch 'feature_quasi_newton_adjoint' into feature_simd_numerics
pcarruscag Jun 17, 2020
1bfea72
Merge branch 'iteration_class' into feature_simd_numerics
pcarruscag Jun 17, 2020
e41c6f6
optimize least squares gradients when periodic comms are not needed
pcarruscag Jun 17, 2020
07d325f
use CRTP for static polymorphism
pcarruscag Jun 17, 2020
8eeac65
fix search/replace mistakes
pcarruscag Jun 17, 2020
c24cd1c
fix LS gradients preacc
pcarruscag Jun 17, 2020
20a7edc
add iterators to C2DContainer, fix compiler errors
pcarruscag Jun 18, 2020
b72ac57
Merge branch 'feature_quasi_newton_adjoint' into feature_simd_numerics
pcarruscag Jun 18, 2020
d43c6ad
add SIMD set methods to CSysVector and CSysMatrix, fix 1000 compilati…
pcarruscag Jun 18, 2020
ff0ea0b
codefactor
pcarruscag Jun 18, 2020
7e47d1a
make new numerics compatible with non-SIMD types (for AD)
pcarruscag Jun 19, 2020
06956b6
fetching edge nodes needs gather due to coloring, add C3DContainerDec…
pcarruscag Jun 20, 2020
112bf04
improving and cleaning re-orientation checks
pcarruscag Jun 20, 2020
b2db9ba
optimize least squares gradients when periodic comms are not needed
pcarruscag Jun 17, 2020
a711554
fix LS gradients preacc
pcarruscag Jun 17, 2020
d905213
Merge branch 'cleanup_orientation_checks' into feature_simd_numerics
pcarruscag Jun 20, 2020
80b9453
Merge branch 'cleanup_orientation_checks' into feature_simd_numerics
pcarruscag Jun 21, 2020
c2b7049
use scale factor in vector and matrix updates as a mask to handle "re…
pcarruscag Jun 22, 2020
88a6c33
template mechanism for static decorator pattern
pcarruscag Jun 22, 2020
5fedf08
small LS cleanups and comments
pcarruscag Jun 22, 2020
6054e08
small LS cleanups and comments
pcarruscag Jun 22, 2020
4d88298
Merge branch 'cleanup_orientation_checks' into feature_simd_numerics
pcarruscag Jun 22, 2020
691df95
fix UB, 0*Nan is Nan
pcarruscag Jun 22, 2020
4f42705
need explicit SIMD directive for gcc to vectorize
pcarruscag Jun 24, 2020
31bf94a
fix a couple bugs, replace iterators by "bulk get" methods (gather is…
pcarruscag Jun 24, 2020
e01268b
hack to run simd numerics for simple Euler+Roe problems
pcarruscag Jun 24, 2020
22cd824
fix a bug, wrong vector used for reduction strategy
pcarruscag Jun 24, 2020
1c8c66d
viscous numerics implementation
pcarruscag Jun 26, 2020
97328c9
Merge branch 'feature_quasi_newton_adjoint' into feature_simd_numerics
pcarruscag Jun 26, 2020
0c52649
reduce duplication between 2D and 3D containers
pcarruscag Jun 26, 2020
6078a64
Merge branch 'feature_quasi_newton_adjoint' into feature_simd_numerics
pcarruscag Jun 26, 2020
ac0853f
Merge branch 'feature_quasi_newton_adjoint' into feature_simd_numerics
pcarruscag Jun 26, 2020
f9f743b
i/j update in CSysVector, fix a bug in viscous flux jacobian
pcarruscag Jun 27, 2020
a4b0228
AVX specialization for Array of 4 doubles
pcarruscag Jun 27, 2020
20cd051
small manual optim of some functions, remove get prefix from functions
pcarruscag Jun 27, 2020
2b9e5bb
make a directory for container types
pcarruscag Jun 27, 2020
debea7a
implement low dissip Roe and QCR in viscous fluxes
pcarruscag Jun 28, 2020
3cded55
split files, factory method
pcarruscag Jun 29, 2020
4c26d72
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jun 29, 2020
23c2a93
fix overload resolution issues in gcc 5
pcarruscag Jun 30, 2020
0dec543
Merge branch 'small_least_squares_improvement' into feature_simd_nume…
pcarruscag Jul 2, 2020
0a060b0
Merge branch 'cleanup_orientation_checks' into feature_simd_numerics
pcarruscag Jul 2, 2020
2d2ae09
optimize gradients and limiters for 2D 3D, reduce Rmatrix storage ove…
pcarruscag Jul 2, 2020
5820eb3
fix bug, clean a few lines
pcarruscag Jul 2, 2020
bdc61d3
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 2, 2020
446c042
fix ILU allocation logic for DEF/DOT
pcarruscag Jul 3, 2020
cc3ac7d
assume contiguous edges when loading iPoint jPoint
pcarruscag Jul 3, 2020
835f903
a little range for syntax sugar
pcarruscag Jul 3, 2020
c3d125a
small bug GetPoints -> GetEdges
pcarruscag Jul 4, 2020
acc8035
little cleanup
pcarruscag Jul 5, 2020
4783798
cleanup MKL JIT if def's
pcarruscag Jul 6, 2020
de0eb13
automatic preaccumulation when gathering variables
pcarruscag Jul 6, 2020
03f3a9d
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 6, 2020
12b80c6
option to use vectorization
pcarruscag Jul 6, 2020
9f71420
clean up of mixed precision ifdefs, replaced by sfinae
pcarruscag Jul 7, 2020
af01f5a
gradient unit tests
pcarruscag Jul 7, 2020
56a2706
missing file in previous commit
pcarruscag Jul 7, 2020
0d931b3
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 7, 2020
891fa3d
change some tests to use vectorization, tweak LS grads a bit more for…
pcarruscag Jul 7, 2020
a6d0b5f
one AD testcase
pcarruscag Jul 8, 2020
c7b65ec
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 9, 2020
68cb364
update tests
pcarruscag Jul 9, 2020
f23f4b6
avx512 specialization, general case using expression templates
pcarruscag Jul 10, 2020
f8a973a
use expression templates in CSysVector
pcarruscag Jul 10, 2020
540e58a
fix mem leak, format code
pcarruscag Jul 11, 2020
b1a10d8
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 11, 2020
9be871e
allow default construction of CSysVector in parallel
pcarruscag Jul 12, 2020
b0dfd9c
use the simd type also for AD
pcarruscag Jul 14, 2020
20d616d
fix directdiff build
pcarruscag Jul 14, 2020
248acea
fix icc compilation issue
pcarruscag Jul 17, 2020
9ae8974
cleanup enable_if syntax, SSE SIMD specialization, make vector expres…
pcarruscag Jul 19, 2020
c36c722
fix unit test build
pcarruscag Jul 19, 2020
7183a09
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 19, 2020
8cd90f8
change a name
pcarruscag Jul 20, 2020
26bafaf
Merge remote-tracking branch 'jblueh/codi_medi_update' into feature_s…
pcarruscag Jul 20, 2020
80f33c0
vectorized central schemes, cleanup static polymorphism mechanism
pcarruscag Jul 20, 2020
4e42b7c
optimize muscl logic to allow data reuse when computing viscous fluxe…
pcarruscag Jul 21, 2020
2f43aff
Merge branch 'fix_jst_ke' into feature_simd_numerics
pcarruscag Jul 21, 2020
a81a76d
Merge branch 'cleanup_flow_solver_duplication' into feature_simd_nume…
pcarruscag Jul 21, 2020
9932c05
implement logical SIMD operations
pcarruscag Jul 22, 2020
5f1af74
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 22, 2020
33e5445
unnecessary cast
pcarruscag Jul 22, 2020
230b047
move some methods to FVMBase
pcarruscag Jul 23, 2020
0491d4b
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 23, 2020
5554134
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Jul 24, 2020
5a5778b
fix issue with min/max expressions, unit tests for SIMD-type
pcarruscag Jul 24, 2020
f01c820
fix AD build
pcarruscag Jul 24, 2020
1623055
JST scheme with matrix dissipation
pcarruscag Jul 27, 2020
b288045
add config options
pcarruscag Jul 27, 2020
3743af9
fix bug with instantiation of Lax
pcarruscag Jul 27, 2020
0031e79
Merge branch 'feature_simd_numerics' into feature_jst_matrix
pcarruscag Jul 27, 2020
826e330
Merge branch 'develop' into feature_simd_numerics
pcarruscag Aug 3, 2020
d4602a7
fix leak introduced in #877
pcarruscag Aug 4, 2020
87b2d94
Merge branch 'feature_simd_numerics' into feature_jst_matrix
pcarruscag Aug 4, 2020
f57f4b5
add topology outputs
pcarruscag Aug 7, 2020
4ea0628
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Aug 7, 2020
6621d1d
fix clang issues
pcarruscag Aug 8, 2020
1f8bddb
Merge branch 'feature_simd_numerics' into feature_jst_matrix
pcarruscag Aug 8, 2020
4ced288
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Aug 11, 2020
2efb124
fix clang debug AD build issue
pcarruscag Aug 11, 2020
beb8688
re update testcases after merge with develop
pcarruscag Aug 11, 2020
10b56e9
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Sep 4, 2020
94cd99d
Merge branch 'feature_simd_numerics' into feature_jst_matrix
pcarruscag Sep 4, 2020
c98b530
Merge branch 'develop' into feature_simd_numerics
pcarruscag Sep 5, 2020
802d3ef
Merge branch 'develop' into feature_simd_numerics
pcarruscag Sep 7, 2020
1414f3f
Merge branch 'feature_simd_numerics' into feature_jst_matrix
pcarruscag Sep 8, 2020
99988b5
Merge remote-tracking branch 'upstream/develop' into feature_jst_matrix
pcarruscag Sep 27, 2020
d43a50b
address PR comments, fix iDim==3 issues
pcarruscag Sep 27, 2020
bd8d88f
Merge remote-tracking branch 'upstream/develop' into feature_simd_num…
pcarruscag Sep 27, 2020
6ea6116
update config_template
pcarruscag Sep 29, 2020
24 changes: 19 additions & 5 deletions Common/include/geometry/dual_grid/CEdge.hpp
@@ -35,12 +35,13 @@
* \author F. Palacios
*/
class CEdge {
-static_assert(su2activematrix::Storage == StorageType::RowMajor, "Needed to return normal as pointer.");
-
+static_assert(su2activematrix::IsRowMajor, "Needed to return normal as pointer.");
private:
-su2matrix<unsigned long> Nodes; /*!< \brief Vector to store the node indices of the edge. */
-su2activematrix Normal; /*!< \brief Normal (area) of the edge. */
-su2activematrix Coord_CG; /*!< \brief Center-of-gravity (mid point) of the edge. */
+using Index = unsigned long;
+using NodeArray = C2DContainer<Index, Index, StorageType::ColumnMajor, 64, DynamicSize, 2>;
+NodeArray Nodes; /*!< \brief Vector to store the node indices of the edge. */
+su2activematrix Normal; /*!< \brief Normal (area) of the edge. */
+su2activematrix Coord_CG; /*!< \brief Center-of-gravity (mid point) of the edge. */

public:
enum NodePosition : unsigned long {LEFT = 0, RIGHT = 1};
@@ -84,6 +85,14 @@ class CEdge {
*/
inline unsigned long GetNode(unsigned long iEdge, unsigned long iNode) const { return Nodes(iEdge,iNode); }

/*!
* \brief SIMD version of GetNode, iNode returned for multiple sequential iEdges
*/
template<class IndexSIMD_t>
FORCEINLINE IndexSIMD_t GetNode(unsigned long iEdge, unsigned long iNode) const {
return IndexSIMD_t(&Nodes(iEdge,iNode));
}
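
A note on the SIMD `GetNode` overload above: constructing the result from `&Nodes(iEdge,iNode)` relies on `Nodes` being column-major, so that the same node position of sequential edges sits contiguously in memory. Below is a minimal sketch of that idea. The `Array` and `EdgeNodes` types and the `MakeChain` helper are hypothetical stand-ins written for illustration (they are not `simd::Array` or `C2DContainer`), and the sketch assumes the SIMD type can be built from a pointer to `N` contiguous values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for simd::Array: N lanes loaded from contiguous memory.
template <class T, std::size_t N>
struct Array {
  std::array<T, N> lanes{};
  explicit Array(const T* ptr) {
    for (std::size_t k = 0; k < N; ++k) lanes[k] = ptr[k];
  }
  T operator[](std::size_t k) const { return lanes[k]; }
};

using Index = unsigned long;
using Index4 = Array<Index, 4>;

// Hypothetical column-major edge table: column iNode stores that node for all edges.
class EdgeNodes {
 public:
  explicit EdgeNodes(std::size_t nEdge) : nEdge_(nEdge), data_(2 * nEdge) {}

  Index& operator()(std::size_t iEdge, std::size_t iNode) {
    return data_[iNode * nEdge_ + iEdge];
  }
  const Index& operator()(std::size_t iEdge, std::size_t iNode) const {
    return data_[iNode * nEdge_ + iEdge];
  }

  // Scalar access, one edge at a time.
  Index GetNode(std::size_t iEdge, std::size_t iNode) const {
    return (*this)(iEdge, iNode);
  }

  // SIMD-style access: node iNode of edges iEdge, iEdge+1, ..., iEdge+N-1.
  // The caller must make sure those edges exist (no padding in this sketch).
  template <class IndexSIMD_t>
  IndexSIMD_t GetNode(std::size_t iEdge, std::size_t iNode) const {
    return IndexSIMD_t(&(*this)(iEdge, iNode));
  }

 private:
  std::size_t nEdge_;
  std::vector<Index> data_;
};

// Builds a chain of edges where edge e connects points e and e+1.
inline EdgeNodes MakeChain(std::size_t nEdge) {
  EdgeNodes edges(nEdge);
  for (std::size_t e = 0; e < nEdge; ++e) {
    edges(e, 0) = e;
    edges(e, 1) = e + 1;
  }
  return edges;
}
```

With row-major storage the two nodes of one edge would be adjacent instead, and the same pointer-based load would mix left and right nodes.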

/*!
* \brief Set the node indices of an edge.
* \param[in] iEdge - Edge index.
@@ -168,6 +177,11 @@ class CEdge {
*/
inline const su2double* GetNormal(unsigned long iEdge) const { return Normal[iEdge]; }

/*!
* \brief Get the entire matrix of edge normals.
*/
inline const su2activematrix& GetNormal() const { return Normal; }

/*!
* \brief Initialize normal vector to 0.
*/
5 changes: 5 additions & 0 deletions Common/include/geometry/dual_grid/CPoint.hpp
@@ -136,6 +136,11 @@ class CPoint {
*/
inline su2double *GetCoord(unsigned long iPoint) { return Coord[iPoint]; }

/*!
* \brief Get the entire matrix of coordinates of the control volumes.
*/
inline const su2activematrix& GetCoord() const { return Coord; }

/*!
* \brief Set the coordinates for the control volume.
* \param[in] iPoint - Index of the point.
@@ -34,8 +34,7 @@
* \brief Radial basis function interpolation.
*/
class CRadialBasisFunction final : public CInterpolator {
-static_assert(su2passivematrix::Storage == StorageType::RowMajor,
-"This class relies on row major storage throughout.");
+static_assert(su2passivematrix::IsRowMajor, "This class relies on row major storage throughout.");
private:
unsigned long MinDonors = 0, AvgDonors = 0, MaxDonors = 0;
passivedouble Density = 0.0, AvgCorrection = 0.0, MaxCorrection = 0.0;
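
The `IsRowMajor` assertions in this hunk and in `CEdge` guard code that hands out a row of a matrix as a raw pointer, which is only valid when the elements of a row are contiguous. A minimal sketch of the pattern, assuming a hypothetical `Matrix` template (it is not the actual `C2DContainer`; only the `StorageType` and `IsRowMajor` names are taken from the diff):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class StorageType { RowMajor, ColumnMajor };

// Hypothetical dense matrix with a compile-time storage order.
template <class T, StorageType Store>
class Matrix {
 public:
  static constexpr bool IsRowMajor = (Store == StorageType::RowMajor);

  Matrix(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), data_(rows * cols) {}

  T& operator()(std::size_t i, std::size_t j) {
    return IsRowMajor ? data_[i * cols_ + j] : data_[j * rows_ + i];
  }

  // Row as a pointer. The assertion fires at compile time only if this
  // member is used on a column-major instantiation.
  T* operator[](std::size_t i) {
    static_assert(IsRowMajor, "Row pointers require row-major storage.");
    return &data_[i * cols_];
  }

 private:
  std::size_t rows_, cols_;
  std::vector<T> data_;
};

using RowMatrix = Matrix<double, StorageType::RowMajor>;
using ColMatrix = Matrix<double, StorageType::ColumnMajor>;
```

Exposing a boolean such as `IsRowMajor` keeps the assertion short and lets callers test the property without comparing against the enum directly.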
62 changes: 62 additions & 0 deletions Common/include/linear_algebra/CSysMatrix.hpp
@@ -30,6 +30,7 @@

#include "../../include/mpi_structure.hpp"
#include "../../include/omp_structure.hpp"
#include "../../include/parallelization/vectorization.hpp"
#include "CSysVector.hpp"
#include "CPastixWrapper.hpp"

@@ -558,6 +559,40 @@ class CSysMatrix {
UpdateBlocks<OtherType,-1>(iEdge, iPoint, jPoint, block_i, block_j);
}

/*!
* \brief SIMD version, does the update for multiple edges and points.
*/
template<class MatrixSIMD_t, class T, size_t N>
FORCEINLINE void UpdateBlocks(simd::Array<T,N> iEdge, simd::Array<T,N> iPoint, simd::Array<T,N> jPoint,
const MatrixSIMD_t& block_i, const MatrixSIMD_t& block_j) {

/*--- Fetch the blocks for all edges. ---*/
ScalarType* bii[N] = {nullptr};
ScalarType* bjj[N] = {nullptr};
ScalarType* bij[N] = {nullptr};
ScalarType* bji[N] = {nullptr};

for (auto k = 0ul; k < N; ++k) {
bii[k] = &matrix[dia_ptr[iPoint[k]]*block_i.size()];
bjj[k] = &matrix[dia_ptr[jPoint[k]]*block_i.size()];
bij[k] = &matrix[edge_ptr(iEdge[k],0)*block_i.size()];
bji[k] = &matrix[edge_ptr(iEdge[k],1)*block_i.size()];
}

/*--- Unpack the SIMD elements of the input blocks. ---*/
for (auto iVar = 0ul; iVar < block_i.rows(); ++iVar) {
for (auto jVar = 0ul; jVar < block_i.cols(); ++jVar) {
SU2_OMP_SIMD
for (auto k = 0ul; k < N; ++k) {
*(bii[k]++) += PassiveAssign(block_i(iVar,jVar)[k]);
*(bij[k]++) = PassiveAssign(block_j(iVar,jVar)[k]);
*(bji[k]++) = -PassiveAssign(block_i(iVar,jVar)[k]);
*(bjj[k]++) -= PassiveAssign(block_j(iVar,jVar)[k]);
}
}
}
}

/*!
* \brief Sets 2 blocks ij and ji (add to i* sub from j*) associated with
* one edge of an FVM-type sparse pattern.
@@ -602,6 +637,33 @@ class CSysMatrix {
SetBlocks<OtherType,-1,false>(iEdge, block_i, block_j);
}

/*!
* \brief SIMD version, does the update for multiple edges.
*/
template<class MatrixSIMD_t, class T, size_t N>
FORCEINLINE void SetBlocks(simd::Array<T,N> iEdge, const MatrixSIMD_t& block_i, const MatrixSIMD_t& block_j) {

/*--- Fetch blocks for all edges. ---*/
ScalarType* bij[N] = {nullptr};
ScalarType* bji[N] = {nullptr};

for (auto k = 0ul; k < N; ++k) {
bij[k] = &matrix[edge_ptr(iEdge[k],0)*block_i.size()];
bji[k] = &matrix[edge_ptr(iEdge[k],1)*block_i.size()];
}

/*--- Unpack the SIMD elements of the input blocks. ---*/
for (auto iVar = 0ul; iVar < block_i.rows(); ++iVar) {
for (auto jVar = 0ul; jVar < block_i.cols(); ++jVar) {
SU2_OMP_SIMD
for (auto k = 0ul; k < N; ++k) {
*(bij[k]++) = PassiveAssign(block_j(iVar,jVar)[k]);
*(bji[k]++) = -PassiveAssign(block_i(iVar,jVar)[k]);
}
}
}
}
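
Both SIMD overloads (`UpdateBlocks` and `SetBlocks`) follow the same two steps: first collect one destination pointer per lane, then walk the block entries and scatter lane `k` of each SIMD entry to edge `k`. A self-contained sketch of the two-block case is given below. `Array`, `BlockSIMD`, `EdgeMatrix` and `MakeBlock` are hypothetical simplifications for illustration, and the storage layout (edge `e` owns blocks `2e` and `2e+1`) is an assumption of the sketch, not the sparse pattern used by `CSysMatrix`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for simd::Array.
template <class T, std::size_t N>
struct Array {
  std::array<T, N> lanes{};
  T& operator[](std::size_t k) { return lanes[k]; }
  const T& operator[](std::size_t k) const { return lanes[k]; }
};

constexpr std::size_t nVar = 2;  // blocks are nVar x nVar

// Hypothetical SIMD block: every entry carries one value per lane (edge).
template <std::size_t N>
using BlockSIMD = std::array<std::array<Array<double, N>, nVar>, nVar>;

// Hypothetical block storage: edge e owns block 2e (ij) and block 2e+1 (ji).
struct EdgeMatrix {
  std::vector<double> matrix;

  explicit EdgeMatrix(std::size_t nEdge) : matrix(2 * nEdge * nVar * nVar, 0.0) {}

  std::size_t EdgePtr(std::size_t iEdge, std::size_t side) const {
    return 2 * iEdge + side;
  }

  template <std::size_t N>
  void SetBlocks(const Array<std::size_t, N>& iEdge,
                 const BlockSIMD<N>& block_i, const BlockSIMD<N>& block_j) {
    // Step 1: fetch the destination blocks of all lanes.
    double* bij[N];
    double* bji[N];
    for (std::size_t k = 0; k < N; ++k) {
      bij[k] = &matrix[EdgePtr(iEdge[k], 0) * nVar * nVar];
      bji[k] = &matrix[EdgePtr(iEdge[k], 1) * nVar * nVar];
    }
    // Step 2: unpack the lanes, entry by entry (blocks are stored row-major).
    for (std::size_t iVar = 0; iVar < nVar; ++iVar) {
      for (std::size_t jVar = 0; jVar < nVar; ++jVar) {
        for (std::size_t k = 0; k < N; ++k) {
          *(bij[k]++) = block_j[iVar][jVar][k];
          *(bji[k]++) = -block_i[iVar][jVar][k];
        }
      }
    }
  }
};

// Test helper: entry (i,j) of lane k is base + 10*k + 2*i + j.
template <std::size_t N>
BlockSIMD<N> MakeBlock(double base) {
  BlockSIMD<N> b{};
  for (std::size_t i = 0; i < nVar; ++i)
    for (std::size_t j = 0; j < nVar; ++j)
      for (std::size_t k = 0; k < N; ++k)
        b[i][j][k] = base + 10.0 * k + 2.0 * i + 1.0 * j;
  return b;
}
```

The innermost loop over lanes writes to `N` unrelated addresses, so whether a compiler can vectorize it depends on scatter support of the target; the sketch makes no claim about the code the PR's `SU2_OMP_SIMD` directive produces.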

/*!
* \brief Sets the specified block to the (i, i) subblock of the sparse matrix.
* Scales the input block by factor alpha. If the Overwrite parameter is
38 changes: 38 additions & 0 deletions Common/include/linear_algebra/CSysVector.hpp
@@ -31,6 +31,9 @@
#include <cmath>
#include <cstdlib>

#include "../basic_types/datatype_structure.hpp"
#include "../omp_structure.hpp"
#include "../parallelization/vectorization.hpp"

/*!
* \class CSysVector
@@ -329,6 +332,41 @@ class CSysVector {
vec_val[val_ipoint*nVar+iVar] = val_residual[iVar];
}

/*!
* \brief Vectorized version of SetBlock, sets multiple iPoint's.
* \param[in] Overwrite - True: write over existing data; False: add to existing data.
* \param[in] iPoint - SIMD integer, the positions to update.
* \param[in] vector - Vector of SIMD scalars.
* \param[in] alpha - Optional scale factor (axpy type operation).
*/
template<class T, size_t N, class VectorSIMD_t, bool Overwrite = true>
FORCEINLINE void SetBlock(simd::Array<T,N> iPoint, const VectorSIMD_t& vector, ScalarType alpha = 1) {
const auto nVar = vector.rows();
for (auto iVar = 0ul; iVar < nVar; ++iVar) {
SU2_OMP_SIMD
for (auto k = 0ul; k < N; ++k) {
vec_val[iPoint[k]*nVar+iVar] *= 1-Overwrite;
vec_val[iPoint[k]*nVar+iVar] += alpha * vector(iVar)[k];
}
}
}

/*!
* \brief Vectorized version of AddBlock, see SetBlock.
*/
template<class T, size_t N, class VectorSIMD_t>
FORCEINLINE void AddBlock(simd::Array<T,N> iPoint, const VectorSIMD_t& vector, ScalarType alpha = 1) {
SetBlock<T, N, VectorSIMD_t, false>(iPoint, vector, alpha);
}

/*!
* \brief Vectorized version of SubtractBlock, see SetBlock.
*/
template<class T, size_t N, class VectorSIMD_t>
FORCEINLINE void SubtractBlock(simd::Array<T,N> iPoint, const VectorSIMD_t& vector) {
SetBlock<T, N, VectorSIMD_t, false>(iPoint, vector, -1);
}
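
The three vectorized methods above share one kernel: `Overwrite` acts as a multiplicative mask on the existing value, and `alpha` scales what is added. A minimal sketch of that mechanism follows, using hypothetical `Array`, `VectorSIMD` and `BlockVector` types. Unlike the hunk, the sketch places `Overwrite` first in the template parameter list so that `N` can be deduced; that reordering is a choice made for the example only.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for simd::Array.
template <class T, std::size_t N>
struct Array {
  std::array<T, N> lanes{};
  T& operator[](std::size_t k) { return lanes[k]; }
  const T& operator[](std::size_t k) const { return lanes[k]; }
};

constexpr std::size_t nVar = 2;

// Hypothetical vector of SIMD scalars: one Array per variable.
template <std::size_t N>
using VectorSIMD = std::array<Array<double, N>, nVar>;

struct BlockVector {
  std::vector<double> vec_val;

  explicit BlockVector(std::size_t nPoint) : vec_val(nPoint * nVar, 0.0) {}

  // Overwrite == true : dst = alpha * v
  // Overwrite == false: dst = dst + alpha * v
  template <bool Overwrite = true, std::size_t N>
  void SetBlock(const Array<std::size_t, N>& iPoint, const VectorSIMD<N>& v,
                double alpha = 1.0) {
    for (std::size_t iVar = 0; iVar < nVar; ++iVar) {
      for (std::size_t k = 0; k < N; ++k) {
        double& dst = vec_val[iPoint[k] * nVar + iVar];
        dst *= Overwrite ? 0.0 : 1.0;  // mask out the old value when overwriting
        dst += alpha * v[iVar][k];
      }
    }
  }

  template <std::size_t N>
  void AddBlock(const Array<std::size_t, N>& iPoint, const VectorSIMD<N>& v,
                double alpha = 1.0) {
    SetBlock<false>(iPoint, v, alpha);
  }

  template <std::size_t N>
  void SubtractBlock(const Array<std::size_t, N>& iPoint, const VectorSIMD<N>& v) {
    SetBlock<false>(iPoint, v, -1.0);
  }
};
```

One caveat of masking by multiplication: under IEEE 754 arithmetic `0 * NaN` is `NaN`, so an entry that already holds a NaN is not reset by an "overwrite" done this way. A plain assignment in the `Overwrite` branch avoids that.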

/*!
* \brief Set the residual to zero.
* \param[in] val_ipoint - index of the point where set the residual.