Always use vectorization when the numerical scheme supports it #1752

pcarruscag · 2022-09-11T10:55:09Z

Recent changes for hybrid parallel AD made it simpler to use the physical reconstruction checks in vectorized schemes (an old TODO).
Consequently, the vectorized Roe implementation is now equivalent to the scalar version, and thus the user no longer needs to "opt-in".


WallyMaier · 2022-09-14T00:43:14Z

This looks good to me!

jblueh · 2022-09-15T12:39:46Z

SU2_CFD/include/numerics_simd/flow/convection/common.hpp

@@ -132,10 +132,12 @@ FORCEINLINE CPair<ReconVarType> reconstructPrimitives(Int iEdge, Int iPoint, Int
      break;
    }
    /*--- Detect a non-physical reconstruction based on negative pressure or density. ---*/
-    const Double neg_p_or_rho = fmax(fmin(V.i.pressure(), V.j.pressure()) < 0.0,
-                                     fmin(V.i.density(), V.j.density()) < 0.0);
+    /*--- Some weird issues with forward AD if this is all done in one line. ---*/


What happened exactly?

The result of the one-line version was 1e-310, but I could not reproduce it in a unit test and did not get anything from valgrind either. So I'm not sure; it might have something to do with the expression types interfering with each other?

Instead of 0 or 1.

I think I managed to reproduce it, I'll look into it a little further.

Awesome thanks

The std is not the point, just that writing

VecExpr::max_::operator[] (size_t i) -> /*auto deduced*/ { return codi::fmax(u[i], v[i]); }

requires the cmath version of fmax to be accessible like this, too. https://en.cppreference.com/w/cpp/numeric/math/fmax looks to me as if they are in std, though?

True, the suggested solution with passivedouble and fmin/fmax is specific to the su2double instantiation of the vector expressions. But maybe we could introduce some small traits that choose what I suggested for su2double and default to the previous solution for everything else?

Usually fmin and fmax work without namespace specification.
We can use fmin/fmax as the implementation for now and leave a note that they will not be efficient for integers.
I don't think we use that at the moment. I'll double-check when I have some time and make the changes, thanks.

Argument-dependent lookup should select the CoDiPack overloads automatically. Alright.
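To illustrate the lookup behaviour discussed above, here is a minimal sketch. The `ad` namespace, the `Real` type, and `elementMax` are hypothetical stand-ins and are not CoDiPack or SU2 code. An unqualified `fmax` call, combined with `using std::fmax`, resolves to the `<cmath>` version for built-in types and to the overload in the argument's own namespace for a user-defined type.

```cpp
#include <cassert>
#include <cmath>

namespace ad {  // hypothetical AD namespace, standing in for a library such as CoDiPack

// Hypothetical forward-mode value: primal plus one tangent.
struct Real {
  double value;
  double tangent;
};

// Overload that lives next to the type, so ADL can find it.
inline Real fmax(const Real& a, const Real& b) { return a.value >= b.value ? a : b; }

}  // namespace ad

template <class T>
T elementMax(const T& a, const T& b) {
  using std::fmax;    // makes the <cmath> overloads visible for built-in types
  return fmax(a, b);  // unqualified call: ADL also considers ad::fmax for ad::Real
}
```

With this pattern the generic code does not need to name the AD library's namespace, which is the point made in the thread about `fmin` and `fmax` usually working without a namespace qualifier.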

I think the fix covers everything, active to passive, and suitable overloads.
Thank you for getting to the root cause.
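As a rough sketch of the "active to passive" idea mentioned in this thread (and in the commit "convert the result of comparissons to passive type"), the flag can be built from comparison results stored in a plain floating point type before they are combined. The `Active` struct and the function names below are hypothetical and only approximate the shape of the actual change; the either-negative test is equivalent to `fmin(a, b) < 0` for non-NaN inputs.

```cpp
#include <cassert>
#include <cmath>

using passivedouble = double;  // plain floating point type, carries no derivative information

// Hypothetical forward-mode value: primal plus one tangent.
struct Active {
  double value;
  double tangent;
};

inline bool isNegative(const Active& a) { return a.value < 0.0; }

// Returns 1.0 if either state has a negative pressure or density, otherwise 0.0.
// The comparison results are converted to the passive type first, so the
// final fmax only ever sees plain doubles.
inline passivedouble nonPhysicalFlag(const Active& p_i, const Active& p_j,
                                     const Active& rho_i, const Active& rho_j) {
  const passivedouble neg_p = isNegative(p_i) || isNegative(p_j);
  const passivedouble neg_rho = isNegative(rho_i) || isNegative(rho_j);
  return std::fmax(neg_p, neg_rho);
}
```

Because the flag is a pure 0/1 indicator, it has no meaningful derivative, so keeping it passive should not change the computed gradients.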


SU2_CFD/include/numerics_simd/flow/convection/common.hpp

bigfooted · 2022-09-19T08:18:33Z

SU2_CFD/include/numerics_simd/flow/diffusion/common.hpp

+  Double scale =
+      tauWall_ij / fmax(norm(tangentProjection(tau,unitNormal)), EPS) + (1.0-isNormalEdge);


👍
These div-by-zero risks are actually a huge drawback of using y+ instead of y* as the non-dimensional length scale. I'll look into this 'soon'.
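The guard in the diff above can be sketched in isolation as follows. The function name `wallScale`, the scalar arguments, and the value chosen for `EPS` are assumptions made for illustration; only the form of the expression is taken from the diff.

```cpp
#include <cassert>
#include <cmath>

// Assumed small positive constant; the value used in the real code may differ.
constexpr double EPS = 1e-16;

// Scale factor with the denominator clamped away from zero, mirroring
//   tauWall_ij / fmax(norm(...), EPS) + (1.0 - isNormalEdge)
// tauTangentNorm stands in for norm(tangentProjection(tau, unitNormal)).
inline double wallScale(double tauWall, double tauTangentNorm, double isNormalEdge) {
  return tauWall / std::fmax(tauTangentNorm, EPS) + (1.0 - isNormalEdge);
}
```

Without the clamp, a vanishing tangential stress would give 0/0 and propagate NaNs; with it, the result stays finite.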

SU2_CFD/include/solvers/CFVMFlowSolverBase.hpp

SU2_CFD/include/variables/CFlowVariable.hpp

Co-authored-by: Nijso <nijso@hotmail.com>

jblueh · 2022-09-20T13:09:34Z

Common/include/parallelization/vectorization.hpp

+constexpr size_t preferredLen<su2double>() {
+#ifdef CODI_REVERSE_TYPE
+  /*--- Use a SIMD size of 1 for reverse AD, larger sizes increase
+   * the pre-accumulation time with no performance benefit. ---*/
+  return 1;
+#else
+  /*--- For forward AD there is a performance benefit. This covers
+   * forward AD and primal mode (su2double == passivedouble). ---*/
+  return PREFERRED_SIZE / sizeof(passivedouble);
+#endif
+}


Thanks for adding this. 👍 I rechecked also with linear index handling and observed no benefits either.

Yep, maybe I confused non-vectorized with vectorized (which lumps more into a single pre-accumulation region).
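The compile-time selection in the diff above follows a common pattern: a generic function template plus a specialization for one type. Below is a self-contained sketch of that pattern. `PREFERRED_SIZE = 32` (bytes) and the `ReverseReal` type are assumptions for illustration, and the real code switches on the `CODI_REVERSE_TYPE` macro rather than on a separate type.

```cpp
#include <cassert>
#include <cstddef>

// Assumed preferred SIMD register width in bytes (e.g. 256-bit registers).
constexpr std::size_t PREFERRED_SIZE = 32;

// Generic rule: as many lanes as fit in the preferred register width.
template <class T>
constexpr std::size_t preferredLen() {
  return PREFERRED_SIZE / sizeof(T);
}

// Hypothetical stand-in for a reverse-mode AD scalar (value plus tape index).
struct ReverseReal {
  double value;
  int index;
};

// Reverse AD: a SIMD size of 1, since larger sizes were reported in this
// thread to increase pre-accumulation time without a performance benefit.
template <>
constexpr std::size_t preferredLen<ReverseReal>() {
  return 1;
}
```

Since the functions are `constexpr`, the result can be used as a template argument or array bound, so the choice costs nothing at run time.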


pcarruscag added 2 commits September 10, 2022 22:01

use the non physical counter in SIMD numerics

6c53381

automatically use vectorized Roe when possible

2edca53

pcarruscag added the changelog:feature label Sep 11, 2022

pr-triage bot added the PR: unreviewed label Sep 11, 2022

pcarruscag added 10 commits September 11, 2022 14:19

fix build and some doxygen cleanup

5f0e319

proper fix

c841509

this is what "using namespace std" does

3f6af89

fix leak

004b67b

MG segfault

b8515c7

Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible

ef4de7d

prevent nans with wall functions

41aa1d8

update hybrid regression

ac58780

update parallel regression

c808e8d

serial and v&v tests

1166497

work around some forward mode issues

6a680ec

jblueh reviewed Sep 15, 2022

pcarruscag added 2 commits September 17, 2022 16:22

Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible

c0a8d7c

regression updates and SIMD size 1 for reverse AD

4357145

bigfooted reviewed Sep 19, 2022

SU2_CFD/include/numerics_simd/flow/convection/common.hpp (outdated)

bigfooted reviewed Sep 19, 2022

SU2_CFD/include/solvers/CFVMFlowSolverBase.hpp

bigfooted reviewed Sep 19, 2022

SU2_CFD/include/variables/CFlowVariable.hpp

pcarruscag and others added 2 commits September 19, 2022 12:39

updates after simd size of 1 for reverse AD

37b4f58

Update SU2_CFD/include/numerics_simd/flow/convection/common.hpp

87f451e

Co-authored-by: Nijso <nijso@hotmail.com>

jblueh reviewed Sep 20, 2022

pcarruscag and others added 3 commits September 21, 2022 19:04

Merge branch 'develop' into vectorize_when_possible

92fdc3d

CoDiPack update.

750def1

fix min/max problems? remove USE_VECTORIZATION from testcases

59b5041

pcarruscag added 4 commits September 24, 2022 14:38

Merge remote-tracking branch 'upstream/vectorize_when_possible' into vectorize_when_possible

0b5991a

Revert "CoDiPack update."

fedec9e

This reverts commit 750def1.

convert the result of comparissons to passive type

f963ffd

Revert "Revert "CoDiPack update.""

c9af050

This reverts commit fedec9e.

pcarruscag merged commit bc6ef2a into develop Sep 26, 2022

pcarruscag deleted the vectorize_when_possible branch September 26, 2022 18:41

pr-triage bot added PR: merged and removed PR: unreviewed labels Sep 26, 2022


jblueh Sep 15, 2022

pcarruscag Sep 15, 2022

pcarruscag Sep 15, 2022

jblueh Sep 16, 2022

pcarruscag Sep 16, 2022

jblueh Sep 23, 2022

jblueh Sep 23, 2022

pcarruscag Sep 23, 2022

jblueh Sep 23, 2022

pcarruscag Sep 26, 2022

bigfooted Sep 19, 2022

jblueh Sep 20, 2022

pcarruscag Sep 20, 2022
