-
Notifications
You must be signed in to change notification settings - Fork 849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always use vectorization when the numerical scheme supports it #1752
Conversation
This looks good to me! |
@@ -132,10 +132,12 @@ FORCEINLINE CPair<ReconVarType> reconstructPrimitives(Int iEdge, Int iPoint, Int | |||
break; | |||
} | |||
/*--- Detect a non-physical reconstruction based on negative pressure or density. ---*/ | |||
const Double neg_p_or_rho = fmax(fmin(V.i.pressure(), V.j.pressure()) < 0.0, | |||
fmin(V.i.density(), V.j.density()) < 0.0); | |||
/*--- Some weird issues with forward AD if this is all done in one line. ---*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of the one line version was 1e-310, but I could not reproduce in a unit test, and also didnt get anything from valgrind. So i'm not sure, might have something to do with the expression types interfeering with each other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of 0 or 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I managed to reproduce it, I'll look into it a little further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The std
is not the point, just that writing
VecExpr::max_::operator[] (size_t i) -> /*auto deduced*/ { return codi::fmax(u[i], v[i]); }
requires the cmath
version of fmax
to be accessible like this, too. https://en.cppreference.com/w/cpp/numeric/math/fmax looks to me as if they are in std
, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, the suggested solution with passivedouble
and fmin
/fmax
is specific to the su2double
instantiation of the vector expressions. But maybe we could introduce some small traits that choose what I suggested for su2double
and default to the previous solution for everything else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually fmin and fmax work without namespace specification.
We can use fmin/max as the implementation for now and leave a note that they will not be efficient for integers.
I dont think we use that at the moment, Ill double check when I have some time and make the changes, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argument-dependent lookup should select the CoDiPack overloads automatically. Alright.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix covers everything, active to passive, and suitable overloads.
Thank you for getting to the root cause.
Double scale = | ||
tauWall_ij / fmax(norm(tangentProjection(tau,unitNormal)), EPS) + (1.0-isNormalEdge); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
These div-by-zero risks are actually a huge drawback of using y+ instead of y* as the non-dimensional length scale. I'll look into this 'soon'.
Co-authored-by: Nijso <nijso@hotmail.com>
constexpr size_t preferredLen<su2double>() { | ||
#ifdef CODI_REVERSE_TYPE | ||
/*--- Use a SIMD size of 1 for reverse AD, larger sizes increase | ||
* the pre-accumulation time with no performance benefit. ---*/ | ||
return 1; | ||
#else | ||
/*--- For forward AD there is a performance benefit. This covers | ||
* forward AD and primal mode (su2double == passivedouble). ---*/ | ||
return PREFERRED_SIZE / sizeof(passivedouble); | ||
#endif | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this. 👍 I rechecked also with linear index handling and observed no benefits either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep maybe I confused it between non-vectorized and vectorized (which lumps more in a single preacc region).
commit 6e4e46e Merge: 3c783fa a6bae1f Author: Johannes Blühdorn <55186095+jblueh@users.noreply.github.com> Date: Thu Sep 29 13:30:09 2022 +0200 Merge pull request su2code#1764 from su2code/misc_fixes Cleanup/fix regression tests and other small fixes commit a6bae1f Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Thu Sep 29 10:53:41 2022 +0200 Function for repeated code to allow MPI as root. Fix indentation. commit 79a3bf9 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 16:44:46 2022 +0200 Fix. commit 9cced8e Merge: dc6b77d 3c783fa Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 15:50:18 2022 +0200 Merge branch 'develop' into misc_fixes commit dc6b77d Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 15:00:32 2022 +0200 Filediff tests don't need tolerances. commit 977812c Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 14:53:26 2022 +0200 Timeout and tol defaults for parallel_regression.py. commit 94f4b23 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 14:37:41 2022 +0200 Timeout and tol defaults for serial_regression.py. commit 2c7cd33 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 14:01:23 2022 +0200 Timeout and tol defaults for tutorials.py. commit a239b46 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 13:49:22 2022 +0200 Timeout and tol defaults for serial_regression_AD.py, updates. commit 627f045 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 13:38:24 2022 +0200 Small updates. commit 4b151b8 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 12:23:52 2022 +0200 Defaults on the TestCase level can be ambiguous. commit 5e728c9 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 12:21:05 2022 +0200 Timeout and tol defaults for parallel_regression_AD.py. commit bf4d162 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 12:00:46 2022 +0200 Make default timeout and tol visible. commit 36c5698 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 28 11:58:06 2022 +0200 Ensure that the test script does not kill itself. commit 3c783fa Merge: bc6ef2a 380f9f4 Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Tue Sep 27 19:24:45 2022 +0100 Merge pull request su2code#1771 from su2code/improve_doxydocs Improve doxygen documentation commit 1f15378 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Tue Sep 27 00:38:40 2022 +0200 Update vandv.py and tutorials.py, reduce boilerplate. Default to the command "mpirun -n 2 SU2_CFD". commit a57df4a Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 22:54:26 2022 +0200 Update serial_regression_AD.py, reduce boilerplate. Default to the command "SU2_CFD_AD". commit 39b8a4e Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 22:41:17 2022 +0200 Update serial_regression.py, reduce boilerplate. Default to the command "SU2_CFD". commit f86a4f8 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 22:17:10 2022 +0200 Update hybrid and hybrid AD regression tests. commit fcfa892 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 22:10:33 2022 +0200 Update parallel_regression_AD.py, reduce boilerplate. Default to the command "mpirun -n 2 SU2_CFD_AD". Phase out parallel_computation.py. commit 380f9f4 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 26 19:45:24 2022 +0100 add missing SIMD stuff to group commit a01e788 Merge: 8d25605 bc6ef2a Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 26 19:42:32 2022 +0100 Merge remote-tracking branch 'upstream/develop' into improve_doxydocs commit bc6ef2a Merge: d32ccec c9af050 Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Mon Sep 26 19:41:53 2022 +0100 Merge pull request su2code#1752 from su2code/vectorize_when_possible Always use vectorization when the numerical scheme supports it commit 8026ae9 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 20:08:59 2022 +0200 Update parallel_regression.py, reduce boilerplate. Default to the command "mpirun -n 2 SU2_CFD". Phase out parallel_computation.py. commit 7cfaf54 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Mon Sep 26 19:46:34 2022 +0200 Introduce a Command class and tidy up TestCase.py. commit d32ccec Merge: 2642f7f 205ec3f Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Mon Sep 26 09:03:49 2022 +0100 Merge pull request su2code#1772 from su2code/feature_hom_quickmerge Quick fix to a bug in CFlowOutput.cpp that crashes the FEM-DG solver commit 205ec3f Author: Zan-AA <xu.zan.2011@vjc.sg> Date: Sun Sep 25 15:22:56 2022 -0700 add to AUTHORS.md commit e0f7088 Author: Zan Xu <xu.zan.2011@vjc.sg> Date: Sun Sep 25 15:21:10 2022 -0700 Update SU2_CFD/src/output/CFlowCompFEMOutput.cpp Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> commit f28296d Author: Zan Xu <xu.zan.2011@vjc.sg> Date: Sun Sep 25 15:20:48 2022 -0700 Update SU2_CFD/src/output/CFlowCompFEMOutput.cpp Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> commit 09e8e7f Author: Zan-AA <xu.zan.2011@vjc.sg> Date: Sun Sep 25 11:56:31 2022 -0700 mute the SetAnalyzeSurface function in the FEM-DG solver to aovid calling GetNode() function, and create an error message commit 0628c9b Author: Zan-AA <xu.zan.2011@vjc.sg> Date: Sun Sep 25 10:47:55 2022 -0700 add a conditional operator in CFlowOutput.cpp to avoid GetNodes() function calling in DG solver, which results in assertion error commit 8d25605 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 25 11:33:54 2022 +0100 group for VecExpr commit 0c1b07b Merge: e616fa1 c9af050 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 25 11:24:59 2022 +0100 Merge branch 'vectorize_when_possible' into improve_doxydocs commit c9af050 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 25 11:21:31 2022 +0100 Revert "Revert "CoDiPack update."" This reverts commit fedec9e. commit f963ffd Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 25 11:20:48 2022 +0100 convert the result of comparissons to passive type commit fedec9e Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 20:28:12 2022 +0100 Revert "CoDiPack update." This reverts commit 750def1. commit e616fa1 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 16:22:33 2022 +0100 small fixes and add more groups commit 142587b Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 16:06:42 2022 +0100 add some instructions to generate the docs, cleanup and tweak settings commit 92bfafc Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 15:35:56 2022 +0100 move doc files commit 0b5991a Merge: 59b5041 750def1 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 14:38:18 2022 +0100 Merge remote-tracking branch 'upstream/vectorize_when_possible' into vectorize_when_possible commit 59b5041 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 24 14:36:28 2022 +0100 fix min/max problems? remove USE_VECTORIZATION from testcases commit 750def1 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Fri Sep 23 14:57:35 2022 +0200 CoDiPack update. commit 92fdc3d Merge: 87f451e 2642f7f Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Wed Sep 21 19:04:22 2022 +0200 Merge branch 'develop' into vectorize_when_possible commit 2642f7f Merge: 9a7c038 769ae5c Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Wed Sep 21 19:03:48 2022 +0200 Merge pull request su2code#1766 from su2code/master develop <- master commit 735abb6 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Tue Sep 20 19:35:19 2022 +0100 add groups for a bunch of stuff commit 9a7c038 Author: Nijso <nijso@hotmail.com> Date: Tue Sep 20 16:22:44 2022 +0200 FFD box fix for nonplanar faces (su2code#1742) * FFD box uses supporting point in the middle of the face to avoid ambiguous definition of nonplanar faces * implementation of github review comments (formatting and deleting memory) * Apply suggestions from code review Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> commit 87f451e Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com> Date: Mon Sep 19 12:41:17 2022 +0100 Update SU2_CFD/include/numerics_simd/flow/convection/common.hpp Co-authored-by: Nijso <nijso@hotmail.com> commit 37b4f58 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 19 12:39:15 2022 +0100 updates after simd size of 1 for reverse AD commit 4357145 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 18 21:24:46 2022 +0100 regression updates and SIMD size 1 for reverse AD commit c0a8d7c Merge: 6a680ec 88c8392 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 17 16:22:47 2022 +0100 Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible commit 2c016a7 Merge: 11ee7fa 88c8392 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Fri Sep 16 16:04:49 2022 +0200 Merge branch 'develop' into misc_fixes commit 11ee7fa Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Fri Sep 16 16:03:32 2022 +0200 Adapt tutorials.py and vandv.py. commit 6a680ec Author: Pedro Gomes <pcarruscag@gmail.com> Date: Wed Sep 14 21:45:47 2022 +0100 work around some forward mode issues commit c372fa2 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 14 19:15:03 2022 +0200 Fix. commit 4a54740 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Wed Sep 14 19:14:55 2022 +0200 Ensure to pass killall the correct process name. commit 88bdc9d Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Tue Sep 13 15:55:18 2022 +0200 Adapt ninja command for builds outside the SU2 directory. commit 713fb81 Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de> Date: Tue Sep 13 15:51:35 2022 +0200 Fix const inconsistency between declaration/definition. commit 1166497 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 12 18:33:51 2022 +0100 serial and v&v tests commit c808e8d Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 12 12:41:00 2022 +0100 update parallel regression commit ac58780 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 12 12:31:03 2022 +0100 update hybrid regression commit 41aa1d8 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Mon Sep 12 11:54:21 2022 +0100 prevent nans with wall functions commit ef4de7d Merge: b8515c7 013c3cd Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 23:00:56 2022 +0100 Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible commit b8515c7 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 23:00:11 2022 +0100 MG segfault commit 004b67b Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 21:20:30 2022 +0100 fix leak commit 3f6af89 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 18:57:31 2022 +0100 this is what "using namespace std" does commit c841509 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 15:04:56 2022 +0100 proper fix commit 5f0e319 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 14:19:19 2022 +0100 fix build and some doxygen cleanup commit 2edca53 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sun Sep 11 11:45:53 2022 +0100 automatically use vectorized Roe when possible commit 6c53381 Author: Pedro Gomes <pcarruscag@gmail.com> Date: Sat Sep 10 22:01:07 2022 +0100 use the non physical counter in SIMD numerics
Recent changes for hybrid parallel AD made it simpler to use the physical reconstruction checks in vectorized schemes (an old TODO).
Consequently, the vectorized Roe implementation is now equivalent to the scalar version, and thus the user no longer needs to "opt-in".