Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always use vectorization when the numerical scheme supports it #1752

Merged
merged 24 commits into from
Sep 26, 2022

Conversation

pcarruscag
Copy link
Member

Recent changes for hybrid parallel AD made it simpler to use the physical reconstruction checks in vectorized schemes (an old TODO).
Consequently, the vectorized Roe implementation is now equivalent to the scalar version, and thus the user no longer needs to "opt-in".

@WallyMaier
Copy link
Contributor

This looks good to me!

@@ -132,10 +132,12 @@ FORCEINLINE CPair<ReconVarType> reconstructPrimitives(Int iEdge, Int iPoint, Int
break;
}
/*--- Detect a non-physical reconstruction based on negative pressure or density. ---*/
const Double neg_p_or_rho = fmax(fmin(V.i.pressure(), V.j.pressure()) < 0.0,
fmin(V.i.density(), V.j.density()) < 0.0);
/*--- Some weird issues with forward AD if this is all done in one line. ---*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened exactly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result of the one line version was 1e-310, but I could not reproduce in a unit test, and also didnt get anything from valgrind. So i'm not sure, might have something to do with the expression types interfeering with each other?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 0 or 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I managed to reproduce it, I'll look into it a little further.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The std is not the point, just that writing

VecExpr::max_::operator[] (size_t i) -> /*auto deduced*/ { return codi::fmax(u[i], v[i]); }

requires the cmath version of fmax to be accessible like this, too. https://en.cppreference.com/w/cpp/numeric/math/fmax looks to me as if they are in std, though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, the suggested solution with passivedouble and fmin/fmax is specific to the su2double instantiation of the vector expressions. But maybe we could introduce some small traits that choose what I suggested for su2double and default to the previous solution for everything else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually fmin and fmax work without namespace specification.
We can use fmin/max as the implementation for now and leave a note that they will not be efficient for integers.
I dont think we use that at the moment, Ill double check when I have some time and make the changes, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argument-dependent lookup should select the CoDiPack overloads automatically. Alright.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fix covers everything, active to passive, and suitable overloads.
Thank you for getting to the root cause.

Comment on lines +174 to +175
Double scale =
tauWall_ij / fmax(norm(tangentProjection(tau,unitNormal)), EPS) + (1.0-isNormalEdge);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
These div-by-zero risks are actually a huge drawback of using y+ instead of y* as the non-dimensional length scale. I'll look into this 'soon'.

Comment on lines +61 to +71
constexpr size_t preferredLen<su2double>() {
#ifdef CODI_REVERSE_TYPE
/*--- Use a SIMD size of 1 for reverse AD, larger sizes increase
* the pre-accumulation time with no performance benefit. ---*/
return 1;
#else
/*--- For forward AD there is a performance benefit. This covers
* forward AD and primal mode (su2double == passivedouble). ---*/
return PREFERRED_SIZE / sizeof(passivedouble);
#endif
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this. 👍 I rechecked also with linear index handling and observed no benefits either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep maybe I confused it between non-vectorized and vectorized (which lumps more in a single preacc region).

@pcarruscag pcarruscag merged commit bc6ef2a into develop Sep 26, 2022
@pcarruscag pcarruscag deleted the vectorize_when_possible branch September 26, 2022 18:41
RichRoos added a commit to RichRoos/SU2_feature_eN that referenced this pull request Sep 29, 2022
commit 6e4e46e
Merge: 3c783fa a6bae1f
Author: Johannes Blühdorn <55186095+jblueh@users.noreply.github.com>
Date:   Thu Sep 29 13:30:09 2022 +0200

    Merge pull request su2code#1764 from su2code/misc_fixes

    Cleanup/fix regression tests and other small fixes

commit a6bae1f
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Thu Sep 29 10:53:41 2022 +0200

    Function for repeated code to allow MPI as root.

    Fix indentation.

commit 79a3bf9
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 16:44:46 2022 +0200

    Fix.

commit 9cced8e
Merge: dc6b77d 3c783fa
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 15:50:18 2022 +0200

    Merge branch 'develop' into misc_fixes

commit dc6b77d
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 15:00:32 2022 +0200

    Filediff tests don't need tolerances.

commit 977812c
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 14:53:26 2022 +0200

    Timeout and tol defaults for parallel_regression.py.

commit 94f4b23
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 14:37:41 2022 +0200

    Timeout and tol defaults for serial_regression.py.

commit 2c7cd33
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 14:01:23 2022 +0200

    Timeout and tol defaults for tutorials.py.

commit a239b46
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 13:49:22 2022 +0200

    Timeout and tol defaults for serial_regression_AD.py, updates.

commit 627f045
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 13:38:24 2022 +0200

    Small updates.

commit 4b151b8
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 12:23:52 2022 +0200

    Defaults on the TestCase level can be ambiguous.

commit 5e728c9
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 12:21:05 2022 +0200

    Timeout and tol defaults for parallel_regression_AD.py.

commit bf4d162
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 12:00:46 2022 +0200

    Make default timeout and tol visible.

commit 36c5698
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 28 11:58:06 2022 +0200

    Ensure that the test script does not kill itself.

commit 3c783fa
Merge: bc6ef2a 380f9f4
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Tue Sep 27 19:24:45 2022 +0100

    Merge pull request su2code#1771 from su2code/improve_doxydocs

    Improve doxygen documentation

commit 1f15378
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Tue Sep 27 00:38:40 2022 +0200

    Update vandv.py and tutorials.py, reduce boilerplate.

    Default to the command "mpirun -n 2 SU2_CFD".

commit a57df4a
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 22:54:26 2022 +0200

    Update serial_regression_AD.py, reduce boilerplate.

    Default to the command "SU2_CFD_AD".

commit 39b8a4e
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 22:41:17 2022 +0200

    Update serial_regression.py, reduce boilerplate.

    Default to the command "SU2_CFD".

commit f86a4f8
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 22:17:10 2022 +0200

    Update hybrid and hybrid AD regression tests.

commit fcfa892
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 22:10:33 2022 +0200

    Update parallel_regression_AD.py, reduce boilerplate.

    Default to the command "mpirun -n 2 SU2_CFD_AD".
    Phase out parallel_computation.py.

commit 380f9f4
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 26 19:45:24 2022 +0100

    add missing SIMD stuff to group

commit a01e788
Merge: 8d25605 bc6ef2a
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 26 19:42:32 2022 +0100

    Merge remote-tracking branch 'upstream/develop' into improve_doxydocs

commit bc6ef2a
Merge: d32ccec c9af050
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Mon Sep 26 19:41:53 2022 +0100

    Merge pull request su2code#1752 from su2code/vectorize_when_possible

    Always use vectorization when the numerical scheme supports it

commit 8026ae9
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 20:08:59 2022 +0200

    Update parallel_regression.py, reduce boilerplate.

    Default to the command "mpirun -n 2 SU2_CFD".
    Phase out parallel_computation.py.

commit 7cfaf54
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Mon Sep 26 19:46:34 2022 +0200

    Introduce a Command class and tidy up TestCase.py.

commit d32ccec
Merge: 2642f7f 205ec3f
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Mon Sep 26 09:03:49 2022 +0100

    Merge pull request su2code#1772 from su2code/feature_hom_quickmerge

    Quick fix to a bug in CFlowOutput.cpp that crashes the FEM-DG solver

commit 205ec3f
Author: Zan-AA <xu.zan.2011@vjc.sg>
Date:   Sun Sep 25 15:22:56 2022 -0700

    add to AUTHORS.md

commit e0f7088
Author: Zan Xu <xu.zan.2011@vjc.sg>
Date:   Sun Sep 25 15:21:10 2022 -0700

    Update SU2_CFD/src/output/CFlowCompFEMOutput.cpp

    Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>

commit f28296d
Author: Zan Xu <xu.zan.2011@vjc.sg>
Date:   Sun Sep 25 15:20:48 2022 -0700

    Update SU2_CFD/src/output/CFlowCompFEMOutput.cpp

    Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>

commit 09e8e7f
Author: Zan-AA <xu.zan.2011@vjc.sg>
Date:   Sun Sep 25 11:56:31 2022 -0700

    mute the SetAnalyzeSurface function in the FEM-DG solver to aovid calling GetNode() function, and create an error message

commit 0628c9b
Author: Zan-AA <xu.zan.2011@vjc.sg>
Date:   Sun Sep 25 10:47:55 2022 -0700

    add a conditional operator in CFlowOutput.cpp to avoid GetNodes() function calling in DG solver, which results in assertion error

commit 8d25605
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 25 11:33:54 2022 +0100

    group for VecExpr

commit 0c1b07b
Merge: e616fa1 c9af050
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 25 11:24:59 2022 +0100

    Merge branch 'vectorize_when_possible' into improve_doxydocs

commit c9af050
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 25 11:21:31 2022 +0100

    Revert "Revert "CoDiPack update.""

    This reverts commit fedec9e.

commit f963ffd
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 25 11:20:48 2022 +0100

    convert the result of comparissons to passive type

commit fedec9e
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 20:28:12 2022 +0100

    Revert "CoDiPack update."

    This reverts commit 750def1.

commit e616fa1
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 16:22:33 2022 +0100

    small fixes and add more groups

commit 142587b
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 16:06:42 2022 +0100

    add some instructions to generate the docs, cleanup and tweak settings

commit 92bfafc
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 15:35:56 2022 +0100

    move doc files

commit 0b5991a
Merge: 59b5041 750def1
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 14:38:18 2022 +0100

    Merge remote-tracking branch 'upstream/vectorize_when_possible' into vectorize_when_possible

commit 59b5041
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 24 14:36:28 2022 +0100

    fix min/max problems? remove USE_VECTORIZATION from testcases

commit 750def1
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Fri Sep 23 14:57:35 2022 +0200

    CoDiPack update.

commit 92fdc3d
Merge: 87f451e 2642f7f
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Wed Sep 21 19:04:22 2022 +0200

    Merge branch 'develop' into vectorize_when_possible

commit 2642f7f
Merge: 9a7c038 769ae5c
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Wed Sep 21 19:03:48 2022 +0200

    Merge pull request su2code#1766 from su2code/master

    develop <- master

commit 735abb6
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Tue Sep 20 19:35:19 2022 +0100

    add groups for a bunch of stuff

commit 9a7c038
Author: Nijso <nijso@hotmail.com>
Date:   Tue Sep 20 16:22:44 2022 +0200

    FFD box fix for nonplanar faces (su2code#1742)

    * FFD box uses supporting point in the middle of the face to avoid ambiguous definition of nonplanar faces

    * implementation of github review comments (formatting and deleting memory)

    * Apply suggestions from code review

    Co-authored-by: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>

commit 87f451e
Author: Pedro Gomes <38071223+pcarruscag@users.noreply.github.com>
Date:   Mon Sep 19 12:41:17 2022 +0100

    Update SU2_CFD/include/numerics_simd/flow/convection/common.hpp

    Co-authored-by: Nijso <nijso@hotmail.com>

commit 37b4f58
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 19 12:39:15 2022 +0100

    updates after simd size of 1 for reverse AD

commit 4357145
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 18 21:24:46 2022 +0100

    regression updates and SIMD size 1 for reverse AD

commit c0a8d7c
Merge: 6a680ec 88c8392
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 17 16:22:47 2022 +0100

    Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible

commit 2c016a7
Merge: 11ee7fa 88c8392
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Fri Sep 16 16:04:49 2022 +0200

    Merge branch 'develop' into misc_fixes

commit 11ee7fa
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Fri Sep 16 16:03:32 2022 +0200

    Adapt tutorials.py and vandv.py.

commit 6a680ec
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Wed Sep 14 21:45:47 2022 +0100

    work around some forward mode issues

commit c372fa2
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 14 19:15:03 2022 +0200

    Fix.

commit 4a54740
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Wed Sep 14 19:14:55 2022 +0200

    Ensure to pass killall the correct process name.

commit 88bdc9d
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Tue Sep 13 15:55:18 2022 +0200

    Adapt ninja command for builds outside the SU2 directory.

commit 713fb81
Author: Johannes Blühdorn <johannes.bluehdorn@scicomp.uni-kl.de>
Date:   Tue Sep 13 15:51:35 2022 +0200

    Fix const inconsistency between declaration/definition.

commit 1166497
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 12 18:33:51 2022 +0100

    serial and v&v tests

commit c808e8d
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 12 12:41:00 2022 +0100

    update parallel regression

commit ac58780
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 12 12:31:03 2022 +0100

    update hybrid regression

commit 41aa1d8
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Mon Sep 12 11:54:21 2022 +0100

    prevent nans with wall functions

commit ef4de7d
Merge: b8515c7 013c3cd
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 23:00:56 2022 +0100

    Merge remote-tracking branch 'upstream/develop' into vectorize_when_possible

commit b8515c7
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 23:00:11 2022 +0100

    MG segfault

commit 004b67b
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 21:20:30 2022 +0100

    fix leak

commit 3f6af89
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 18:57:31 2022 +0100

    this is what "using namespace std" does

commit c841509
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 15:04:56 2022 +0100

    proper fix

commit 5f0e319
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 14:19:19 2022 +0100

    fix build and some doxygen cleanup

commit 2edca53
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sun Sep 11 11:45:53 2022 +0100

    automatically use vectorized Roe when possible

commit 6c53381
Author: Pedro Gomes <pcarruscag@gmail.com>
Date:   Sat Sep 10 22:01:07 2022 +0100

    use the non physical counter in SIMD numerics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants