Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another charge against pointer to pointer #1312

Merged
merged 11 commits into from
Jul 18, 2021
Merged

Another charge against pointer to pointer #1312

merged 11 commits into from
Jul 18, 2021

Conversation

pcarruscag
Copy link
Member

Proposed Changes

Introduce a "matrix view" type to avoid ** for example when passing gradients of primitives into the numerics.

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags, or simply --warnlevel=2 when using meson).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp) , if necessary.

Comment on lines -1263 to +1262
visc_numerics->SetPrimVarGradient(nodes->GetGradient_Primitive(iPoint), Grad_Reflected);
visc_numerics->SetPrimVarGradient(nodes->GetGradient_Primitive(iPoint), CMatrixView<su2double>(Grad_Reflected));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matrix types have an explicit conversion to this "matrix view", i.e. you can pass su2activematrix to CNumerics instead of a su2double**

CNumerics::ComputeStressTensor(nVar, tau, flow_nodes->GetGradient_Primitive(Point_Flow)+1,Viscosity);
CNumerics::ComputeStressTensor(nVar, tau, flow_nodes->GetGradient_Primitive(Point_Flow,1), Viscosity);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preferred way to get the gradient with an offset, although the "+" arithmetic still works.

SetVolumeOutputValue("Q_CRITERION", iPoint, GetQ_Criterion(&(Node_Flow->GetGradient_Primitive(iPoint)[1])));
SetVolumeOutputValue("Q_CRITERION", iPoint, GetQ_Criterion(Node_Flow->GetGradient_Primitive(iPoint,1)));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this (taking the address) will no longer work.

Comment on lines -115 to -121
* \brief This contrived container is used to store small matrices in a contiguous manner
* but still present the "su2double**" interface to the outside world.
* The "interface" part should be replaced by something more efficient, e.g. a "matrix view".
*/
class CVectorOfMatrix: public C3DDoubleMatrix {
private:
su2matrix<Scalar*> interface;
Copy link
Member Author

@pcarruscag pcarruscag Jun 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main motivation for this change is to store "vectors of matrices" e.g. gradients in a contiguous way without needing an extra matrix of pointers to provide the su2double** compatibility with other areas of the code.

@@ -659,6 +627,58 @@ class CNumerics {
}
}

/*!
* \brief Project average gradient onto normal (with or w/o correction) for viscous fluxes of scalar quantities.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To reduce duplication

Copy link
Contributor

@WallyMaier WallyMaier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcarruscag thanks for the work!

This looks good to me.

@@ -1053,8 +1053,7 @@ void CFVMFlowSolverBase<V, R>::BC_Sym_Plane(CGeometry* geometry, CSolver** solve
Tangential[MAXNDIM] = {0.0}, GradNormVel[MAXNDIM] = {0.0}, GradTangVel[MAXNDIM] = {0.0};

/*--- Allocation of primitive gradient arrays for viscous fluxes. ---*/
su2double** Grad_Reflected = new su2double*[nPrimVarGrad];
for (iVar = 0; iVar < nPrimVarGrad; iVar++) Grad_Reflected[iVar] = new su2double[nDim];
su2activematrix Grad_Reflected(nPrimVarGrad, nDim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much nicer, and less of a memory headache. 👍

@@ -198,7 +198,8 @@ CNumerics::ResidualType<> CUpwMSW_NEMO::ComputeResidual(const CConfig *config) {
epsilon*epsilon));

/*--- Compute projected P, invP, and Lambda ---*/
CreateBasis(UnitNormal);
su2double l[MAXNDIM], m[MAXNDIM];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for untangling some of this stuff. Its on the list to go through and completely abstract away....eventually.

@@ -674,7 +674,7 @@ CNumerics::ResidualType<> CSourceWindGust::ComputeResidual(const CConfig* config
smx = rho*(du_gust_dt + (u+u_gust)*du_gust_dx + (v+v_gust)*du_gust_dy);
smy = rho*(dv_gust_dt + (u+u_gust)*dv_gust_dx + (v+v_gust)*dv_gust_dy);
//smz = rho*(dw_gust_dt + (u+u_gust)*dw_gust_dx + (v+v_gust)*dw_gust_dy) + (w+w_gust)*dw_gust_dz;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


SoundSpeed_i = sqrt(ProjVelocity_i*ProjVelocity_i + (BetaInc2_i/DensityInc_i)*Area*Area);
SoundSpeed_j = sqrt(ProjVelocity_j*ProjVelocity_j + (BetaInc2_j/DensityInc_j)*Area*Area);
su2double SoundSpeed_i = sqrt(pow(ProjVelocity_i,2) + (BetaInc2_i/DensityInc_i)*Area2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own knowledge, is pow(x,2) superior to x*x?

Copy link
Member Author

@pcarruscag pcarruscag Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more expressive, saves you the trouble of reading two variable names to make sure they are the same.
In terms of performance, the pow function is horrible, however, pow(x,2) always gets optimized to x*x.
Larger integer powers are only optimized in that way if some -ffast-math optimizations are allowed, since e.g. x*x*x is not a strict way to compute pow(x,3) (as per the floating-point standard).
That type of optimization is fairly innocuous in double-precision, the intel compilers actually do them by default.

const Vec2& var_i, const Vec2& var_j,
su2double* projNormal,
su2double* projCorrected) {
nDim = (nDim > 2)? 3 : 2;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This keeps the compiler from going crazy with loop unrolling thinking that nDim is large, and then creating unrolled code that is never used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it is time for

template<class nDim> 
class SU2_CFD { }

:) because this looks really weird, jk

btw I guess this loop unrolling thing is sth you checked with compiler explorer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should add an assert though, for 2 or 3.
Yes compiler explorer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course this is a valid check, but can you motivate that a bit why you think it is additionally necessary here? I guess if we would do this kind of sanity checks of sizes in more places, we would have quite a bit of them. At first glance that looks like a bit overkill but maybe i just dont understand it.

And might this already be enough for the compiler to not unroll further because it puts a hard constraint on the nDim value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the code is only right if called with nDim = 2 or 3, and it is good practice to assert that kind of assumption to facilitate debugging, keep in mind that asserts are no-ops in release builds!
This is also why they cannot be used to inform the compiler that we expect 2 or 3 -> https://gcc.godbolt.org/z/nbPv8KhPo

Copy link
Contributor

@TobiKattmann TobiKattmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks pedro for this 💐 which is a bit more of just a charge against pointer-to-pointer. This ComputeProjectedGradient func and the simplification of the viscous terms numerics for the heat solver. And the const-ing and cleanup.
I have a few things below, none of which are dealbreakers (mostly questions I guess)

Comment on lines +58 to +65
const Scalar* operator[] (Index i) const noexcept { return &m_ptr[i*m_cols]; }
const Scalar& operator() (Index i, Index j) const noexcept { return m_ptr[i*m_cols + j]; }

template<class U = T, su2enable_if<!std::is_const<U>::value> = 0>
Scalar* operator[] (Index i) noexcept { return &m_ptr[i*m_cols]; }

template<class U = T, su2enable_if<!std::is_const<U>::value> = 0>
Scalar& operator() (Index i, Index j) noexcept { return m_ptr[i*m_cols + j]; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this detects whether a var on the receiving side is const and then uses the latter versions of [] and () operators?
Edit: Or the matrix view - object has to be created with this property already and then it decides and what to do? I guess the latter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm no..
If the matrix view object is used in a const context then it returns const refs to su2double. Otherwise it will "try" to return non const ref.
However, if the view was returned by a const container, it cannot do that. Hence the use of sfinae to disable the non const version.

template<class U = T, su2enable_if<!std::is_const<U>::value> = 0>
Scalar& operator() (Index i, Index j) noexcept { return m_ptr[i*m_cols + j]; }

friend CMatrixView operator+ (CMatrixView mv, Index incr) { return CMatrixView(mv[incr], mv.m_cols); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this allows for the +1 logic which is equivalent to (i,j) logic to allow for similar use as with pointers. This of course might break less code of people and their code out there when pulling this, so 👍 ... but on the other hand might give some false security that the underlying datatype is still the same?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the type is not really important in this case, just the interface. And there is too much of this +stuff, so less work for me xD.

Comment on lines +124 to +125
Matrix operator[] (Index i) noexcept { return Matrix(m_storage[i], m_innerSz); }
ConstMatrix operator[] (Index i) const noexcept { return ConstMatrix(m_storage[i], m_innerSz); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So C3DContainerDecorator owns the data and can return MatrixView objects that provide access to the 2D structure for one entry after the first layer (or the 3rd dimension to stay in this dimension language)?

Copy link
Member Author

@pcarruscag pcarruscag Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was influenced by Eigen (inception style) for the "view"

SU2_CFD/include/numerics/CNumerics.hpp Show resolved Hide resolved
const Vec2& var_i, const Vec2& var_j,
su2double* projNormal,
su2double* projCorrected) {
nDim = (nDim > 2)? 3 : 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it is time for

template<class nDim> 
class SU2_CFD { }

:) because this looks really weird, jk

btw I guess this loop unrolling thing is sth you checked with compiler explorer?

Comment on lines -209 to +187
/*--- Compute vector going from iPoint to jPoint ---*/
auto proj_vector_ij = ComputeProjectedGradient(nDim, nVar, Normal, Coord_i, Coord_j, ConsVar_Grad_i,
ConsVar_Grad_j, correct, &Temp_i, &Temp_j,
NormalGrad, CorrectedGrad);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cut-paste-spezialize in action :)

Comment on lines +380 to +381
const auto Temp_i_Grad = nodes->GetGradient(iPoint);
const auto Temp_j_Grad = nodes->GetGradient(jPoint);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now auto evaluates to the MatrixView type, right (that returns const values on its own)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Before it was a const pointer that could return non const

SU2_CFD/src/solvers/CHeatSolver.cpp Show resolved Hide resolved
@@ -1,2 +1,2 @@
"VARIABLE" , "AVG_DENSITY[0]", "AVG_ENTHALPY[0]", "AVG_NORMALVEL[0]", "DRAG[0]" , "EFFICIENCY[0]" , "FORCE_X[0]" , "FORCE_Y[0]" , "FORCE_Z[0]" , "LIFT[0]" , "MOMENT_X[0]" , "MOMENT_Y[0]" , "MOMENT_Z[0]" , "SIDEFORCE[0]" , "SURFACE_MACH[0]", "SURFACE_MASSFLOW[0]", "SURFACE_MOM_DISTORTION[0]", "SURFACE_PRESSURE_DROP[0]", "SURFACE_SECONDARY[0]", "SURFACE_SECOND_OVER_UNIFORM[0]", "SURFACE_STATIC_PRESSURE[0]", "SURFACE_STATIC_TEMPERATURE[0]", "SURFACE_TOTAL_PRESSURE[0]", "SURFACE_TOTAL_TEMPERATURE[0]", "SURFACE_UNIFORMITY[0]", "AVG_TEMPERATURE[1]", "MAXIMUM_HEATFLUX[1]", "TOTAL_HEATFLUX[1]", "FINDIFF_STEP"
0 , 0.0 , -99999.96982514858, 2.2204999999731917e-08, 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , -0.04999999997368221, -1.1100000001884e-08, -2.069999999187999 , 0.0 , 2.12000000054946 , 3.7100000016554446 , 330.00000030369847 , -39.999997625272954 , 315.0000011942211 , -40.000008993956726 , -1.400000004814217 , -149.9999996212864, 0.0 , -469.9999976764957, 1e-08
0 , 0.0 , -99999.96982514858, 3.330700000025998e-08, 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , 0.0 , -0.04999999997368221, -1.1100000001884e-08, -2.069999999187999 , 0.0 , 2.12000000054946 , 3.7100000016554446 , 330.00000030369847 , -39.999997625272954 , 315.0000011942211 , -40.000008993956726 , -1.400000004814217 , -149.9999996212864, 0.0 , -469.9999976764957, 1e-08
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔎 what's going on here? (breakfast 🐞 obviously, but jokes aside) was this a bug? I guess it is 1e-8 land ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My different handling of division by zero maybe, or the optimizability of 2 or 3

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give me place to look for the different handling of division by zero 🔬 (so I dont have to think for myself)

@@ -345,7 +345,7 @@ def main():
turb_naca0012_sst.cfg_dir = "rans/naca0012"
turb_naca0012_sst.cfg_file = "turb_NACA0012_sst.cfg"
turb_naca0012_sst.test_iter = 10
turb_naca0012_sst.test_vals = [-11.451011, -12.798258, -5.863895, 1.049989, 0.019163, -1.925028]
turb_naca0012_sst.test_vals = [-11.451010, -12.798258, -5.863895, 1.049989, 0.019163, -1.925018]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a bit of trouble in finding an explanation from the changes why these few have marginal differences ... because I would then expect more to fail ... but maybe you can shed some light on this if you have an idea

Co-authored-by: TobiKattmann <31306376+TobiKattmann@users.noreply.github.com>
dist_ij_2 += pow(edgeVec[iDim], 2);
proj_vector_ij += edgeVec[iDim] * normal[iDim];
}
proj_vector_ij /= max(dist_ij_2,EPS);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

div by 0

@pcarruscag pcarruscag merged commit 22bb669 into develop Jul 18, 2021
@pcarruscag pcarruscag deleted the gradient_type branch July 18, 2021 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants