
New Nonlinear Iteration Controller for Improved Fluid Solver Robustness & Convergence #790

Merged (82 commits) on Nov 28, 2019

Conversation

@economon (Member) commented Sep 14, 2019

Proposed Changes


This PR greatly improves the robustness and convergence behavior of the flow solvers by implementing a more sophisticated controller for nonlinear iterations along with several other important modifications.

The changes are the following:

  • New under-relaxation strategy. Variables such as density and energy are only allowed to change by +/- 20% with each nonlinear iteration by applying an under-relaxation parameter to the updates at each node, which helps with difficult transients, especially at calculation startup. This is also applied to turbulence model updates (though the threshold is larger).
  • CFL adaption via an exponential progression coupled to the under-relaxation approach, i.e., changes to the CFL are driven by the under-relaxation parameter (a small under-relaxation parameter decreases the CFL, a large one increases it). The CFL is now locally defined for each CV for this process.
  • Modified realizability checks to avoid negative density/pressure/temperature at the nodes and also during 2nd-order reconstruction at cell faces.
  • CV face states for 2nd-order MUSCL are now optionally reconstructed with a separate gradient method (unweighted least squares recommended), which is very important for robustness. Gradients for viscous terms and sources are still computed via GG or WLS for accuracy reasons.
  • SA model was being clipped unnecessarily. This clipping has been removed.
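To illustrate the under-relaxation idea in the first bullet, here is a minimal sketch of such a clamp (illustrative only, not the SU2 implementation; the function name and the scan over variables are assumptions):

```python
import numpy as np

def compute_under_relaxation(solution, delta, max_change=0.2):
    """Choose a relaxation factor so that no variable (e.g. density,
    energy) changes by more than +/- max_change in one nonlinear
    iteration. Illustrative sketch, not the SU2 code."""
    relax = 1.0
    for u, du in zip(solution, delta):
        if u != 0.0 and du != 0.0:
            ratio = abs(du / u)
            if ratio > max_change:
                # Scale the whole update so the worst variable
                # changes by exactly max_change.
                relax = min(relax, max_change / ratio)
    return relax

# Example: a proposed 50% jump in density is scaled back to a 20% change.
sol = np.array([1.0, 2.5])   # density, energy at a node
dlt = np.array([0.5, 0.1])   # proposed update
r = compute_under_relaxation(sol, dlt)
new = sol + r * dlt          # density moves from 1.0 to 1.2
```

The same scalar then feeds the CFL adaption described in the next bullet: a factor near 1 means the update was healthy, a small factor means the solver is struggling.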

Here is my recommendation on how to run all of your 2nd-order cases going forward:

Euler / Laminar N-S

Turn off the limiters, unless you truly need them for preserving solution monotonicity (e.g., for shocks/discontinuities). Use

NUM_METHOD_GRAD_RECON= LEAST_SQUARES

for robustness. Start with a CFL around 10 and converge the implicit system with LINEAR_SOLVER_ITER= 10 for each nonlinear iteration (FGMRES + ILU0 recommended). Activate the CFL adaption with

% Parameters of the adaptive CFL number (factor down, factor up, CFL min value,
%                                        CFL max value )
CFL_ADAPT= YES
CFL_ADAPT_PARAM= ( 0.1, 2.0, 10.0, 1e10)
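The four values are (factor down, factor up, CFL min, CFL max). A minimal sketch of how a controller of this kind might couple the local CFL to the under-relaxation parameter (illustrative only; the thresholds and the exact coupling rule are assumptions, not the SU2 implementation):

```python
def adapt_cfl(cfl, relax, factor_down=0.1, factor_up=2.0,
              cfl_min=10.0, cfl_max=1e10,
              low_thresh=0.1, high_thresh=0.95):
    """Sketch: a small under-relaxation parameter means the solver is
    struggling, so scale the local CFL down; relaxation near 1 means
    the update was healthy, so scale it up. Thresholds are assumed."""
    if relax < low_thresh:
        cfl *= factor_down
    elif relax > high_thresh:
        cfl *= factor_up
    # Clip to the (min, max) bounds from CFL_ADAPT_PARAM.
    return min(max(cfl, cfl_min), cfl_max)

print(adapt_cfl(100.0, 1.0))   # healthy update -> 200.0
print(adapt_cfl(100.0, 0.05))  # struggling -> scaled down toward cfl_min
```

Because the CFL is stored per control volume, cells in difficult regions can back off while the rest of the field keeps a large time step.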

It is important that the linear solve behaves well for each nonlinear iteration, so if this does not converge immediately, adjust the tolerance or number of linear iterations. Roe seems very stable, but some of the other upwind schemes or centered schemes may require improved Jacobian accuracy (to recover fast convergence) with:

USE_ACCURATE_FLUX_JACOBIANS= YES
CENTRAL_JACOBIAN_FIX_FACTOR= 4.0

RANS

Due to the segregated approach for the turbulence models, you have to be more conservative with the CFL adaption. All else is the same as above, but I recommend

% Parameters of the adaptive CFL number (factor down, factor up, CFL min value,
%                                        CFL max value )
CFL_ADAPT= YES
CFL_ADAPT_PARAM= ( 0.1, 1.2, 10.0, 1e3)

Most inviscid/laminar cases that I have tested so far converge to machine precision within a couple of hundred iterations, and RANS cases within a few thousand. Ultra-fine RANS meshes can still slow things down (MG could help there), but otherwise, speedups of an order of magnitude (or more) are possible for most problems. Experiment yourselves; sometimes you can get even more performance, sometimes less, depending on the case.

I could use help testing this one with your cases, since I think almost all regression tests will change. As a start, I have run all of the NASA TMR 2D bump-in-channel cases. Everything looks good with that (see figures). I have also checked that the inviscid and laminar MMS give 2nd-order accuracy properly.

[Figures: grid convergence of C_D, C_f at x = 0.63, 0.75, 0.87, and C_L for the bump-in-channel; residual convergence history]

Related Work


Gradient method strategy in #789

PR Checklist


  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.

@jayantmukho (Contributor)

YES! I am excited to try this out. I can probably test it on some of the other TMR cases (airfoils, flatplates). Will post the results when I get those done.

Side note, there was one issue that @bmunguia and I encountered when performing optimizations with adaptive CFL. Say the DIRECT simulation is run with adaptive CFL and is well converged (6 to 8 orders of residual reduction). When the discrete adjoint performs the one direct iteration to store the computational graph, it uses the initial CFL value, not the CFL that the adaptive CFL routine ended at. This results in the residuals being very high for that one iteration, which then affects the convergence of the discrete adjoint.

I will try to run an adjoint on one of these cases as well to see if the problem persists. Perhaps this could be overcome with a simple additional field for the CFL in the restart meta-data.

@pcarruscag (Member) left a comment

Do I understand correctly that two gradients are always computed per iteration?
Is there any downside to using the unweighted LS for viscous flux correction? Is the statement that this type of gradient is better for reconstruction based on your observations or is it one of those well known things?

Sometimes high CFL leads to limit-cycle oscillations of the residuals and the solution is to reduce it, is this something this controller can pick up?
High CFL also makes the linear systems harder to solve and as Edwin pointed out somewhere there is not much point going above reasonable values with weakly coupled turbulence. Do you think it would be reasonable to build in some feedback from the linear solver (e.g. it is taking too much time or did not converge -> drop the CFL)?

@economon (Member, Author)

Do I understand correctly that two gradients are always computed per iteration?
Is there any downside to using the unweighted LS for viscous flux correction? Is the statement that this type of gradient is better for reconstruction based on your observations or is it one of those well known things?

Yes - the gradient for now is computed twice and stored separately for viscous flows with 2nd-order upwind. Could be combined into one loop eventually.

It is known that weighted LSQ / GG is more accurate (see Mavriplis, "Revisiting the Least-Squares Procedure for Gradient Reconstruction on Unstructured Meshes", for instance). However, unweighted LSQ is known to be more robust, so a good compromise is to use it only for the reconstruction step (which is more susceptible to robustness issues than the viscous term) and then use WLSQ or GG for all other gradients in the viscous flux/sources for accuracy.
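For concreteness, here is a minimal sketch of the two least-squares variants being discussed (illustrative only, not the SU2 gradient kernels). For a linear field both recover the exact gradient; the accuracy/robustness trade-off only shows up on stretched or curved stencils with nonlinear data:

```python
import numpy as np

def lsq_gradient(x0, u0, xn, un, weighted=False):
    """Least-squares gradient at a point from its neighbors.
    weighted=True applies inverse-distance weights (the WLSQ variant);
    weighted=False is the unweighted variant recommended above for
    reconstruction. Illustrative sketch only."""
    dx = xn - x0                 # (n_neigh, n_dim) position deltas
    du = un - u0                 # (n_neigh,) value deltas
    if weighted:
        w = 1.0 / np.linalg.norm(dx, axis=1)
        dx = dx * w[:, None]
        du = du * w
    # Solve the overdetermined system dx @ g ~= du in the LSQ sense.
    g, *_ = np.linalg.lstsq(dx, du, rcond=None)
    return g

# Linear field u = 2x + 3y: both variants recover the exact gradient.
x0 = np.array([0.0, 0.0]); u0 = 0.0
xn = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
un = 2.0 * xn[:, 0] + 3.0 * xn[:, 1]
g = lsq_gradient(x0, u0, xn, un)
```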

Sometimes high CFL leads to limit-cycle oscillations of the residuals and the solution is to reduce it, is this something this controller can pick up?
High CFL also makes the linear systems harder to solve and as Edwin pointed out somewhere there is not much point going above reasonable values with weakly coupled turbulence. Do you think it would be reasonable to build in some feedback from the linear solver (e.g. it is taking too much time or did not converge -> drop the CFL)?

Yes, I would also like to couple it to the linear solver so that we can remove the need to tune that as well. Ideally the user will not need to adjust parameters. There are some things I am going to try for that..

@pcarruscag (Member) left a comment

I have been through the paper and I can see why unweighted LS is more robust. It massively underpredicts gradients in curved boundary-layer regions, thereby making the schemes locally first order. Mavriplis shows how this makes the use of entropy correction very problematic, i.e., it destroys the solution accuracy for even modest values.
Therefore I do not agree with forcing its use on people. Computing and storing a second (inaccurate) gradient seems like a very high cost for something that could be achieved by a geometry-based limiter (or a limiter modification); the metrics you introduced should make that very simple.
With that said I would not object to having it as a third option, but used for both convective and diffusive fluxes.

@economon (Member, Author)

@pcarruscag: the approach to use different gradient methods for the convective and viscous terms is relatively common across codes. If it is a computational cost concern, this can be addressed by fusing the kernels.

However, the bigger issue at hand is the other comment that is made in the article about the choice between LSQ and WLSQ/GG for 2nd-order upwind reconstructions: while less accurate, the former is much more robust, while the latter two usually require limiters just to obtain a stable solution on stretched RANS-type meshes, even for flows where we do not expect shocks/discontinuities. The problem with always requiring a limiter is that they stall convergence due to chatter, which apart from the obvious problems, also causes issues for the adjoint. I think everyone has experienced this.

Hence the compromise to use both gradients as I note above, which is also stated on p. 9 of Anderson and Bonhaus "An Implicit Upwind Algorithm for Computing Turbulent Flows on Unstructured Grids." In that paper they mention an additional interesting point that, in their numerical tests, LSQ outperforms WLSQ/GG for reconstructing nonlinear data at interfaces on highly-stretched meshes. Note the WLSQ/GG gradients are used for the viscous terms, which is important for accuracy. This type of approach is still applied in FUN3D.

If there are any accuracy concerns, we can also address those via our typical verification tools. I agree that we should continue along the path of looking at grid quality issues and how they impact the numerics (potentially adding some grid-based corrections / limiting), but I think the proposed approach in this PR will serve us well.

@pcarruscag (Member)

Regardless of fusing routines, computing two gradients is more expensive than computing one, and the method is full of drawbacks, so I do not think it should be a forceful default.
Can this be implemented as a USE_ROBUST_GRADIENT option that, if set to NO, uses whatever gradient method is chosen for both convection and diffusion, computing and allocating only once? I do not care what the default is; I just want to be able to turn it off.

@EduardoMolina (Contributor)

Hi @economon ,

Thanks for the effort in implementing this PR focused on robustness and convergence.

However, I agree with @pcarruscag that we need to give the user the ability to choose the gradient method that they think is appropriate. Although this PR sounds promising, we haven't tested it on complex geometries, e.g., the High Lift PW or Drag PW cases (please correct me if I am wrong). Moreover, in some cases the more robust gradient method may not even be necessary, since the CFL number is kept small for accuracy reasons, e.g., in hybrid RANS/LES simulations like DDES.

Thanks again,
Eduardo

@economon (Member, Author)

The motivation of having it as the default was to make the code as user-friendly as possible (fewer knobs exposed in the config), but options are good of course.

I would propose then that we add an option for the reconstruction gradient, something like:

NUM_METHOD_GRAD_RECON= LEAST_SQUARES

to let users decide if they want a separate option for the reconstruction gradients. If it does not appear, then the default is to use the same method as defined by NUM_METHOD_GRAD without a second gradient computation (basically what we have now). The nice thing about that is we can even try out other combos such as WLS+GG for the two different gradients. I will throw an error if users try to use LSQ for the viscous/source gradients, to avoid accuracy issues.

What do you think?

@pcarruscag (Member)

I agree with having suitable defaults that make the code robust, but "expert" options should still be available.
Having separate controls also sounds good, I would not throw an error though, maybe just a warning.

If you detect the two gradient options to be the same, the associated CVariable classes make the reconstruction gradient point to the primitive gradient instead of allocating a new one, and the call to compute the reconstruction gradient is skipped. If this is what you have in mind, I support it 100%.
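The aliasing idea can be sketched as follows (class and attribute names here are illustrative, not the actual SU2 CVariable API):

```python
import numpy as np

class FlowVariables:
    """Sketch of the aliasing pattern discussed above: allocate a
    separate reconstruction-gradient array only when the two gradient
    methods differ; otherwise alias the primitive gradient so no extra
    memory is used and the second computation can be skipped."""
    def __init__(self, n_points, n_vars, n_dim,
                 grad_method="WEIGHTED_LEAST_SQUARES",
                 grad_recon_method=None):
        self.gradient = np.zeros((n_points, n_vars, n_dim))
        if grad_recon_method is None or grad_recon_method == grad_method:
            # Same method requested: reconstruction gradient is just a
            # reference to the primitive gradient (no allocation).
            self.gradient_recon = self.gradient
        else:
            self.gradient_recon = np.zeros((n_points, n_vars, n_dim))

a = FlowVariables(4, 5, 3)                                     # aliased
b = FlowVariables(4, 5, 3, grad_recon_method="LEAST_SQUARES")  # separate
```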

(commit) …nt for scalar equations. Adding checks for linear and nonlinear residual reductions in CFL adaption which need testing.
@koodlyakshay (Member)

Hi @economon ,

I tested a couple of cases, laminar flow over a cylinder and a turbulent flat plate, and the convergence is greatly enhanced. I have attached a plot of the convergence history for the flat plate case. I didn't have much luck with the laminar backward-facing step case, though; the residuals tend to oscillate around -5.5. I was planning to run the NACA 0012 test case soon. Are there any cases you specifically wanted tested? I can run some of them soon.
[Figure: convergence history for the turbulent flat plate]

Cheers,
Akshay

@economon (Member, Author)

I have added the new option NUM_METHOD_GRAD_RECON to specify a separate method for computing the reconstruction gradient. If that option is not present, then no additional memory is allocated and no extra gradient computation occurs.

I have also put in simple feedback from the linear solver residual and the nonlinear residual to the nonlinear controller. If the linear system converges by less than half an order of magnitude, the CFL is lowered. A Cauchy-like criterion checks for stall in the nonlinear residuals and drops the CFL to the minimum floor to kick the solver out of a rut. Both of these use factors that are empirical from my tests. They will probably be improved with time and more testing, but they do seem to improve behavior.
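A minimal sketch of the two feedback paths just described (the thresholds, factors, and window size here are assumptions, not the empirical values in the PR):

```python
def controller_feedback(cfl, lin_res_drop_orders, res_history,
                        cfl_min=10.0, stall_window=10, stall_tol=1e-3):
    """Sketch of the nonlinear controller feedback (assumed values):
    1) if the linear solve reduced its residual by less than half an
       order of magnitude, lower the CFL;
    2) a Cauchy-like check on the recent nonlinear residual history
       drops the CFL to its floor when convergence has stalled."""
    if lin_res_drop_orders < 0.5:
        cfl *= 0.5                       # assumed reduction factor
    if len(res_history) >= stall_window:
        recent = res_history[-stall_window:]
        if max(recent) - min(recent) < stall_tol:
            cfl = cfl_min                # kick the solver out of a rut
    return max(cfl, cfl_min)

print(controller_feedback(200.0, 0.2, [-5.0] * 3))      # weak linear solve -> 100.0
print(controller_feedback(200.0, 2.0, [-5.5] * 12))     # stalled residuals -> 10.0
```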

@koodlyakshay : I had success with the inc. laminar backward facing step after adding extra iterations to the linear solve. For some cases, this is necessary to get a large speedup. I am seeing good speedup for most of the cases within our TestCases repo. If you have some tough cases not covered by the repo, please give those a try.

@pcarruscag pcarruscag dismissed their stale review September 30, 2019 10:03

Comments addressed

@economon (Member, Author)

Nice work, and thanks for sharing, @jayantmukho. I expect you'll add everything to the V&V repo and eventually the V&V tab once we understand what is happening?

The GG results look great, potentially even better than the previous set of results we had for the DSMA case. I am also surprised to see the behavior of the GG+LSQ; the finest mesh seems especially errant. I have been running the NACA 0012 case, and I also see that the results with pure GG or WLSQ are slightly better there, but not this drastically.

I think we still need to dial in the LSQ and make sure we do not have any bugs. That said, the flat plate and bump-in-channel cases were run with LSQ and gave very good results. It could be that too much curvature in the grid, especially near walls (a known potential problem), is hurting the LSQ accuracy, but I'm surprised it would be this significant. Still looking into this...

As for adaptive CFL, sometimes I find that just turning it off for some RANS cases and using a fixed 250, 500, or 1000 works best. Does that work for you with these cases with GG or WLSQ, @jayantmukho?

@jayantmukho (Contributor) commented Nov 13, 2019

  • The meshes are available here: https://github.com/su2code/VandV/tree/master/rans/dsma661

  • The config files for the GG + LSQ results are here: dsma661_configs.zip

  • I did not run the other solvers myself. Those are results from the NASA TMR website. I don't have access to those solvers unfortunately.

  • Yeah, the non-monotonic variation worried me as well. But all the simulations were converged to a density residual of -13 (all of them have over 6 orders of residual reduction). The residuals for the GG+LSQ results are shown here:
    [Figure: res_SA residual history]
    I tried a few re-runs of that finest mesh with a couple of different options. I reduced the maximum CFL to ~30 from 1000, and it still gave the same result. I reduced the number of cores I was running on from 80 to 20, and that had no effect either. As soon as I ran without the NUM_METHOD_GRAD_RECON= LEAST_SQUARES option, I got a C_L of about 0.159402, which is more in line with the other solvers.

@economon I haven't tried a high fixed CFL. Let me check that behavior and report back.

@jayantmukho (Contributor)

@economon The behavior with adaptive CFL is similar to the non-adaptive behavior: I cannot increase the CFL beyond a certain value and still get convergence. For example, for the 2nd coarsest mesh (297 x 57 or 129) I cannot increase the CFL above 20, regardless of whether I use adaptive CFL or not.

But this limit increases slightly for the finer meshes. For example, on the finest mesh I can push the CFL up to 30 (adaptive or otherwise). Basically, I cannot get high CFLs with the GG reconstruction. I should also mention that all these runs are without slope limiters, to get the most accurate solutions.

@economon (Member, Author)

To prepare for v7, I have now updated most of the tutorials to take advantage of the new features in this PR. I have also completed a sweep through all regression tests using this branch and have verified that the tests are ok. So, from my point of view, this is ready to be merged, other than updating the final residuals for a few tests which are pending.

I will keep looking into the LSQ issues, so for now, we should be cautious with that option.

@pcarruscag pcarruscag mentioned this pull request Nov 25, 2019
@economon (Member, Author)

Ok, it's finally time to get this one merged! Thanks all for the comments and help.

I'll merge this one in now as-is. I will leave the Tutorials branch in the regressions script set to the 'release_v7.0.0' branch, since we are collecting all changes to the website there for the release. @talbring @rsanfer : can you please revert that back to develop in the Release v7 PR when you are ready (after releasing the web updates)?

@economon economon merged commit 69b9988 into develop Nov 28, 2019
@economon economon deleted the feature_nlctrl branch November 28, 2019 07:13
@jayantmukho jayantmukho mentioned this pull request Jul 2, 2020
clarkpede added a commit to pecos-hybrid/SU2 that referenced this pull request Jul 9, 2020