Staging/UKMO Oct 2021 #499

Merged (8 commits) on Nov 5, 2021


ukmo-ccbunney
Collaborator

@ukmo-ccbunney ukmo-ccbunney commented Oct 20, 2021

Pull Request Summary

Staging branch comprising the following individual PRs:

Description

See individual PRs for full info.

Issue(s) addressed

Check list

  • Is your feature branch up to date with the authoritative repository (NOAA/develop)? YES

  • Please list appropriate labels code managers should add for this PR: bug

  • Reviewers: @aliabdolali

Commit Message

UKMO Staging Oct 2021:

Testing

  • How were these changes tested? Regtests, Local testing
  • Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Yes
  • If a new feature was added, was a new regression test added? No
  • Have regression tests been run? Yes
  • Which compiler / HPC you used to run the regression tests in the PR? Cray HPC; GNU Fortran

Expected changes:

  • ww3-tp2.15/ST[46]FLX5: Differences expected in the output point and grid files due to a bug in the W3SRCE call being fixed.
  • mww3_test_03: Known not B4B.
**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR2_UQ_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (9 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (9 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (6 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (9 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (9 files differ)
mww3_test_03/./work_PR2_UQ_MPI_d2                     (10 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e                     (1 files differ)
mww3_test_03/./work_PR3_UNO_MPI_e_c                     (1 files differ)
ww3_tp2.15/./work_ST4FLX5                     (5 files differ)
ww3_tp2.15/./work_ST6FLX5                     (5 files differ)

Regression test results:
matrixComp_staging_ukmo_oct2021.zip

@ukmo-ccbunney added the bug label Oct 20, 2021
This was referenced Oct 20, 2021
@aliabdolali
Contributor

Hi @ukmo-ccbunney
I got a failure in ww3_shel execution for the following tests:
run_test -b slurm -c hera.intel -S -T -s ST4 -i input_rho -w work_ST4FLX5 -o netcdf ../model ww3_tp2.15
run_test -b slurm -c hera.intel -S -T -s ST6 -i input_rho -w work_ST6FLX5 -o netcdf ../model ww3_tp2.15

  WAVEWATCH III calculating for 2014/03/10 00:00:00 UTC at 00:04:16
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
ww3_shel           00000000006815CD  Unknown               Unknown  Unknown
libpthread-2.17.s  00007F35F1DDC630  Unknown               Unknown  Unknown
ww3_shel           00000000004F6D1C  w3srcemd_mp_w3src        1302  w3srcemd.F90
ww3_shel           00000000004A13E0  w3wavemd_mp_w3wav        2689  w3wavemd.F90
ww3_shel           000000000040C60B  MAIN__                   2776  ww3_shel.F90
ww3_shel           000000000040349E  Unknown               Unknown  Unknown
libc-2.17.so       00007F35F1A21555  __libc_start_main     Unknown  Unknown
ww3_shel           00000000004033A9  Unknown               Unknown  Unknown

@ukmo-ccbunney
Collaborator Author

Hi Ali,
It runs OK here with the Cray and GNU compilers. Let me check with the Intel compiler and I will get back to you.
Chris.

@ukmo-ccbunney
Collaborator Author

ukmo-ccbunney commented Nov 2, 2021

OK - so this is caused by a divide-by-zero error in FLX5 when TAUA is zero.
Specifically, at line 189 of w3flx5md (UST is zero because TAUA is zero):

SQRTCDM1 = MIN(UNZ/UST,100.0)
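
A minimal standalone sketch of why the MIN cap does not protect this line: the quotient UNZ/UST is evaluated before MIN sees it, so a zero UST still produces a division by zero. This program is illustrative only and is not taken from w3flx5md; the program name and the values of UNZ, TAUA and RHOAIR are assumptions chosen for the demonstration.

```fortran
program ust_zero_demo
  use, intrinsic :: ieee_arithmetic, only: ieee_is_finite
  implicit none
  real :: taua, rhoair, unz, ust, ratio

  ! Hypothetical values: zero wind stress at a point, typical air density.
  taua   = 0.0
  rhoair = 1.225
  unz    = 5.0

  ust = sqrt(taua / rhoair)      ! friction velocity is exactly zero

  ! The division is evaluated first; MIN only sees its result.
  ! With floating-point trapping enabled the program aborts here;
  ! with default IEEE behaviour the quotient is +Infinity.
  ratio = unz / ust

  print *, 'UST                 =', ust
  print *, 'UNZ/UST is finite?   ', ieee_is_finite(ratio)
  print *, 'MIN(UNZ/UST, 100.0) =', min(ratio, 100.0)
end program ust_zero_demo
```

Whether this traps or silently continues depends on the compiler and its floating-point exception settings (for example gfortran's -ffpe-trap=zero or Intel's -fpe0), which may be why the failure shows up with one compiler and not another. The flags used in the regtests are not stated in this thread.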

@aliabdolali
Contributor

> OK - so this is caused by a divide-by-zero error in FLX5 when TAUA is zero. Specifically, at line 189 of w3flx5md (UST is zero because TAUA is zero):
>
> SQRTCDM1 = MIN(UNZ/UST,100.0)

@JessicaMeixner-NOAA this is similar to the issue in the ufs (divide by zero).

@ukmo-ccbunney
Collaborator Author

ukmo-ccbunney commented Nov 2, 2021

> @JessicaMeixner-NOAA this is similar to the issue in the ufs (divide by zero).

Yes - I was thinking that; I assume you put a sensible limiter on the relevant variables?
The friction velocity (UST) needs to be set to something non zero. If I run with this small modification:

UST    = MAX(1E-6, SQRT(TAUA/RHOAIR))

Then everything runs OK - I just need to ensure that 1e-6 is a sensible lower limit for UST...

Edit: Actually, I see @JessicaMeixner-NOAA checks for a non zero value on USTAR - perhaps that is a better approach here too?
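
The two options mentioned here (flooring UST versus checking for a non-zero value before dividing) can be sketched side by side. This is a hedged illustration, not the code in FLX5 or ST4: the function names, the fallback value in the check branch, and the values of RHOAIR and UNZ are all assumptions, and the 1e-6 floor is simply the trial value from the comment above.

```fortran
program ust_guard_demo
  implicit none
  real, parameter :: ust_floor = 1.0e-6   ! trial value from the thread; not a validated limit
  real, parameter :: rhoair    = 1.225    ! hypothetical air density [kg/m^3]
  real, parameter :: unz       = 5.0      ! hypothetical wind speed [m/s]

  print *, 'floor, TAUA = 0.0:', sqrtcdm1_floor(0.0)
  print *, 'check, TAUA = 0.0:', sqrtcdm1_check(0.0)
  print *, 'floor, TAUA = 0.1:', sqrtcdm1_floor(0.1)
  print *, 'check, TAUA = 0.1:', sqrtcdm1_check(0.1)

contains

  ! Approach 1: floor UST so the division is always defined.
  real function sqrtcdm1_floor(taua) result(res)
    real, intent(in) :: taua
    real :: ust
    ust = max(ust_floor, sqrt(taua / rhoair))
    res = min(unz / ust, 100.0)
  end function sqrtcdm1_floor

  ! Approach 2: leave UST untouched and only divide when it is non-zero.
  real function sqrtcdm1_check(taua) result(res)
    real, intent(in) :: taua
    real :: ust
    ust = sqrt(taua / rhoair)
    if (ust > 0.0) then
      res = min(unz / ust, 100.0)
    else
      res = 100.0   ! assumed fallback: the cap already applied by MIN
    end if
  end function sqrtcdm1_check

end program ust_guard_demo
```

In this sketch both approaches return the capped value of 100 for zero stress and agree for non-zero stress. The practical difference is that the floor also changes the UST value carried forward, whereas the check leaves UST at zero for any later code that divides by it.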

@JessicaMeixner-NOAA
Collaborator

Either works, I think. I did find that the existing check in ST4 was too small: we have quite a few cases where ust > 0 but < 0.001, so 1E-6 is probably a better catch. (This is my branch with the fix that had the minimal impact in terms of changing answers in the reg tests: https://github.com/JessicaMeixner-NOAA/WW3/tree/debug/ST4dividebyzerofix. I hope to submit a PR with this fix tomorrow. I had to look at the calculations/values of z0 as well as UST for this case.)

@ukmo-ccbunney
Collaborator Author

The discussion in #497 suggests that there is a wider problem with divide by zero errors w.r.t. USTAR which are probably beyond the scope of this PR.

I have made a fix to the FLX5 module so that the calculated USTAR value has a min of 1e-4, as suggested by @mickaelaccensi . This fixes the errors seen when running with the Intel compiler.

This does result in some small differences in the output of the ww3_tp2.15/FLX5 regtests, mainly at a few points around the coast where USTAR was probably calculated as zero before and is now 1e-4.

@aliabdolali - can you rerun your regression tests and see if this fixes the issue for you? Thanks.

@aliabdolali
Contributor

The regtests all passed on NOAA RDHPC with the Intel compiler, with the expected non-b4b tests:

**********************************************************************
********************* non-identical cases ****************************
**********************************************************************
mww3_test_03/./work_PR2_UQ_MPI_d2                     (6 files differ)
mww3_test_03/./work_PR2_UNO_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2_c                     (8 files differ)
mww3_test_03/./work_PR1_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2_c                     (8 files differ)
mww3_test_03/./work_PR3_UNO_MPI_d2                     (8 files differ)
mww3_test_03/./work_PR3_UQ_MPI_d2                     (8 files differ)
mww3_test_07/./work_PR3_UQ                     (3 files differ)
ww3_tp2.10/./work_MPI_OMPH                     (7 files differ)
ww3_tp2.16/./work_MPI_OMPH                     (5 files differ)
ww3_ufs1.1/./work_d                     (0 files differ)
ww3_ufs1.2/./work_b                     (0 files differ)
ww3_ufs1.3/./work_a                     (1 files differ)

and, due to this development:

ww3_tp2.15/./work_ST6FLX5                     (5 files differ)
ww3_tp2.15/./work_ST4FLX5                     (5 files differ)

matrixCompFull.txt
matrixCompSummary.txt
matrixDiff.txt

I'll create an issue to address a consistent minimum USTAR for each source term.

@aliabdolali aliabdolali merged commit ad51140 into NOAA-EMC:develop Nov 5, 2021
JessicaMeixner-NOAA pushed a commit that referenced this pull request Nov 17, 2021
Bugfix/FLX5 Tau 
Small bugfix to variable INTENT
@ukmo-ccbunney ukmo-ccbunney deleted the staging/ukmo_oct2021 branch April 28, 2023 13:44
Development

Successfully merging this pull request may close these issues.

[w3ounfmeta] INTENT mismatch
3 participants