Switch to -fp-model fast in standalone HOMME runs #1960

Merged Dec 20, 2017 (3 commits)

amametjanov
Member

Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change
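
For context on the non-BFB label: as far as I understand it, `-fp-model fast` allows the Intel compiler to apply value-unsafe optimizations such as reassociating floating-point operations, so answers can change at roundoff level relative to a stricter setting. The Python sketch below only illustrates that effect. It does not invoke the compiler, and the values are arbitrary rather than taken from HOMME. It sums the same three numbers in two orders and compares the results.

```python
# Arbitrary illustrative values, not taken from HOMME.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one evaluation order
reassociated = a + (b + c)    # an order a value-unsafe optimization might pick

# Bit-for-bit comparison versus a roundoff-level relative difference.
bitwise_identical = left_to_right == reassociated
relative_diff = abs(left_to_right - reassociated) / abs(left_to_right)

print(f"bit-for-bit identical: {bitwise_identical}")
print(f"relative difference:   {relative_diff:.3e}")
```

The two sums are not bit-for-bit identical, yet they agree to about machine precision, which is why a tolerance-based comparison is the relevant check for this change.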

@amametjanov added the HOMME and non-BFB (PR makes roundoff changes to answers) labels Dec 6, 2017
@mt5555
Contributor

mt5555 commented Dec 7, 2017

will this be ok on KNL? @ndkeen is seeing reproducibility problems on KNL with -fp-model fast for EAM.

@mt5555
Contributor

mt5555 commented Dec 7, 2017

with -fp-model fast:

skybridge, with intel16: HOMME tests pass
anvil, intel17: HOMME tests pass
cori-knl, intel18: HOMME tests pass, but had to increase swtc6 tolerance to 5e-14

So this looks good to go (after I push the update to swtc6 tol).

@amametjanov
Member Author

amametjanov commented Dec 7, 2017

Checked for reproducibility for intel18 on cori-knl with baselines generated from git master v1.0.0-beta.2-2897-g82cf067 (+ -fp-model fast):

azamat@cori11:/global/cscratch1/sd/azamat/acme_scratch/cori-knl/HOMME_P24.f19_g16_rx1.A.cori-knl_intel18.C.20171207_111804_ne1i1z/run> 
cat homme.log
[  2%] Built target timing
[  8%] Built target pio
[ 20%] Built target swtcA
[ 28%] Built target swtcB
[ 42%] Built target baroC
[ 57%] Built target theta-nlev30
[ 71%] Built target baroCam
[ 85%] Built target baroCam-acc
[100%] Built target theta-nlev20
Scanning dependencies of target check
Test project /global/cscratch1/sd/azamat/acme_scratch/cori-knl/HOMME_P24.f19_g16_rx1.A.cori-knl_intel18.C.20171207_111804_ne1i1z/bld/test_execs
      Start  1: verifyBaselineResults
 1/17 Test  #1: verifyBaselineResults ............   Passed    5.46 sec
      Start  2: swtc1
 2/17 Test  #2: swtc1 ............................   Passed   33.86 sec
      Start  3: swtc2
 3/17 Test  #3: swtc2 ............................   Passed   21.43 sec
      Start  4: swtc5
 4/17 Test  #4: swtc5 ............................   Passed  101.60 sec
      Start  5: swtc6
 5/17 Test  #5: swtc6 ............................   Passed   30.23 sec
      Start  6: baro2b
 6/17 Test  #6: baro2b ...........................   Passed  130.89 sec
      Start  7: baro2c
 7/17 Test  #7: baro2c ...........................   Passed   52.31 sec
      Start  8: baro2d
 8/17 Test  #8: baro2d ...........................   Passed  172.19 sec
      Start  9: baroCamMoist
 9/17 Test  #9: baroCamMoist .....................   Passed  102.46 sec
      Start 10: baroCamMoistSL
10/17 Test #10: baroCamMoistSL ...................   Passed   22.72 sec
      Start 11: baroCamMoist-acc
11/17 Test #11: baroCamMoist-acc .................   Passed   57.80 sec
      Start 12: thetah-test22
12/17 Test #12: thetah-test22 ....................   Passed   52.14 sec
      Start 13: thetanh-test22
13/17 Test #13: thetanh-test22 ...................   Passed   48.77 sec
      Start 14: thetah-TC
14/17 Test #14: thetah-TC ........................   Passed   16.88 sec
      Start 15: thetanh-TC
15/17 Test #15: thetanh-TC .......................   Passed   30.83 sec
      Start 16: thetanhwet-TC
16/17 Test #16: thetanhwet-TC ....................   Passed   31.47 sec
      Start 17: templates
17/17 Test #17: templates ........................   Passed   22.97 sec

100% tests passed, 0 tests failed out of 17

Total Test time (real) = 937.85 sec
[100%] Built target check

@mt5555
Contributor

mt5555 commented Dec 7, 2017

Here's the issue I saw with swtc6:

5/17 Test #5: swtc6 ............................***Failed 37.15 sec
Submitting 1 jobs
Running test swtc6-run ... /global/homes/t/taylorm/scratch2/regtest/homme/tests/swtc6/swtc6-run.sh > swtc6-run.out 2> swtc6-run.err
test swtc6-run was run successfully
Test name = swtc6
Examining cprnc reference comparison output files
file = exodus-swtc61.nc
The files are different: DIFF_RESULT=DIFFERENT
Checking RMS differences with tol = 1E-14
CPRNC returned the following RMS differences
RMS geop 1.7501E-12 NORMALIZED 1.9114E-16
RMS u 1.7622E-14 NORMALIZED 5.4737E-16
RMS v 2.4322E-14 NORMALIZED 1.1928E-15
RMS geop 4.6121E-12 NORMALIZED 5.0375E-16
RMS u 1.1562E-13 NORMALIZED 3.5950E-15
RMS v 1.0836E-13 NORMALIZED 5.2769E-15
RMS geop 6.4231E-12 NORMALIZED 7.0133E-16
RMS u 2.1307E-13 NORMALIZED 6.6225E-15
RMS v 2.1734E-13 NORMALIZED 1.0649E-14
1.9114E-16 <= 1E-14 OK
5.4737E-16 <= 1E-14 OK
1.1928E-15 <= 1E-14 OK
5.0375E-16 <= 1E-14 OK
3.5950E-15 <= 1E-14 OK
5.2769E-15 <= 1E-14 OK
7.0133E-16 <= 1E-14 OK
6.6225E-15 <= 1E-14 OK
1.0649E-14 > 1E-14 ERROR: TOL EXCEEDED
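
The tolerance check in the log above can be reproduced with a short script. This is a minimal sketch, assuming the `RMS <field> <value> NORMALIZED <value>` line layout shown in the log. `check_normalized_rms` is a hypothetical helper written for illustration, not part of the HOMME test scripts or cprnc.

```python
import re

# RMS lines copied from the swtc6 log above.
LOG = """\
RMS geop 1.7501E-12 NORMALIZED 1.9114E-16
RMS u 1.7622E-14 NORMALIZED 5.4737E-16
RMS v 2.4322E-14 NORMALIZED 1.1928E-15
RMS geop 4.6121E-12 NORMALIZED 5.0375E-16
RMS u 1.1562E-13 NORMALIZED 3.5950E-15
RMS v 1.0836E-13 NORMALIZED 5.2769E-15
RMS geop 6.4231E-12 NORMALIZED 7.0133E-16
RMS u 2.1307E-13 NORMALIZED 6.6225E-15
RMS v 2.1734E-13 NORMALIZED 1.0649E-14
"""

# Hypothetical helper: extracts the normalized RMS values and compares them to tol.
def check_normalized_rms(log_text, tol):
    """Return (field, normalized_rms, passed) for each RMS line in log_text."""
    pattern = re.compile(
        r"^RMS\s+(\S+)\s+(\S+)\s+NORMALIZED\s+(\S+)$", re.MULTILINE
    )
    results = []
    for field, _rms, normalized in pattern.findall(log_text):
        value = float(normalized)
        results.append((field, value, value <= tol))
    return results

# Compare the original tolerance with the increased one discussed in this PR.
for tol in (1e-14, 5e-14):
    failures = [(f, v) for f, v, ok in check_normalized_rms(LOG, tol) if not ok]
    print(f"tol={tol:g}: {len(failures)} failure(s) {failures}")
```

With these values, only the last `v` entry (1.0649E-14) exceeds 1E-14, and none of the nine entries exceeds 5e-14.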

@amametjanov
Member Author

+1 on increasing the tol

mt5555 added a commit that referenced this pull request Dec 14, 2017
Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change
Test system (sandiatoss3) was overriding all the compiler flags, meaning that it
was not testing the changes in this PR.

tweak machine file so sandiatoss3 systems will use default compiler flags.
mt5555 added a commit that referenced this pull request Dec 15, 2017
@mt5555
Contributor

mt5555 commented Dec 15, 2017

Merged to next, but all tests passed on skybridge because skybridge uses custom, machine-specific Fortran flags, so the flag change was never exercised there. I modified the branch so that skybridge will use the default Intel settings, and re-merged to next. The HOMME test should now diff tomorrow.

@mt5555 mt5555 merged commit b78895f into master Dec 20, 2017
mt5555 added a commit that referenced this pull request Dec 20, 2017
Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change
@amametjanov amametjanov deleted the azamat/homme/fp-fast branch December 21, 2017 17:51
jgfouca pushed a commit that referenced this pull request Jan 23, 2018
Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change
rljacob pushed a commit that referenced this pull request Apr 21, 2021
Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change
rljacob pushed a commit that referenced this pull request May 6, 2021
…rtens/fix_datacomps

This fixes issue #1960 that arose when a stream domain file is not equal to the model domain
rljacob pushed a commit that referenced this pull request May 6, 2021
Fix several data component issues
This PR fixes several outstanding issues:

src/components/data_comps/dshare was moved to src/share/streams in order
to resolve issue #1802.

For aquaplanet runs - the landmask was reset in docn_comp_mod.F90
and in fact this does work if you are reading an input sst file as the
aquaplanet forcing. The right place to put this is in
shr_strdata_init.F90 right after the model grid is read in. An
optional argument was introduced in shr_strdata_init.F90
(reset_domain_mask) that will allow this in a backwards compatible manner.
This resolves issue #1960.

If the data component domain file is equal to 'UNSET' the 'domainfile' namelist
variable should be set to 'null', which will assume that the model domain is contained
in the first stream file. This resolves issue #1937.

Test suite: scripts_regressions_tests
The following extra tests were also run on cheyenne and compared to cesm2_0_alpha07f:

ERI.T62_g16.C1850ECO.cheyenne_intel.pop-ecosys
ERI.T62_g37.G.cheyenne_intel.pop-cice
ERP_D_Ln9.f19_f19_mg17.QPC6.cheyenne_intel.cam-outfrq9s
ERP_D_Ln9.f19_f19_mg17.QSC6.cheyenne_intel.cam-outfrq9s
ERP_Ln9.f09_f09_mg17.F1850_DONOTUSE.cheyenne_intel.cam-outfrq9s
ERP_P180x2_D_Ld5.f19_g17_gl4.I1850Clm50BgcCropG.cheyenne_intel.clm-default
ERR.f45_g37_rx1.A.cheyenne_intel
ERS_IOP.T62_g16.CIAF.cheyenne_intel.pop-default
ERS_IOP.T62_g16.GIAF.cheyenne_intel.pop-default
ERS_Lm3.T62_g16.AIAF.cheyenne_intel
ERS_Ly3.f09_g16_gl4.T1850G.cheyenne_intel
Test baseline: cesm2_0_alpha07b
Test namelist changes: none
Test status: bit for bit

Fixes #1802
Fixes #1960
Fixes #1937

User interface changes?: None
Update gh-pages html (Y/N)?: N
Code review: sacks, edwards
rljacob pushed a commit that referenced this pull request May 6, 2021
Switch to -fp-model fast in standalone HOMME runs

[non-BFB] - for the HOMME test due to cflag change