Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -finit-local-zero flag on Melvin to force local var initialization #1713

Closed
wants to merge 1 commit into from

Conversation

jqyin
Copy link
Contributor

@jqyin jqyin commented Aug 14, 2017

Uninitialized local variables are not set to zero by default with gnu compiler
on Melvin.
Add -finit-local-zero flag so Melvin behaves similarly as other test machines
in term of local variable initialization.

[non-BFB]

@jqyin jqyin added Machine Files non-BFB PR makes roundoff changes to answers. labels Aug 14, 2017
@jqyin jqyin requested review from rljacob and bishtgautam August 14, 2017 16:22
Copy link
Contributor

@bishtgautam bishtgautam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@worleyph
Copy link
Contributor

I haven't been tracking this, but isn't the problem with having uninitialized variables (if they show up in output that then indicates a diff)? There are (or used to be) production systems and compilers that do not initialize variables to zero, and historically we have NOT used flags to set uninitialized variables to zero as this masks real errors.

@singhbalwinder
Copy link
Contributor

Hi @jqyin :
It may be just my personal preference, so you can ignore my suggestion if you like. If I am understanding this change correctly, I think we should not rely on compiler options to set variables to zero. Variables should be initialized explicitly to their desired values in the source code. I have seen uninitialized variables causing issues in the past and they are hard to trace sometimes. Unless we require users to use these compiler flags (setting uninitialized variables to zero) always, I think we should fix these in the code itself. It is also hard to rely on compilers to initialize variables to zero as I have seen cases where compilers has bugs and they don't play well.

Again, fixing it in the code would require us to know which variables are uninitialized in the code, which can be hard sometimes.

@worleyph
Copy link
Contributor

It may be just my personal preference, so you can ignore my suggestion if you like.

This is more than a personal preference issue. @rljacob , please weigh in. I would argue that this is a an ACME software engineering requirement - don't use compiler flags to initialize variables (unless they are initialized to NaNs or something similarly nasty).

@singhbalwinder
Copy link
Contributor

This is more than a personal preference issue.

Thanks @worleyph . I agree with you. We don't allow this in the atmospheric codes and try hard to fix these issues as soon as they are discovered. I wasn't sure about the land side. I agree that this should be a standard ACME policy to not to rely on compilers to take care of uninitialized variables.

@rljacob
Copy link
Member

rljacob commented Aug 14, 2017

Ok lets not do this.

Will Melvin continue to diff on these tests even after blessing because some variables have random stuff in them?

@jqyin
Copy link
Contributor Author

jqyin commented Aug 14, 2017

@worleyph and @singhbalwinder , I agree that the root cause is the code that has uninitialized local variables, and non-BFB is resulted from the rounding errors of undefined initial values. Currently, these issues are presented in BGC-related land code. Since most integrators don't have access to Melvin, it took a while to even realize this is the problem.

@rljacob , if we bless the diff to let the high priority PR go in, I can try to fixes this legacy issue. Until that, I think any BGC-related code changes will likely cause diff.

I'll close this PR.

@rljacob
Copy link
Member

rljacob commented Aug 14, 2017

Ok lets do that.

@jqyin jqyin closed this Aug 14, 2017
jgfouca pushed a commit that referenced this pull request Sep 15, 2017
refactored datamodels
refactored datamodels for new modularity

This PR contains a refactorization of the data models in order to separate out the driver specfic parts of dxxx_comp_mod.F90 (which now contains datatypes such as cdata, infodata, etc. that are specific to the current mct cpl7 driver) and permit new driver implementations (e.g. nuopc) to use the core parts of the data models. In addition, a new directory data_comps/dshare now contains data model share code that was previously in share/utils.

Test suite: scripts_regressions_tests
The following tests had baselines generated without the refactor and tests were
run to verify that the data model refactor maintained bfb answers

- ERI.f09_g17.ETEST.cheyenne_intel.cice-default
- ERI.f09_g17.ETEST.cheyenne_intel.cice-default
- ERI.f09_g17.ETEST.cheyenne_intel.cice-default
- ERI.T62_g16.C1850ECO.cheyenne_intel.pop-ecosys
- ERI.T62_g16.C.cheyenne_intel.pop-default
- ERI.T62_g16.CIAF.cheyenne_intel.pop-default
- ERI.T62_g16.GIAF.cheyenne_intel.pop-default
- ERI.T62_g17.DTEST.cheyenne_intel.cice-default
- ERI.T62_g17.G.cheyenne_gnu.pop-cice
- ERP_D_Ln9.f19_f19_mg17.QPC6.cheyenne_intel.cam-outfrq9s
- ERP_D_Ln9.f19_f19_mg17.QSC6.cheyenne_intel.cam-outfrq9s
- ERP_Ln9.f09_f09_mg17.F1850_DONOTUSE.cheyenne_intel.cam-outfrq9s
- ERP_Ln9.f09_f09_mg17.FHIST_DEV.cheyenne_intel.cam-outfrq9s
- ERP_Ln9.f19_f19_mg17.F2000_DEV.cheyenne_gnu.cam-outfrq9s
- ERR.f45_g37_rx1.A.cheyenne_intel
- ERS.f09_g16.ADLND.cheyenne_intel
- ERS_Ld3.f45_g37_rx1.A.cheyenne_intel
- ERS_Lm3.T62_g16.AIAF.cheyenne_intel
- ERS_Ly20_N2_P2.f09_g16_gl20.T1850G1.cheyenne_intel
- ERS_Ly3.f09_g16_gl4.T1850G.cheyenne_intel
- PEM.f19_g16_rx1.A.cheyenne_intel
- PEM.f19_g16_rx1.A.cheyenne_intel
- SMS.f09_g16.ADWAV.cheyenne_intel
Also ran most aux_clm5 tests and verified bfb with clm4_5_1_r251
Test baseline: see above
Test namelist changes:
Test status: none

Fixes #1713

User interface changes?: None

Update gh-pages html (Y/N)?: N

Code review: jedwards, rjacob
@jqyin jqyin deleted the jqyin/machinefiles/melvin-init0 branch February 5, 2018 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Machine Files non-BFB PR makes roundoff changes to answers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants