Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report missing variables in cprnc #2988

Merged
merged 15 commits into from
Feb 11, 2019

Conversation

billsacks
Copy link
Member

@billsacks billsacks commented Jan 24, 2019

This PR changes cprnc and the test system so that, if variables are
missing from one of the files (the baseline file or the new file), this
will be reported as a baseline failure, like this:

FAIL SMS_D_Ld3.f10_f10_musgs.I1850Clm50BgcCrop.cheyenne_intel.clm-default BASELINE ctsm1.0.dev025: FIELDLIST field lists differ

For cprnc itself, this changes the final status of cprnc so that, if
variables are missing from one of the files, the final status is not
"IDENTICAL": this final status will now look like:

diff_test: the two files DIFFER only in their field lists

In addition, the number of variables present in file2 but not file1 is
now reported in the final counts.

If there are both missing variables and differences in common variables,
cprnc's final result is the same as before - reporting a DIFF in this
case. i.e., this new result is only reported as the final status if all
common variables are identical: a real DIFF takes precedence over a
field list difference.

This PR also changes the python scripts so that they respond
appropriately to this new possible cprnc result: During baseline
comparisons, if the only differences are in field lists (no differences
in the actual values of variables), then the baseline phase will still
report a FAIL, but a comment will be added with a keyword, FIELDLIST and
a bit of description (similarly to what we do for BFAILs), facilitating
filtering out these FIELDLIST differences if they were expected. As with
the behavior of cprnc itself, FIELDLIST is only reported if the only
differences are in field lists: if there are any real differences, then
the output is the same as before (so that you won't miss any real
differences if you do something like grep -v FIELDLIST). (If there are
both FIELDLIST and BFAIL results for a given test, but no actual
differences in values, then we report MULTIPLE, so that neither grep -v FIELDLIST nor grep -v BFAIL would accidentally exclude the result.)

Test suite:

  • scripts_regression_tests on cheyenne with both an old and new build
    of cprnc (ran on 63377a8)
  • manual tests of a test with baseline comparison with fieldlist diffs,
    and a test with both fieldlist and real diffs: checked TestStatus.log
    and TestStatus in these cases
  • tests in tools/cprnc (via run_tests in that directory, with baseline
    comparisons against master version): ensured that the only diffs are
    as expected, and examined output of the new tests there (side-note
    about the new attribute-comparison test: attribute comparisons are
    currently asymmetrical: cprnc currently doesn't detect attributes
    present on file2 not on file1)

Test baseline: For cprnc's run_tests: 5fa1832
Test namelist changes: none
Test status: bit for bit

Fixes #870

User interface changes?: none

Update gh-pages html (Y/N)?: N

Code review:

Test suite: None
Test baseline: N/A
Test namelist changes: N/A
Test status: N/A

Fixes: None

User interface changes?: No

Code review: None
Test suite: None
Test baseline: N/A
Test namelist changes: N/A
Test status: N/A

Fixes: None

User interface changes?: No

Code review: None
Test suite: cprnc run_tests
Test baseline: N/A
Test namelist changes: N/A
Test status: N/A

Fixes: None

User interface changes?: No

Code review: None
This will make it easier to add the new feature

Test suite: cprnc run tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: all identical

Fixes: None

User interface changes?: No

Code review: None
Test suite: cprnc run_tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: identical except for the missing_variable test

Fixes: None

User interface changes?: No

Code review: None
This method will be easier to extend to variables not found on file1

Test suite: cprnc run_tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: all identical

Fixes: None

User interface changes?: No

Code review: None
For now, this outputs 0

Test suite: cprnc run_tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: identical except for new line

Fixes: None

User interface changes?: No

Code review: None
Test suite: cprnc run_tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: identical except for expected diffs in
diffs_in_vals_and_extra_and_missing.nc.out and extra_variable.nc.out

Fixes: None

User interface changes?: No

Code review: None
This better tests the accumulation code

Test suite: cprnc run_tests
Test baseline: previous commit
Test namelist changes: N/A
Test status: identical except for expected diffs in two tests

Fixes: None

User interface changes?: No

Code review: None
Test suite: None
Test baseline: N/A
Test namelist changes: N/A
Test status: N/A

Fixes: None

User interface changes?: No

Code review: None
…list_change

This brings in some additional cprnc tests

These changes are unrelated to the other changes on the branch, but I am
including them for convenience, because both involve increasing the test
coverage of the cprnc run_tests
Resolved Conflicts:
	tools/cprnc/README
Make it so differences in field lists will be reported specially in
TestStatus.log and TestStatus.
Fail if we can't find one of the expected strings in the cprnc
output. Point is: if the cprnc Fortran code is changed to change the
format of one of these strings without corresponding changes in the
python, we want to fail rather than silently doing the wrong thing.

Also refactor this logic a bit.
@billsacks
Copy link
Member Author

Some additional notes for reviewers:

First: I liberally added a number of reviewers here. If you're not
interested in reviewing this, feel free to remove yourself from the
list.

This PR was originally opened nearly 3 years ago here
CESM-Development#442, but was never
reviewed there. I have now merged this up to the latest version of
master and added code to the scripts to respond appropriately to the new
cprnc behavior.

I am not personally attached to the behavior change in this PR: I'm
ambivalent about whether a change in the field list should be reported
as a difference, with my concern being that this is going to lead to
differences being reported quite frequently, when there are expected
additions or removals of diagnostic fields. People will need to get used
to filtering out these expected differences. However, when I first
implemented this, it was at the request of a number of CSEG members
(chiefly @gold2718 and @mnlevy1981): the general agreement in CSEG at
the time was that this change was desirable. If people now feel
differently, I'm fine with this PR being closed (though I'll probably
try to salvage some of the mostly-unrelated pieces, like the added cprnc
tests).

If we do go with this behavior change, I'm not tied to the details of
how these differences are reported, and am open to suggestions here.

I'm not sure if we should have a new fake test type that triggers this
behavior, similar to TESTRUNDIFF? I'm not sure exactly how to do this,
but I think we could add a test to scripts_regression_tests vaguely
similar to test_bless_test_results (but probably only doing a subset
of what's done there). However, even if this is worth doing, it can't be
done until all of the cime test machines have updated versions of
cprnc. (If we wanted to go crazy, we could also test various
combinations of results: e.g., DIFF in one file and FIELDLIST in
another, and mixing BFAILs into this too... but I feel that might be
more trouble than it's worth, given that those combos are already tested
via doctests of get_ts_synopsis.)

This should be backwards compatible with old versions of cprnc:

  • If using new scripts with old cprnc build: the new string will never
    appear in cprnc output; fieldlist diffs will be treated as a pass, as
    they have been until now
  • Conversely, if using old scripts with new cprnc build: If there are
    fieldlist diffs, this will be reported the same as a true diff: will
    trigger a FAILed BASELINE result, and won't have the FIELDLIST
    comment; this seems acceptable to me

Copy link
Contributor

@jgfouca jgfouca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for the good documentation and testing!

@jedwards4b jedwards4b assigned billsacks and unassigned jedwards4b Jan 24, 2019
@ekluzek
Copy link
Contributor

ekluzek commented Jan 24, 2019

This would often be a useful feature. Most often you'll know about these changes, so can go through and figure it out.

@jedwards4b
Copy link
Contributor

@billsacks I've removed myself as assignee since I am headed out on travel today.

@billsacks billsacks requested a review from rljacob January 30, 2019 21:27
@billsacks
Copy link
Member Author

Adding @rljacob as a reviewer. He's going to run this by the E3SM integrators to see if they object to this change. (The general feeling on today's cime telecon is that this is a good idea.)

@mnlevy1981 mnlevy1981 removed their request for review February 8, 2019 16:11
@mnlevy1981
Copy link
Contributor

I'm happy to hear these changes are going in, and I like everything mentioned in the first comment about the PR, but I don't have time to go through the code and offer any specific thoughts so I've removed myself from the reviewer list. Thanks from incorporating these changes, though!

Copy link
Member

@rljacob rljacob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E3SM approves.

@jedwards4b jedwards4b merged commit 51ab1c1 into ESMCI:master Feb 11, 2019
@billsacks billsacks deleted the cprnc_report_fieldlist_change branch February 12, 2019 15:48
jedwards4b added a commit that referenced this pull request Mar 28, 2019
cprnc: allow differences in field lists for time-constant fields
In cprnc: For files with an unlimited (time) dimension: Separately count
(1) missing time-varying fields and (2) missing time-constant
fields. Only (1) is considered in determining whether to report a final
difference in the field lists.

Before this, no distinction was made between time-varying
vs. time-constant fields in counting the number of missing
variables. (These counts were added in #2988). However, that led to
failures in some exact restart tests, because some time-constant fields
were on output files from one case but not the other (see #3007).

Here is sample output from cprnc for a few cases:

(1) Difference in the presence / absence of time-varying variables:

SUMMARY of cprnc:
 A total number of      6 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      2 time-varying fields on file 1 were not found on file 2.
 A total number of      0 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      0 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files DIFFER only in their field lists
(2) Difference in the presence / absence of time-constant variables, for
files that have a time dimension:

SUMMARY of cprnc:
 A total number of     13 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      0 time-varying fields on file 1 were not found on file 2.
 A total number of      2 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      1 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files seem to be IDENTICAL
      (But note that there were differences in field lists just for time-constant fields.)
(3) Difference in the presence / absence of time-constant variables, for
files that do NOT have a time dimension (note that this still results in
a DIFFER result; it seemed to me that that's what a user would want in
this case):

SUMMARY of cprnc:
 A total number of      5 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      2 fields on file 1 were not found on file 2.
 A total number of      1 fields on file 2 were not found on file 1.
  diff_test: the two files DIFFER only in their field lists
The downside of this solution is that we wouldn't catch removals of
time-constant fields when doing baseline comparisons, and might overlook
this for interactive uses of cprnc. To address the issue of baseline
comparisons, I thought about adding a flag that treats time-constant
fieldlist diffs as differences
(#3007 (comment)), but
I have not done this here because this would have taken more time to
implement and added significantly more complexity (in both the cprnc
Fortran and in hist_utils.py) for uncertain benefit. (See the discussion in #3007
for more thoughts on this.)

Once this PR is approved and merged, I will update the cprnc builds on
hobart and cheyenne (currently they are using builds prior to #2988).

Test suite:

Ran the two tests noted as failing in #3007, pointing to a version of
cprnc built from this branch; these tests now pass.
scripts_regression_tests on cheyenne
Test baseline: n/a
Test namelist changes: none
Test status: bit for bit

Fixes #3007

User interface changes?: Changes cprnc behavior slightly, as noted above

Update gh-pages html (Y/N)?: N

Code review: @jedwards4b @jgfouca @fischer-ncar

Also cc @rljacob @gold2718 @mnlevy1981 @ekluzek
jgfouca pushed a commit that referenced this pull request Jun 26, 2019
Update Spectrum MPI module for PGI, XL and GNU.
Update XL version which has compatible parallel-netcdf install.
Update parallel-netcdf module.

More updates:

    Add basic CXX settings to enable PIO2 builds. Fixes #2988
    Backup config for home dir based builds in comments.
    ** Add options for HOME directory based builds in case performance issues are encountered with network file system. This is not enabled by default as space may not be sufficient
    Upgrade PGI to 19.4
    ** Fixes #2980 slow compilation of CAM with PGI on Summit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants