-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report missing variables in cprnc #2988
Report missing variables in cprnc #2988
Conversation
Test suite: None Test baseline: N/A Test namelist changes: N/A Test status: N/A Fixes: None User interface changes?: No Code review: None
Test suite: None Test baseline: N/A Test namelist changes: N/A Test status: N/A Fixes: None User interface changes?: No Code review: None
Test suite: cprnc run_tests Test baseline: N/A Test namelist changes: N/A Test status: N/A Fixes: None User interface changes?: No Code review: None
This will make it easier to add the new feature Test suite: cprnc run tests Test baseline: previous commit Test namelist changes: N/A Test status: all identical Fixes: None User interface changes?: No Code review: None
Test suite: cprnc run_tests Test baseline: previous commit Test namelist changes: N/A Test status: identical except for the missing_variable test Fixes: None User interface changes?: No Code review: None
This method will be easier to extend to variables not found on file1 Test suite: cprnc run_tests Test baseline: previous commit Test namelist changes: N/A Test status: all identical Fixes: None User interface changes?: No Code review: None
For now, this outputs 0 Test suite: cprnc run_tests Test baseline: previous commit Test namelist changes: N/A Test status: identical except for new line Fixes: None User interface changes?: No Code review: None
Test suite: cprnc run_tests Test baseline: previous commit Test namelist changes: N/A Test status: identical except for expected diffs in diffs_in_vals_and_extra_and_missing.nc.out and extra_variable.nc.out Fixes: None User interface changes?: No Code review: None
This better tests the accumulation code Test suite: cprnc run_tests Test baseline: previous commit Test namelist changes: N/A Test status: identical except for expected diffs in two tests Fixes: None User interface changes?: No Code review: None
Test suite: None Test baseline: N/A Test namelist changes: N/A Test status: N/A Fixes: None User interface changes?: No Code review: None
…list_change This brings in some additional cprnc tests These changes are unrelated to the other changes on the branch, but I am including them for convenience, because both involve increasing the test coverage of the cprnc run_tests
Resolved Conflicts: tools/cprnc/README
Make it so differences in field lists will be reported specially in TestStatus.log and TestStatus.
Fail if we can't find one of the expected strings in the cprnc output. Point is: if the cprnc Fortran code is changed to change the format of one of these strings without corresponding changes in the python, we want to fail rather than silently doing the wrong thing. Also refactor this logic a bit.
Some additional notes for reviewers: First: I liberally added a number of reviewers here. If you're not This PR was originally opened nearly 3 years ago here I am not personally attached to the behavior change in this PR: I'm If we do go with this behavior change, I'm not tied to the details of I'm not sure if we should have a new fake test type that triggers this This should be backwards compatible with old versions of cprnc:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks for the good documentation and testing!
This would often be a useful feature. Most often you'll know about these changes, so can go through and figure it out. |
@billsacks I've removed myself as assignee since I am headed out on travel today. |
Adding @rljacob as a reviewer. He's going to run this by the E3SM integrators to see if they object to this change. (The general feeling on today's cime telecon is that this is a good idea.) |
I'm happy to hear these changes are going in, and I like everything mentioned in the first comment about the PR, but I don't have time to go through the code and offer any specific thoughts so I've removed myself from the reviewer list. Thanks from incorporating these changes, though! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E3SM approves.
cprnc: allow differences in field lists for time-constant fields In cprnc: For files with an unlimited (time) dimension: Separately count (1) missing time-varying fields and (2) missing time-constant fields. Only (1) is considered in determining whether to report a final difference in the field lists. Before this, no distinction was made between time-varying vs. time-constant fields in counting the number of missing variables. (These counts were added in #2988). However, that led to failures in some exact restart tests, because some time-constant fields were on output files from one case but not the other (see #3007). Here is sample output from cprnc for a few cases: (1) Difference in the presence / absence of time-varying variables: SUMMARY of cprnc: A total number of 6 fields were compared of which 0 had non-zero differences and 0 had differences in fill patterns and 0 had different dimension sizes A total number of 0 fields could not be analyzed A total number of 2 time-varying fields on file 1 were not found on file 2. A total number of 0 time-constant fields on file 1 were not found on file 2. A total number of 0 time-varying fields on file 2 were not found on file 1. A total number of 0 time-constant fields on file 2 were not found on file 1. diff_test: the two files DIFFER only in their field lists (2) Difference in the presence / absence of time-constant variables, for files that have a time dimension: SUMMARY of cprnc: A total number of 13 fields were compared of which 0 had non-zero differences and 0 had differences in fill patterns and 0 had different dimension sizes A total number of 0 fields could not be analyzed A total number of 0 time-varying fields on file 1 were not found on file 2. A total number of 2 time-constant fields on file 1 were not found on file 2. A total number of 0 time-varying fields on file 2 were not found on file 1. A total number of 1 time-constant fields on file 2 were not found on file 1. diff_test: the two files seem to be IDENTICAL (But note that there were differences in field lists just for time-constant fields.) (3) Difference in the presence / absence of time-constant variables, for files that do NOT have a time dimension (note that this still results in a DIFFER result; it seemed to me that that's what a user would want in this case): SUMMARY of cprnc: A total number of 5 fields were compared of which 0 had non-zero differences and 0 had differences in fill patterns and 0 had different dimension sizes A total number of 0 fields could not be analyzed A total number of 2 fields on file 1 were not found on file 2. A total number of 1 fields on file 2 were not found on file 1. diff_test: the two files DIFFER only in their field lists The downside of this solution is that we wouldn't catch removals of time-constant fields when doing baseline comparisons, and might overlook this for interactive uses of cprnc. To address the issue of baseline comparisons, I thought about adding a flag that treats time-constant fieldlist diffs as differences (#3007 (comment)), but I have not done this here because this would have taken more time to implement and added significantly more complexity (in both the cprnc Fortran and in hist_utils.py) for uncertain benefit. (See the discussion in #3007 for more thoughts on this.) Once this PR is approved and merged, I will update the cprnc builds on hobart and cheyenne (currently they are using builds prior to #2988). Test suite: Ran the two tests noted as failing in #3007, pointing to a version of cprnc built from this branch; these tests now pass. scripts_regression_tests on cheyenne Test baseline: n/a Test namelist changes: none Test status: bit for bit Fixes #3007 User interface changes?: Changes cprnc behavior slightly, as noted above Update gh-pages html (Y/N)?: N Code review: @jedwards4b @jgfouca @fischer-ncar Also cc @rljacob @gold2718 @mnlevy1981 @ekluzek
Update Spectrum MPI module for PGI, XL and GNU. Update XL version which has compatible parallel-netcdf install. Update parallel-netcdf module. More updates: Add basic CXX settings to enable PIO2 builds. Fixes #2988 Backup config for home dir based builds in comments. ** Add options for HOME directory based builds in case performance issues are encountered with network file system. This is not enabled by default as space may not be sufficient Upgrade PGI to 19.4 ** Fixes #2980 slow compilation of CAM with PGI on Summit.
This PR changes cprnc and the test system so that, if variables are
missing from one of the files (the baseline file or the new file), this
will be reported as a baseline failure, like this:
For cprnc itself, this changes the final status of cprnc so that, if
variables are missing from one of the files, the final status is not
"IDENTICAL": this final status will now look like:
In addition, the number of variables present in file2 but not file1 is
now reported in the final counts.
If there are both missing variables and differences in common variables,
cprnc's final result is the same as before - reporting a DIFF in this
case. i.e., this new result is only reported as the final status if all
common variables are identical: a real DIFF takes precedence over a
field list difference.
This PR also changes the python scripts so that they respond
appropriately to this new possible cprnc result: During baseline
comparisons, if the only differences are in field lists (no differences
in the actual values of variables), then the baseline phase will still
report a FAIL, but a comment will be added with a keyword, FIELDLIST and
a bit of description (similarly to what we do for BFAILs), facilitating
filtering out these FIELDLIST differences if they were expected. As with
the behavior of cprnc itself, FIELDLIST is only reported if the only
differences are in field lists: if there are any real differences, then
the output is the same as before (so that you won't miss any real
differences if you do something like
grep -v FIELDLIST
). (If there areboth FIELDLIST and BFAIL results for a given test, but no actual
differences in values, then we report MULTIPLE, so that neither
grep -v FIELDLIST
norgrep -v BFAIL
would accidentally exclude the result.)Test suite:
scripts_regression_tests
on cheyenne with both an old and new buildof cprnc (ran on 63377a8)
and a test with both fieldlist and real diffs: checked TestStatus.log
and TestStatus in these cases
comparisons against master version): ensured that the only diffs are
as expected, and examined output of the new tests there (side-note
about the new attribute-comparison test: attribute comparisons are
currently asymmetrical: cprnc currently doesn't detect attributes
present on file2 not on file1)
Test baseline: For cprnc's run_tests: 5fa1832
Test namelist changes: none
Test status: bit for bit
Fixes #870
User interface changes?: none
Update gh-pages html (Y/N)?: N
Code review: