Skip to content

Commit

Permalink
Merge pull request #3051 from billsacks/cprnc_allow_timeconst_fielddiffs
Browse files Browse the repository at this point in the history
cprnc: allow differences in field lists for time-constant fields
In cprnc: For files with an unlimited (time) dimension: Separately count
(1) missing time-varying fields and (2) missing time-constant
fields. Only (1) is considered in determining whether to report a final
difference in the field lists.

Before this, no distinction was made between time-varying
vs. time-constant fields in counting the number of missing
variables. (These counts were added in #2988). However, that led to
failures in some exact restart tests, because some time-constant fields
were on output files from one case but not the other (see #3007).

Here is sample output from cprnc for a few cases:

(1) Difference in the presence / absence of time-varying variables:

SUMMARY of cprnc:
 A total number of      6 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      2 time-varying fields on file 1 were not found on file 2.
 A total number of      0 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      0 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files DIFFER only in their field lists
(2) Difference in the presence / absence of time-constant variables, for
files that have a time dimension:

SUMMARY of cprnc:
 A total number of     13 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      0 time-varying fields on file 1 were not found on file 2.
 A total number of      2 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      1 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files seem to be IDENTICAL
      (But note that there were differences in field lists just for time-constant fields.)
(3) Difference in the presence / absence of time-constant variables, for
files that do NOT have a time dimension (note that this still results in
a DIFFER result; it seemed to me that that's what a user would want in
this case):

SUMMARY of cprnc:
 A total number of      5 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
 A total number of      0 fields could not be analyzed
 A total number of      2 fields on file 1 were not found on file 2.
 A total number of      1 fields on file 2 were not found on file 1.
  diff_test: the two files DIFFER only in their field lists
The downside of this solution is that we wouldn't catch removals of
time-constant fields when doing baseline comparisons, and might overlook
this for interactive uses of cprnc. To address the issue of baseline
comparisons, I thought about adding a flag that treats time-constant
fieldlist diffs as differences
(#3007 (comment)), but
I have not done this here because this would have taken more time to
implement and added significantly more complexity (in both the cprnc
Fortran and in hist_utils.py) for uncertain benefit. (See the discussion in #3007
for more thoughts on this.)

Once this PR is approved and merged, I will update the cprnc builds on
hobart and cheyenne (currently they are using builds prior to #2988).

Test suite:

Ran the two tests noted as failing in #3007, pointing to a version of
cprnc built from this branch; these tests now pass.
scripts_regression_tests on cheyenne
Test baseline: n/a
Test namelist changes: none
Test status: bit for bit

Fixes #3007

User interface changes?: Changes cprnc behavior slightly, as noted above

Update gh-pages html (Y/N)?: N

Code review: @jedwards4b @jgfouca @fischer-ncar

Also cc @rljacob @gold2718 @mnlevy1981 @ekluzek
  • Loading branch information
jedwards4b authored Mar 28, 2019
2 parents 338c143 + 37cdf66 commit 95e117c
Show file tree
Hide file tree
Showing 8 changed files with 202 additions and 43 deletions.
20 changes: 16 additions & 4 deletions tools/cprnc/README
Original file line number Diff line number Diff line change
Expand Up @@ -138,8 +138,10 @@ SUMMARY of cprnc:
and 17 had differences in fill patterns
and 2 had differences in dimension sizes
A total number of 10 fields could not be analyzed
A total number of 0 fields on file 1 were not found on file2
A total number of 0 fields on file 2 were not found on file1
A total number of 0 time-varying fields on file 1 were not found on file 2.
A total number of 0 time-constant fields on file 1 were not found on file 2.
A total number of 0 time-varying fields on file 2 were not found on file 1.
A total number of 0 time-constant fields on file 2 were not found on file 1.
diff_test: the two files seem to be DIFFERENT


Expand All @@ -150,14 +152,24 @@ This summarizes:
- the number of fields with differences in fill patterns
- the number of fields with differences in dimension sizes
- the number of fields that could not be analyzed
- the number of fields that could not be found on the second file
- the number of fields that could not be found on the first file
- the number of fields on one file but not the other
- for files with an unlimited (time) dimension, these counts are
broken down into time-varying fields (i.e., fields with an unlimited
dimension) and time-constant fields (i.e., fields without an
unlimited dimension)
- whether the files are IDENTICAL, DIFFERENT, or DIFFER only in their field lists
- Files are considered DIFFERENT if there are differences in the values, fill
patterns or dimension sizes of any variable
- Files are considered to "DIFFER only in their field lists" if matching
variables are all identical, but there are either fields on file1 that are
not on file2, or fields on file2 that are not on file1
- However, if the only difference in field lists is in the presence
or absence of time-constant fields on a file that has an unlimited
(time) dimension, the files are considered to be IDENTICAL, with
an extra message appended that notes this fact. (While not ideal,
this exception is needed so that exact restart tests pass despite
some time-constant fields being on the output files from one case
but not the other.)

Developers Guide:
-----------------
Expand Down
41 changes: 14 additions & 27 deletions tools/cprnc/compare_vars_mod.F90.in
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
module compare_vars_mod
use filestruct, only : file_t, var_t, vdimsize, dim_t, verbose
use filestruct, only : file_t, var_t, is_time_varying, vdimsize, dim_t, verbose
use prec, only : r4, r8, i4
use netcdf, only : nf90_char, nf90_int, nf90_double, nf90_float, nf90_get_var, nf90_max_dims, &
nf90_inq_varid, nf90_get_att, nf90_noerr
Expand Down Expand Up @@ -40,14 +40,18 @@ contains
integer, allocatable :: nsteph(:)
character(len=132) :: dimstr
type(dim_t), pointer :: udim
logical :: file_has_unlimited_dim
real(r8), parameter :: timeepsilon = 1.e-9 ! time diff less than this considered insignificant


vtotal = 0
vsizes_differ = 0
vnot_analyzed = 0
if(n==2 .and. .not.ignoretime) then
! NOTE(wjs, 2019-03-21) Most of the cprnc code allows the unlimited dimension to be
! named anything - not necessarily 'time'. But this block of code assumes that the
! unlimited dimension is named 'time' in order to find the associated coordinate
! variable. We should probably generalize this by looking for a variable with the
! same name as the unlimited dimension.
call checknf90(nf90_inq_varid(file(1)%fh, 'time', vid1), &
err_str='These files don''t have a time dimension, use cprnc with -m')

Expand All @@ -70,17 +74,15 @@ contains
end if

nvars = size(file(1)%var)
if (file(1)%unlimdimid == -1) then
file_has_unlimited_dim = .false.
if (file(1)%has_unlimited_dim()) then
udim => file(1)%dim(file(1)%unlimdimid)
else
if (.not. ignoretime) then
write(6,*) 'ERROR: For files without an unlimited dimension,'
write(6,*) 'ignore_time needs to be true (via setting the -m flag to cprnc)'
stop
end if
else
file_has_unlimited_dim = .true.
udim => file(1)%dim(file(1)%unlimdimid)
endif
end if

ndiffs = 0
nfilldiffs = 0
Expand All @@ -89,7 +91,7 @@ contains
do i=1,nvars
v1 => file(1)%var(i)

if (.not. is_time_varying(v1, file_has_unlimited_dim, file(1)%unlimdimid)) then
if (.not. is_time_varying(v1, file(1)%has_unlimited_dim(), file(1)%unlimdimid)) then
call get_dimname_str(v1%ndims,v1%dimids,file(1)%dim,dimstr)
write(6,140) trim(v1%name),trim(dimstr)
vtotal = vtotal+1
Expand All @@ -103,7 +105,7 @@ contains
end do

! Now look at variables that DO have unlimdim
if (file_has_unlimited_dim) then
if (file(1)%has_unlimited_dim()) then

ierr = nf90_inq_varid(file(1)%fh, 'nsteph', vidnsteph)
if(ierr == NF90_NOERR) then
Expand Down Expand Up @@ -146,7 +148,7 @@ contains

do i=1,nvars
v1 => file(1)%var(i)
if (is_time_varying(v1, file_has_unlimited_dim, file(1)%unlimdimid)) then
if (is_time_varying(v1, file(1)%has_unlimited_dim(), file(1)%unlimdimid)) then
call get_dimname_str(v1%ndims,v1%dimids,file(1)%dim,dimstr)
vtotal = vtotal+1
write(6,145) trim(v1%name),trim(dimstr), t1, t2
Expand All @@ -160,7 +162,7 @@ contains
end if
end do
end do
end if ! if (file_has_unlimited_dim)
end if ! if (file(1)%has_unlimited_dim())

140 format(1x,a,3x,a)
145 format(1x,a,3x,a,' t_index = ',2i6)
Expand Down Expand Up @@ -606,20 +608,5 @@ contains
end function translate_loc


function is_time_varying(var, file_has_unlimited_dim, unlimdimid)
type(var_t), intent(in) :: var ! variable of interest
logical , intent(in) :: file_has_unlimited_dim ! true if the file has an unlimited dimension
integer , intent(in) :: unlimdimid ! the file's unlimited dim id (if it has one)

logical :: is_time_varying ! true if the given variable is time-varying

if (file_has_unlimited_dim) then
is_time_varying = any(var%dimids == unlimdimid)
else
is_time_varying = .false.
end if
end function is_time_varying



end module compare_vars_mod
98 changes: 95 additions & 3 deletions tools/cprnc/cprnc.F90
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,18 @@ program piocprnc
type(dim_t) :: dimoptions(12)
integer :: dimoptioncnt
integer :: nvars, ndiffs, nfilldiffs

! The following variables count the number of fields found on one file but not the
! other, only considering (a) fields with an unlimited (time) dimension, and (b) fields
! without an unlimited (time) dimension on a file that doesn't have an unlimited
! dimension.
integer :: num_not_found_on_file1, num_not_found_on_file2

! The following variables count the number of fields found on one file but not the
! other, only considering fields without an unlimited (time) dimension on a file that
! has an unlimited dimension.
integer :: num_not_found_on_file1_timeconst, num_not_found_on_file2_timeconst

integer :: num_sizes_differ
integer :: num_not_analyzed

Expand Down Expand Up @@ -90,8 +101,13 @@ program piocprnc

num_not_found_on_file1 = 0
num_not_found_on_file2 = 0
num_not_found_on_file1_timeconst = 0
num_not_found_on_file2_timeconst = 0
call match_vars( file(1), file(2), &
num_not_found_on_file1, num_not_found_on_file2)
num_not_found_on_file1 = num_not_found_on_file1, &
num_not_found_on_file2 = num_not_found_on_file2, &
num_not_found_on_file1_timeconst = num_not_found_on_file1_timeconst, &
num_not_found_on_file2_timeconst = num_not_found_on_file2_timeconst)
end if
call compare_vars(numcases, file, nvars, ndiffs, nfilldiffs, &
num_sizes_differ, num_not_analyzed)
Expand All @@ -114,15 +130,44 @@ program piocprnc
write(6,700) ' and ',num_sizes_differ,' had different dimension sizes'
write(6,700) ' A total number of ',num_sizes_differ + num_not_analyzed, &
' fields could not be analyzed'
write(6,700) ' A total number of ',num_not_found_on_file2,' fields on file 1 were not found on file 2.'
write(6,700) ' A total number of ',num_not_found_on_file1,' fields on file 2 were not found on file 1.'

call print_fields_not_found( &
filenum = 1, &
file_has_unlimited_dim = file(1)%has_unlimited_dim(), &
num_not_found = num_not_found_on_file2, &
num_not_found_timeconst = num_not_found_on_file2_timeconst)

call print_fields_not_found( &
filenum = 2, &
file_has_unlimited_dim = file(2)%has_unlimited_dim(), &
num_not_found = num_not_found_on_file1, &
num_not_found_timeconst = num_not_found_on_file1_timeconst)

if (nvars == 0 .or. ndiffs > 0 .or. nfilldiffs > 0 .or. &
num_sizes_differ > 0 .or. num_not_analyzed >= nvars) then
write(6,700) ' diff_test: the two files seem to be DIFFERENT '
else if (num_not_found_on_file1 > 0 .or. num_not_found_on_file2 > 0) then
! Note that we deliberately allow num_not_found_on_file1_timeconst or
! num_not_found_on_file2_timeconst to be > 0: those do NOT result in a
! "DIFFER" result.
!
! Ideally, we'd count those fields here, too. Doing so would catch more
! differences and would simplify the cprnc code. But this sometimes leads to
! problems when comparing restart vs. baseline files
! (https://github.com/ESMCI/cime/issues/3007). We could add a flag that you
! specify to not count these fields, but there are backwards compatibility
! issues with doing so. Eventually it could be good to count these absent
! fields as a DIFFER result by default, adding a flag that you can specify to
! not count them, then have cime specify this flag when doing the in-test
! comparison (so absent time-constant fields would result in a DIFFER result
! for cime's baseline comparisons and for interactive use of cprnc).
write(6,'(a)') ' diff_test: the two files DIFFER only in their field lists'
else
write(6,700) ' diff_test: the two files seem to be IDENTICAL '
if (num_not_found_on_file1_timeconst > 0 .or. &
num_not_found_on_file2_timeconst > 0) then
write(6,'(a)') ' (But note that there were differences in field lists just for time-constant fields.)'
end if
end if
end if
write(6,*) ' '
Expand Down Expand Up @@ -213,4 +258,51 @@ subroutine parsearg (arg, dimname, v1, v2)
return
end subroutine parsearg

subroutine print_fields_not_found(filenum, file_has_unlimited_dim, &
num_not_found, num_not_found_timeconst)
! Prints information about the number of fields in filenum not found on the other file

integer, intent(in) :: filenum ! file number for which we're printing this information
logical, intent(in) :: file_has_unlimited_dim ! whether this file has an unlimited dimension

! Number of fields in filenum but not on the other file, only considering (a) fields
! with an unlimited (time) dimension, and (b) fields without an unlimited (time)
! dimension on a file that doesn't have an unlimited dimension
integer, intent(in) :: num_not_found

! Number of fields in filenum but not on the other file, only considering fields
! without an unlimited (time) dimension on a file that has an unlimited dimension
integer, intent(in) :: num_not_found_timeconst

integer :: other_filenum

if (filenum == 1) then
other_filenum = 2
else if (filenum == 2) then
other_filenum = 1
else
stop 'Unexpected value for filenum'
end if

if (file_has_unlimited_dim) then
write(6,'(a,i6,a,i1,a,i1,a)') &
' A total number of ', num_not_found, &
' time-varying fields on file ', filenum, &
' were not found on file ', other_filenum, '.'
write(6,'(a,i6,a,i1,a,i1,a)') &
' A total number of ', num_not_found_timeconst, &
' time-constant fields on file ', filenum, &
' were not found on file ', other_filenum, '.'
else
write(6,'(a,i6,a,i1,a,i1,a)') &
' A total number of ', num_not_found, &
' fields on file ', filenum, &
' were not found on file ', other_filenum, '.'
if (num_not_found_timeconst > 0) then
stop 'Programming error: file has no unlimited dimension, but num_not_found_timeconst > 0'
end if
end if

end subroutine print_fields_not_found

end program piocprnc
63 changes: 58 additions & 5 deletions tools/cprnc/filestruct.F90
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,24 @@ module filestruct
type(dim_t), pointer :: dim(:)
type(var_t), pointer :: var(:)
integer :: unlimdimid
contains
procedure :: has_unlimited_dim ! logical function; returns true if this file has an unlimited dimension
end type file_t

logical :: verbose

contains
logical function has_unlimited_dim(file)
! Returns true if this file has an unlimited dimension
class(file_t), intent(in) :: file

if (file%unlimdimid == -1) then
has_unlimited_dim = .false.
else
has_unlimited_dim = .true.
end if
end function has_unlimited_dim

subroutine init_file_struct( file, dimoptions )

type(file_t) :: file
Expand Down Expand Up @@ -254,15 +267,30 @@ end subroutine compare_dimensions


subroutine match_vars( file1, file2, &
num_not_found_on_file1, num_not_found_on_file2 )
num_not_found_on_file1, num_not_found_on_file2, &
num_not_found_on_file1_timeconst, num_not_found_on_file2_timeconst)
type(file_t), intent(inout) :: file1, file2

! Accumulates count of variables on file2 not found on file1
! Accumulates count of variables on file2 not found on file1; this only considers (a)
! fields with an unlimited (time) dimension, and (b) fields without an unlimited
! (time) dimension on a file that doesn't have an unlimited dimension.
integer, intent(inout) :: num_not_found_on_file1

! Accumulates count of variables on file1 not found on file2
! Accumulates count of variables on file1 not found on file2; this only considers (a)
! fields with an unlimited (time) dimension, and (b) fields without an unlimited
! (time) dimension on a file that doesn't have an unlimited dimension.
integer, intent(inout) :: num_not_found_on_file2

! Accumulates count of variables on file2 not found on file1; this only considers
! fields without an unlimited (time) dimension on a file that has an unlimited
! dimension.
integer, intent(inout) :: num_not_found_on_file1_timeconst

! Accumulates count of variables on file1 not found on file2; this only considers
! fields without an unlimited (time) dimension on a file that has an unlimited
! dimension.
integer, intent(inout) :: num_not_found_on_file2_timeconst

type(var_t), pointer :: varfile1(:),varfile2(:)

integer :: vs1, vs2, i, j
Expand All @@ -286,18 +314,43 @@ subroutine match_vars( file1, file2, &
do i=1,vs1
if(varfile1(i)%matchid<0) then
print *, 'Could not find match for file1 variable ',trim(varfile1(i)%name), ' in file2'
num_not_found_on_file2 = num_not_found_on_file2 + 1
if (file1%has_unlimited_dim() .and. &
.not. is_time_varying(varfile1(i), file1%has_unlimited_dim(), file1%unlimdimid)) then
num_not_found_on_file2_timeconst = num_not_found_on_file2_timeconst + 1
else
num_not_found_on_file2 = num_not_found_on_file2 + 1
end if
end if
end do
do i=1,vs2
if(varfile2(i)%matchid<0) then
print *, 'Could not find match for file2 variable ',trim(varfile2(i)%name), ' in file1'
num_not_found_on_file1 = num_not_found_on_file1 + 1
if (file2%has_unlimited_dim() .and. &
.not. is_time_varying(varfile2(i), file2%has_unlimited_dim(), file2%unlimdimid)) then
num_not_found_on_file1_timeconst = num_not_found_on_file1_timeconst + 1
else
num_not_found_on_file1 = num_not_found_on_file1 + 1
end if
end if
end do
end subroutine match_vars


function is_time_varying(var, file_has_unlimited_dim, unlimdimid)
type(var_t), intent(in) :: var ! variable of interest
logical , intent(in) :: file_has_unlimited_dim ! true if the file has an unlimited dimension
integer , intent(in) :: unlimdimid ! the file's unlimited dim id (if it has one)

logical :: is_time_varying ! true if the given variable is time-varying

if (file_has_unlimited_dim) then
is_time_varying = any(var%dimids == unlimdimid)
else
is_time_varying = .false.
end if
end function is_time_varying


function vdimsize(dims, dimids)
type(dim_t), intent(in) :: dims(:)
integer, intent(in) :: dimids(:)
Expand Down
3 changes: 3 additions & 0 deletions tools/cprnc/run_tests
Original file line number Diff line number Diff line change
Expand Up @@ -31,9 +31,12 @@ my %tests = ('copy.nc' => {control => 'control.nc'},
'vals_differ_by_varying_amounts2.nc' => {control => 'control.nc'},

'multipleTimes_someTimeless_diffs_in_vals_and_fill.nc' => {control => 'control_multipleTimes_someTimeless.nc'},
'multipleTimes_someTimeless_extra_and_missing.nc' => {control => 'control_multipleTimes_someTimeless.nc'},

'noTime_diffs_in_vals_and_fill.nc' => {control => 'control_noTime.nc',
extra_args => '-m'},
'noTime_extra_and_missing.nc' => {control => 'control_noTime.nc',
extra_args => '-m'},

'diffs_0d.nc' => {control => 'control_0d.nc',
extra_args => '-m'},
Expand Down
Loading

0 comments on commit 95e117c

Please sign in to comment.