Upcoming sort-of-non-zero-diff MAPL Update #510

Closed
mathomp4 opened this issue Jan 7, 2022 · 2 comments
Labels
output question Further information is requested

Comments

@mathomp4
Member

mathomp4 commented Jan 7, 2022

This is part "be aware" issue for @gmao-rreichle and part "anything we might do for you" question.

Ops recently discovered that the MAPL 2 History output was missing metadata that had long been included in GEOS ADAS/GCM output. This metadata was actually specified in the GEOS FP 1.2 File Spec. Moreover, the File Spec states that lat and lon in the files are 64-bit/double. So we made a MAPL 2.8.0.1 for them to test, in which @bena-nasa restored all the metadata as before. These same changes were then brought into mainline MAPL development in MAPL 2.15.1 (and then develop), which will eventually filter down to GEOSldas the next time you update.

In the end, this means that if you compare GEOS History output from before and after this change, some comparators will consider it non-zero-diff. cdo diffn reports that the files are identical because it only compares the field data, and the fields (and state) did not change. However, nccmp, which I use in my nightly testing scripts, will show a difference because I tend to compare everything. So you get things like:

Comparing geosgcm_buda.20000414_2230z...
Failure!
Checking for data differences
Variable Group Count Sum      AbsSum          Min         Max       Range Mean      StdDev
lon      /        24   0 1.76748e-13 -1.42109e-14 1.42109e-14 2.84217e-14    0 9.27193e-15
lat      /        16   0 6.30607e-14 -7.10543e-15 7.10543e-15 1.42109e-14    0 4.91048e-15
Checking for metadata differences
DIFFER : VARIABLE : lon : TYPE : FLOAT <> DOUBLE
DIFFER : VARIABLE : lat : TYPE : FLOAT <> DOUBLE
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "add_offset" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "fmissing_value" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "scale_factor" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "standard_name" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "vmax" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : VARIABLE "DMDTANA" IS MISSING ATTRIBUTE WITH NAME "vmin" IN FILE "stock-gcm-2022Jan03-1day-c24//scratch/stock-gcm-2022Jan03-1day-c24.geosgcm_buda.20000414_2230z.nc4"
DIFFER : NUMBER OF ATTRIBUTES : VARIABLE : DMDTANA : 5 <> 11
...

nccmp treats the coordinate variables as data, so it reports that lat and lon, now doubles, are non-zero-diff against the former 32-bit/float lats and lons. On top of that, you get all the extra metadata on each variable. (And, as expected, all my GCM nightly tests threw comparison failures last night.)
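As a rough illustration of why a float-versus-double coordinate comparison is not zero-diff, here is a minimal Python sketch using only the standard library. The coordinate values and the `to_float32` helper are hypothetical, chosen for illustration, and the magnitudes it prints are not meant to reproduce the numbers in the nccmp output above.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then promote it back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical longitudes. The 360/7 degree spacing is used only because most
# of the resulting values cannot be represented exactly in 32 bits.
lon64 = [-180.0 + i * 360.0 / 7.0 for i in range(7)]

# The same coordinates as they would look after being stored as 32-bit floats.
lon32 = [to_float32(v) for v in lon64]

# A comparator that treats coordinates as data sees these differences.
diffs = [a - b for a, b in zip(lon32, lon64)]
n_differ = sum(1 for d in diffs if d != 0.0)
max_abs_diff = max(abs(d) for d in diffs)

print(f"{n_differ} of {len(lon64)} values differ; max |diff| = {max_abs_diff:.3e}")
```

Values that happen to be exactly representable in both precisions (such as -180.0) compare equal; the rest show up as small nonzero differences even though the model state is unchanged.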

In addition, each collection now carries more global metadata than before:

// global attributes:
		:Comment = "NetCDF-4" ;
		:Contact = "http://gmao.gsfc.nasa.gov" ;
		:Convention = "CF" ;
		:Filename = "geosgcm_prog" ;
		:History = "File written by MAPL_PFIO" ;
		:Institution = "NASA Global Modeling and Assimilation Office" ;
		:References = "see MAPL documentation" ;
		:Source = "unknown" ;
		:Title = "mapl12151-gcm-2022Jan06-1day-c24-Institution" ;
		:_NCProperties = "version=2,netcdf=4.8.1,hdf5=1.10.7" ;
		:_SuperblockVersion = 2 ;
		:_IsNetcdf4 = 1 ;
		:_Format = "netCDF-4" ;
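To make the difference between a data-only comparison and a compare-everything comparison concrete, here is a small Python sketch. It is a toy model only: files are represented as plain dictionaries, the `data_only_diff` and `full_diff` helpers are hypothetical, the attribute values are made up, and neither function claims to reproduce how cdo or nccmp actually work.

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for a file:
# variable name -> {"data": [...], "dtype": str, "attrs": {...}}
FileModel = Dict[str, Dict[str, Any]]


def data_only_diff(a: FileModel, b: FileModel, coords=("lat", "lon")) -> List[str]:
    """Names of non-coordinate variables whose values differ (assumed field-only check)."""
    return [n for n in a if n not in coords and a[n]["data"] != b[n]["data"]]


def full_diff(a: FileModel, b: FileModel) -> List[str]:
    """Report data, type, and attribute-name differences for every variable."""
    report = []
    for name in a:
        va, vb = a[name], b[name]
        if va["data"] != vb["data"]:
            report.append(f"{name}: data differs")
        if va["dtype"] != vb["dtype"]:
            report.append(f"{name}: type {va['dtype']} <> {vb['dtype']}")
        for att in sorted(set(va["attrs"]) ^ set(vb["attrs"])):
            report.append(f"{name}: attribute {att} present in only one file")
    return report


# Made-up "before" and "after" files: same field data, different coordinate
# type and extra attributes on the field.
old = {
    "lat": {"data": [-90.0, 90.0], "dtype": "FLOAT", "attrs": {}},
    "DMDTANA": {"data": [1.0, 2.0], "dtype": "FLOAT", "attrs": {"units": "x"}},
}
new = {
    "lat": {"data": [-90.0, 90.0], "dtype": "DOUBLE", "attrs": {}},
    "DMDTANA": {
        "data": [1.0, 2.0],
        "dtype": "FLOAT",
        "attrs": {"units": "x", "standard_name": "y", "vmin": 0.0, "vmax": 1.0},
    },
}

print(data_only_diff(old, new))
for line in full_diff(old, new):
    print(line)
```

In this toy setup the field-only check finds nothing, while the full check flags the coordinate type change and the added attributes, which mirrors the situation described in this issue.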

So here is the question part. We are planning to make Institution and Contact user-settable since people other than GEOS are using MAPL now (not sure if they use it for output, but just in case...). I was wondering: is there anything else here you would like to be user-settable? Obviously, you can change or add to this with ncatted or other utilities, but I need to start realizing that GEOS does not always mean ADAS, but also LDAS and ODAS.
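For anyone who wants to override these global attributes after the fact, the ncatted route mentioned above could look roughly like the sketch below. It only builds the command line and does not run it. The `ncatted_global_attrs` helper, the file name, and the attribute values are hypothetical; the `-a att_nm,var_nm,mode,att_type,att_val` form with `global`, mode `o` (overwrite) and type `c` (character) follows the NCO documentation, but check it against your installed NCO version before relying on it.

```python
import shlex
from typing import Dict, List


def ncatted_global_attrs(path: str, attrs: Dict[str, str]) -> List[str]:
    """Build an ncatted command that overwrites global character attributes in place."""
    cmd = ["ncatted", "-O"]  # -O: overwrite the file without prompting
    for name, value in attrs.items():
        # -a att_nm,var_nm,mode,att_type,att_val
        cmd += ["-a", f"{name},global,o,c,{value}"]
    cmd.append(path)
    return cmd


# Hypothetical file name and values, for illustration only.
cmd = ncatted_global_attrs(
    "example_collection.nc4",
    {"Institution": "Example Land Modeling Group", "Contact": "https://example.org"},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where NCO is installed.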

(cc: @bena-nasa @tclune)

@mathomp4 added the question and output labels Jan 7, 2022
@gmao-rreichle
Contributor

Hi @mathomp4, (cc: @jardizzo , @biljanaorescanin )
Many thanks for the heads-up and detailed description. Very helpful.
We'll keep in mind the potential test comparison "failures".
Currently, GEOSldas is used for SMAP L4_SM ops, but the MAPL HISTORY output is post-processed into the hdf5 format required by the SMAP mission, which involves a lot of custom metadata insertion. That is, SMAP users never see the HISTORY nc4 files or their metadata.
Going forward, GEOSldas may be creating files for a public data product (FP or MERRA-3) when GEOSldas is integrated into the ADAS to supplement the atmospheric analysis with a land analysis. In this case, we'd be using the same global attributes as the ADAS. I don't expect to need anything else in terms of metadata.
We tend to ignore metadata output in GEOSldas science experiments.
In short, all good as far as I can tell. No immediate action needed. Thanks again!
(I'm leaving the issue open for now. Feel free to close.)

@mathomp4
Member Author

mathomp4 commented Jan 7, 2022

I'll close it since you're aware (and it keeps things clean). Thanks!

@mathomp4 mathomp4 closed this as completed Jan 7, 2022