Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New interface to the Aether lat-lon model #635

Merged
merged 86 commits into from
Mar 12, 2024
Merged

New interface to the Aether lat-lon model #635

merged 86 commits into from
Mar 12, 2024

Conversation

kdraeder
Copy link
Contributor

@kdraeder kdraeder commented Feb 9, 2024

Description:

Add a new interface to enable assimilation with the lat-lon formulation of the Aether space weather model.

Fixes issue

Fixes #559

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
    • documentation external to the interface has been updated

Tests

We compiled and built state conversion programs aether_to_dart and dart_to_aether
and tested them on the latest Aether restart file sets.
Model_mod_check passes tests 1-5 and 7, except for expected failure of interpolation at the poles.
Filter is able to assimilate several observations of AIRS temperature
and GPS profiles of electron density (even though that's not available in the restart files).

Checklist for merging

  • Confirm that we will not develop cycling scripting at this time (or develop it).
  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed

kdraeder and others added 30 commits October 12, 2023 15:41
…. This is the 0th iteration of development, to make a common area for all developers.
These subroutines are for reading restart block (subdomain) files.
The entry point to them is through program aether_to_dart.f90,
which calls model_mod.f90:restart_files_to_netcdf.

Also updated aether_to_dart.f90 and aether_to_dart.nml
to replace gitm with aether.

I copied the tiegcm/work directory in order to compile.
It may have domain-related files which should be removed.

Among the many things that need to be fixed:
- the model_mod headers have not been merged, so it doesn't compile
- global variable definitions
- global variable allocations and initializations
- removing extra 'domain's used in tiegcm
- replacing variable names having TIEGCM and GITM (case independent)
  with Aether.

All of this code is untested.
The (gitm) code that reads the block(subdomain) files can
continue to use the (x,y,z) dimensions.  That will differentiate it
from the code that creates the whole domain filter_input.nc,
which will use more conventional dimensions (lon,lat,alt).
Aether uses altitude instead of a pressure-type vertical coordinate,
but parts of model_mod will need pressures at these altitudes.

This commit focused on updating the dimensions used to define
the filter_input.nc and some aspects of reading the block restart files.
It does not:
   read in restart data (field names and source files are undecided).
   write whole domain data to a filter_input.nc
   read a filter_output.nc and divide it into block files
   make a running aether_to_dart program

Modified:
aether_to_dart.f90
   Dimensions {lons,lats,alts} for whole domain, {x,y,z} for blocks.
   Initializes using static_init_blocks, not static_init_model.
aether_to_dart.nml
   Updated from tiegcm to aether names.
model_mod.f90
   Commented out many subroutines that interfered with compiling
     just aether_to_mod.  They may be needed later.
   Added restart file dimension variables to global storage.
   Translate time.json times into gregorian calendar.
   Calculates {nlon,nlat,nalt} from dimensions of block restart files.

temporarily renamed in order to compile just aether_to_dart:
   dart_to_aether.f90
   dart_aether_mod.f90
Adapted code to handle the restart files' fields having C ordering.
  The filter_input.nc will use the same order, to avoid transposes.
Replaced gitm's field-by-field code with loops over fields
  read from model_mod_nml.
Made code capable of handling subdomains in the vertical, without halos.
Moved definition of filter_input.nc fields from the unpack_data
  up to the fields loop in read_data_from_block, where it still doesn't
  fit with the subroutine name.
Made code write whole domain data to a filter_input.nc.

Aether_to_dart compiles, runs, and the filter_input.nc appears to have
  the blocks combined in the right order.

This commit does not:
  include dart_to_aether code to read a filter_output.nc
    and divide it into block files
  have the extraneous code and comments taken out.
  get grid info from model_mod_nml instead of the UAM.in text file.
  allow interspersing of neutral and ion fields in the model_mod_nml
    variable list.
  implement handling of e- or f10_7

Modified:
aether_to_dart.f90
  Read member number from standard input (script) and pass to
    restart_files_to_netcdf.
aether_to_dart.nml
  Added full pathname to aether_restart_input_dirname
model_mod.f90
   See above
   Read time from a restart file and assimilation_period from model_mod_nml
     instead of from time.json.
   Adapt "domains" feature to specify the source file of each of the fields,
     in model_mod_nml.

   Change name of read_data_from_block to reflect definition of output
     and integration of blocks?
model_mod.f90
   Replaced gitm's plain fields list with variable_table contents.
   Specified the source file_root of each field; 'neutrals' or 'ions'.
   Replace file name variables with filter_inout_dir.
   Added calendar and assimilation_period.
Removed
  ilev code; no fields defined there or needed for interpolation.
  domains code
  ZG code

Straightened out reading times:
  Moved definition of reference day and time to static_init_blocks and
    static_init_model
  Removed existing read_model_time and changed get_state_time to
    read_model_time.
  Extracted the aether file name creation from open_block_file
    into function block_file_name.  Then block_file_name can be called
    before the read_ routines, which want a filename, not a directory.

Modified:
model_mod.f90
  Extracted the restart file name creation from open_block_file
    into function block_file_name.
  Made aether_ref* global variables.
  Removed more domain code.
  Renamed verify_variables to make_variable_table,
    because it doesn't verify.

The output filter_input.nc Temperature matches
  the 4 neutrals restart files and O+ matches the 4 ions restarts.
  (mac:~/DAI/Aether/Reordered_dims/compare*png)

Todo:
  Can static_init_{model,blocks} be combined?
    Or both call a common subroutine?
  Remove filter_io_filename from get_grid_from_netcdf arg list? (it's global)
  Convert some prints to error_handler.  Delete the rest.
  Write dart_to_aether
  Merge the *_to_* code into Ben's model_mod.
This should be done before dart_to_aether and merging.
No functional changes, in order to make a clean transition.
Diffuse won't be able to easily show functional changes
from the previous commit.

Moved private routines to the private section,
public to public, and a new initialization section.
Removed unused sortindexlist.

Tested.
aether_to_dart and dart_to_aether need to be in the
model_serial_programs list, not serial_programs.
Otherwise, `quickbuild aether_to_dart` finds dart_to_aether.f90
and processes it like a module.
Changed aether_to_dart routine names to describe what they do.
That clarifies how to write the dart_to_aether routines.
>>> It turns out that the halos at the poles are not copies
    of points from over the pole.  There's an interpolation,
    which may need happen in dart_to_aether.

Complete update of dart_to_aether.f90 from GITM version to Aether.
Inverted the ivar and {ib,jb} loops:
  + read in a whole domain of a field from filter_output.nc,
  + add a halo around the whole thing,
    which contains updated state variables,
  + loop over the blocks to extract the data+halo for each
    and write to the block restart file.
Renamed some routines to better describe what they do.

It compiles and runs correctly (caveat the halo interpolation issue).

If this is too much opening and closing restart files,
  open all block restart files at the start,
  and close after all vars are written.

To be done, especially after getting restart file changes from Aaron:
  - Update dimension orders in arrays and NetCDF calls.
  - Update field names.
  - Add sections to handle TEC and f10_7.
  - Replace add_halo_fulldom3d poles sections with interpolations.
  - Remove remaining commented GITM code.
It appears that Aether restart files will continue to be not CF-compliant.
This commit adds functions to convert the Aether field names into:
a) something that the nf90 routines can find in the restart files,
b) something CF-compliant for writing to the filter_input.nc files.
E.g. 'Perp.\ Ion\ Velocity\ \(Vertical\)\ \(O+\)' ->
  a) 'Perp. Ion Velocity (Vertical) (O+)' ->
  b) 'Opos_Perp_Ion_Velocity_Vertical'
Moving the species from the end to the start will make output from
some NetCDF processing tools more coherent by grouping all fields
related to a species together, instead of all "Perp."... fields.

The strategy is to fill aether_to_dart_nml and dart_to_aether_nml
"variables" with Aether names, and model_nml "variables" with the
corresponding CF-compliant names.  It may be necessary for a user
to run aether_to_dart independently of an assimilation to generate
the CF names.
_to_ programs no longer read model_mod_nml.

aether_to_dart.f90
   Moved the namelist read from here into model_mod:static_init_block

aether_to_dart_nml, dart_to_aether_nml
   Added 'variables'

model_mod.f90:
   aeth_name_to_dart
     Construct a CF-compliant field name from strings derived from
       the Aether field name.
   purge_chars
     Remove unwanted characters from the input string.
     Optionally replace '+' and '-' with 'pos' and 'neg'
   obstypelength, from types_mod, is too small for Aether names
   Removed global variables from some argument lists
   Initialize restart file fields (dart_to_aether) with NF90_FILL_REAL
     so that halos of variables which have non-meaningful data
     will appear that way.

This code runs and creates a correct filter_input_0001.nc file.

TODO:
  Implement these changes in dart_to_aether.
  Decide whether the QTYs assigned to ion velocities are optimal.
  Permute the dimensions from Aether's (lon,lat,z) (z varies first)
    with CF (z,lat,lon) (lon varies first).
  Debug why some character dimensions couldn't use NF90_MAX_NAME
    and needed 256 instead.
  Remove extraneous code; debugging prints and unused variables.
Similar to the previous commit of aether_to_dart.
Some minor changes to dart_to_aether to be consistent.

dart_to_aether
   Moved namelist read into model_mod:static_init_blocks.
   Removed arguments from netcdf_to_restart_files,
     which carried file info from the namelist.
dart_to_aether_nml
aether_to_dart_nml
   Now have consistent contents, including filter_io_root
     which is used to create the DART file name.
model_mod
   Handles *_to_* namelists consistently.
   static_init_blocks; takes a namelist name to read either
     dart_to_aether_nml or aether_to_dart_nml.

Both *_to_* programs passed the visual tests.
The Aether restart files, after the conversion back and forth,
   have the expected "missing values" in the halos,
   instead of pretend data.

TODO;
   Transpose dimensions in the filter_io files to be CF-compliant.
   >>> This will probably be done in Aether, so model_mod will need
       to be updated.
   Clean out junk.
   Merge with Ben's pieces of model_mod.
Worked through list from compiler.
Kept stuff used in subroutines Ben is developing
or that may be needed for TEC, f10.7.
Converted many print* to error handler.
Ben merged the model_mod used to develop aether_to_dart and dart_to_aether
into the model_mod developed to do the filter tasks; interpolation, ...
That was developed from the model_mod template, with changes to comply
with the DART style manual, so there are many non-algorithm differences
from the previous *_to_* model mod committed to branch 'aether'.
The merged version from Ben didn't quite pass model_mod_check.

Besides the cosmetic changes, this merged version:
+ Reads variables (2D array) directly from namelist; no variable_table.
+ Fixes the dimension order in 'variables' to be (cols,fields)
  so that values from the namelist will be read correctly.
+ Made variables() have 6 columns, to include the source file root.
+ Replaced lengths of variables()' strings and related variables
  with types_mod:vtablenamelength (instead of NF90_MAX_NAME)
+ Replaced hard-coded dimension names 'lon', 'lat', 'z'
  with already defined parameters {LON,LAT,LEV}_{DIM,VAR}_NAME.
+ Added do_output() conditionals around messages in the *_to_* procedures.
+ Removed:
  - not-yet-used code dealing with TEC and f10.7,
  - routines from TIEGCM which are not needed,
  - suggestion of a derived type to handle the various file types,
+ Added the time (UNLIMITED) dimension and variable to filter*.nc files.
  Misguidedly, this was done by adding 'time' to nc_write_model_atts,
  which works fine for aether_to_dart to create filter_input files.
  But it breaks model_mod_check, because create_and_open_state_output
  also defines the time dimension (also UNLIMITED), which is not allowed.
  So it seems that in aether_to_dart we need to replace the direct use
  of nc_write_model_atts for creating filter_input.nc.  Instead,
  define a domain (add_domain_from_file, since I have a filter_input
  with the right dimensions), which is then used by a new call
  to write_state (... create_and_open_state_output)

Model_mod_check, aether_to_dart, and dart_to_aether all compile.
Aether_to_dart and dart_to_aether work, but see "time", above.

aether_to_dart.f90
   Removed unneeded 'use' procedures
aether_to_dart.nml, dart_to_aether.nml
   Added debug
model_mod.f90
   See above
model_mod.nml
   Removed f10_7 variables, calendar.
   filter_io_dir -> filter_io_filename (should probably be filter_io_root)
   assimilation_period_seconds -> time_step_{days,seconds}
   Replaced Aether O+ with DART's Opos
transform_names.f90
   New program to test code to make variable names CF compliant
work/input.nml
   See model_mod.nml
   replaced model_mod_check_nml contents with current and Aether variables
work/input.nml
work/quickbuild.sh
   added transform names
This matlab script creates new block restart files
with the names Aaron provided in mid-January, 2024.
The data are the same as the input (old names) files.
This is a commit of the the development script,
which converts only 2 neutrals and 2 ions for the
requested number of block(s) and member(0-based).
Some of the variable names in the ensemble Aaron sent
had already been updated, compared to the restartOut.Sphere.1member...
and O was added to the neutrals.
This script converts all of the variables and was applied to 20 members.
Abandoned using domains for *_to_* (designed for ensembles).
Nc_write_model_atts should not define dimensions, especially
an UNLIMITED time.  That's done by create_and_open_state_output.
Instead, the *_to_* programs should write a time scalar variable
and not define a time dimension.

This commit was tested using aether_to_dart, dart_to_aether,
and tests 1-5,7 in model_mod_check.  It is a functional baseline
before making style and cosmetic changes for the code reviews.
Also still to do:
  update rst files,
  merge with any updates Ben has made to the assimilation routines,
  resolve the QTY issue.

model_mod.f90
  Extracted creation and filling of dimension variables
  from nc_write_model_atts to a new def_fill_dimvars.
  Removed time dimension from temp arrays and calls to nc_put_variable.
  Added 'units' attribute to the filter_input.nc files.
aether_to_dart.nml
dart_to_aether.nml
work/input.nml
  New Aether field names (complete list) and lists which Aaron says
  will be important in the state vector.
transform_names.f90
  removed because it's outdated
issue_QTYs
  File describing the QTY situation for space weather variables
  as of 2024-1-25.  Current testing doesn't require resolution,
  but future uses probably do.
comments containing KDR and NEWIC
Trimmed and updated comments about TODO, f107, TEC, GITM, TIEGCM.
Indenting, spaces, alignment, routine separation.
Add descriptions of routines to their headers.
Eliminate doxygen code.
Replace hard coded routine name args with local variable 'routine'
remove unused procedures from 'use'
   check_use_routines.csh
Check routine local variables for definition with a value,
dimension(:,...) :: variable   -> :: variable(:,...)
Named loops if they're > 1 page, have a 'cycle' or 'exit',
   or loop over non-array-indices
error handler character output should be formatted with '(A)', not *
Capitalization; replace CamelCase with multi_word
constants need _r8

Still to do
   Capitalization
?     parameters can (should?) be ALL_CAPS.
         Not consistent in DART (types_mod has ALL_CAPS, ALLCAPS,
         alllowercase and multi_word)
   use parameters from types_mod.f90;
     varnamelength = 31, metadatalength   = 64
?    Are these up-to-date with the fortran we're using?
?    Should we use parameters that have the right value in contexts
       where the name is misleading?
       character(len=varnamelength)     :: calendar = 'GREGORIAN'
     Or define additional length parameters for this module
       in the global defs?
   filter_to_restarts: no nc_count?, add error message
   Look into precision of dimension vars (lons, lats, levs)
because it's reading whole arrays from filter_output.nc
(into subsections of larger arrays).
Removed unused variables from dart_to_aether.f90.

Tested in aether_to_dart, dart_to_aether, and model_mod_check.
NF90_FILL_REAL is used to fill the halo regions of Aether with values
that NetCDF will recognize as "not data".
It's a pass-through variable in netcdf_utilities.
@johnsonbk
Copy link
Collaborator

johnsonbk commented Feb 14, 2024

Thanks for everyone's work on this pull request. I ran into two problems following the directions in readme.rst.

First problem: model_mod_check test number 2 fails

Starting with a fresh clone of DART

cd <abs path to local DART installation>
git clone https://github.com/NCAR/DART.git DART
cd DART/
git fetch origin
git switch aether
cd build_templates
mv <suitable mkmf.template> mkmf.template
cd ../models/aether_lat-lon/work
sftp <user>@data-access.ucar.edu
get /glade/work/raeder/Exp/Aether/Ens_renamed_derecho.tgz
exit
tar -zxvf Ens_renamed_derecho.tgz
cp Ens_renamed_derecho/filter_input_0001.nc ./
./quickbuild.sh
# The executables compile
./model_mod_check
...
***************** RUNNING    TEST 2    ***********************
 -- Read and write restart file
**************************************************************
--------------------------------------------------------------
 Reading File : filter_input_0001.nc
--------------------------------------------------------------
 ERROR FROM:
  source : dart_time_io_mod.f90
  routine: read_model_time:
  message:  inconsistent calendar types between DART program and input file.
  message: ...  DART initialized with: NO_CALENDAR File uses: GREGORIAN
  message: ...  You may need to supply a model-specific "read_model_time()" to read the time.

Perhaps this failure is related to commit 3c7d8d?

Second problem: aether_to_dart doesn't use the aether_restart_dirname namelist entry in open_block_file

While trying to run aether_to_dart:

pwd 
<abs path to local DART installation>/DART/models/aether_lat-lon/work
vim input.nml
# Edit input.nml to set
aether_restart_dirname = 
      '<abs path to local DART installation>/DART/models/aether_lat-lon/work/Ens_renamed_derecho/'

# Continuing with the directions in readme.rst:
cd <aether_restart_dirname>
mkdir Orig
cp *m0000* Orig/

The following command to change directory back to the work directory is not in the directions in readme.rst but that's where aether_to_dart is compiled.

cd <abs path to local DART installation>/DART/models/aether_lat-lon/work
./aether_to_dart  0
...
ERROR FROM:
  source : aether_lat-lon/transform_state_mod.f90
  routine: open_block_file
  message: cannot open file grid_g0000.nc 
                                                                                                                   for read

If aether_to_dart and input.nml are in the aether work directory and the aether netcdf files are in a different directory, <aether_restart_dirname> this error is thrown.

Attempt number 1 to fix this error involves copying the grid files to the work directory

cp <aether_restart_dirname>/grid_g000?.nc ./
./aether_to_dart 0
...
ERROR FROM:
  source : aether_lat-lon/transform_state_mod.f90
  routine: open_block_file
  message: cannot open file neutrals_m0000_g0000.nc                                                                                                                                                                                                                                          for read

Attempt number 2 to fix this error involves copying the aether_to_dart executable and input.nml to <aether_restart_dirname>.

cp aether_to_dart <aether_restart_dirname>
cp input.nml <aether_restart_dirname>
cd <aether_restart_dirname>
./aether_to_dart
...
aether_to_dart Successfully converted the Aether restart files to 'filter_input_0001.nc'
 
 --------------------------------------
 Finished ... at YYYY MM DD HH MM SS = 
                 2024  2 14 11 23 29
 --------------------------------------

The cause of this unexpected behavior seems to be that the function open_block_file is not passed a filename including the path of <aether_restart_dirname>. The executable only works if it is copied to <aether_restart_dirname> or if <aether_restart_dirname> is the same directory as the aether work directory. The reason this behavior is unexpected is that aether_restart_dirname is set in transform_state_nml so a user would expect that aether_to_dart would actually use that namelist entry as the directory in which the restart files are stored.

Co-authored-by: Helen Kershaw <20047007+hkershaw-brown@users.noreply.github.com>
@hkershaw-brown
Copy link
Member

...
***************** RUNNING TEST 2 ***********************
-- Read and write restart file



Reading File : filter_input_0001.nc

ERROR FROM:
source : dart_time_io_mod.f90
routine: read_model_time:
message: inconsistent calendar types between DART program and input file.
message: ... DART initialized with: NO_CALENDAR File uses: GREGORIAN
message: ... You may need to supply a model-specific "read_model_time()" to read the time.


Perhaps this failure is related to commit 3c7d8d?

This is my bad, I meant force the calendar to be Gregorian rather than having a variable calder (implying a user choice). I'll go ahead and fix this.

@kdraeder
Copy link
Contributor Author

@johnsonbk Thanks for working through the file transformation instructions.
I'll work on that, unless you already started fixing the issues.

The cause of this unexpected behavior seems to be that the function open_block_file is not passed a filename including the path of <aether_restart_dirname>. The executable only works if it is copied to <aether_restart_dirname> or if <aether_restart_dirname> is the same directory as the aether work directory. The reason this behavior is unexpected is that aether_restart_dirname is set in transform_state_nml so a user would expect that aether_to_dart would actually use that namelist entry as the directory in which the restart files are stored.

@johnsonbk
Copy link
Collaborator

I haven't begun to start fixing the transform_state bug, @kdraeder. Please proceed.

@kdraeder
Copy link
Contributor Author

Second problem: aether_to_dart doesn't use the aether_restart_dirname namelist entry in open_block_file

This brings up a strategy question, which I thought about at the beginning, but lost track of.
We've decided not to develop full cycling scripting at this time,
so we don't have that context to define the use of aether_to_dart, filter, and dart_to_aether.
The experiment structure I'm most familiar with has the following:

  1. All of the DART executables are copied to an exec_dir.
  2. All of the ensemble model output is written to a run_dir and has a full date in the filenames. (The Aether restart file names do not have the date in them, so the scripting will need to prevent overwriting of files which need to be kept.)
  3. The assimilation script runs the executables in the run_dir using full pathnames of the executables.
  4. All of the DART output is written to the run_dir.
  5. The updated state variables are written into the existing model restart files.

An advantage of this is that the run_dir does not need to be specified in the DART executables
because the scripting changes to that directory before running the executables.
They only need to know file names. So far I don't see any disadvantages.

Is this the structure we want here?
Or do we want the executables to be able to find the restart files wherever they are?
And write their output to an output location, which may be different from the input files location?

@nancycollins
Copy link
Collaborator

the wrf folks seem to prefer putting each ensemble member's input and output in a separate subdir, probably because that makes the model advances easier to run in parallel by letting each member run in their own subdirectory without interfering with each other.

as long as it doesn't complicate the code too much, enabling flexible options for how someone wants to run both the assimilation and the model advances would be good. i know in cesm that the model advances can run in parallel using cesm options, but that may not be true for aether.

also, you probably don't want to assume that the scripting is always going to do a model advance before an assimilation, like cesm does. it's very useful for exploring parameter/namelist settings or debugging to run filter alone from a simple batch script without any other scripting requirements, as long as all the files are in the correct place with the correct names.

having said all that, it sounds like getting something running is the highest priority, so whatever works quickly might be the strategic thing here.

Added aether_restart_dirname to all filenames that are opened.
Removed aether_restart_dirname (and local names) from argument lists
because it's a global variable.
Updated readme.rst with better instructions for testing *_to_*
and clarifications.
Made the default aether_restart_dirname = '.' so that it can be
left out of the namelist when running the executables
in a run directory where the restart files are.

Passed file conversion tests and model_mod_check tests 1-5,7.
@kdraeder
Copy link
Contributor Author

I opted to make aether_to_dart and dart_to_aether use the same directory for the Aether and filter restart files,
but allow those programs to executed from any directory.
I tested and pushed those commits.


::

> set exec_dir = $DART/models/aether_lat-lon/work
Copy link
Member

@hkershaw-brown hkershaw-brown Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

say this is c shell

Comment on lines +83 to +86
if (num_args == 0) then
write(error_string_1,*) 'Usage: ./aether_to_dart member_number (0-based)'
call error_handler(E_ERR, progname, error_string_1)
endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess whoever runs this would need to write a loop to generate all filter restarts one by one. Any reason why not generate all members internally? Is there a scenario where a user doesn't need to create all members?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We needed working code as soon as possible and a single member program was simpler to write.
It's possible to run multiple aether_to_darts in parallel, so it shouldn't be much slower than
having aether_to_dart handle all members at once.
There are testing and development scenarios which only need 1 member, and perfect_model_obs.

Comment on lines +222 to +224
allocate(temp3d(1:nz_per_block, &
1-nghost:ny_per_block+nghost, &
1-nghost:nx_per_block+nghost))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to ncview individual members are running aether_to_dart but the fields don't show up as I expect. I was hoping to see a global field and then change through the levels. However, I could only change through the lon dimension which is not very useful. This is because the dimensions of the 3D variables are ordered as lon, lat, alt. I ran a simple 'ncpdq -a alt,lon filter_input_0001.nc filter_input_0001_swap.nc' on the file to permute the dimensions and then I was able to see the entire globe at different levels. I suggest modifying aether_to_dart to adjust the dimension order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The coordinate order is annoying. Aether won't change, and we decided to use it in order to make the conversion programs simpler to code and run faster (no transposing).
Ncview can show a lat-lon slice by changing the axes. But the lat and lon are transposed, so it's not perfect. Other common viewing tools can transpose it.

! http://www.image.ucar.edu/DAReS/DART/DART_download
!

program aether_to_dart
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I run aether_to_dart with an argument that is larger than the ensemble size, it fails. Yet, it creates a filter_input netcdf file including only dimension variables. I think it should fail before creating any file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I'll try to fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is to read the model time from the actual member instead of member 0.
It will fail before the filter_input.nc file is created.

Comment on lines 133 to 135
template_file = 'filter_input_0001.nc'
variables = 'Temperature', 'QTY_TEMPERATURE', 'NA', 'NA', 'UPDATE',
'Opos', 'QTY_DENSITY_ION_OP', 'NA', 'NA', 'UPDATE'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need any clamping? maybe for O+?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could certainly make 0.0 be the lower limit for the ion densities and temperatures, since we expect that inflation will sometimes make those be negative. Or leave that as an exercize for the user.
Whether it's needed or not depends on whether Aether and the assimilation setup work well together. One advantage of not clamping (especially the max values) is that it helps users see when an assimilation is failing, instead of silently correcting itself.

/

&assim_tools_nml
cutoff = 0.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the cutoff, is 0.2 the default that we want users to use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that this is a science question the users will need to answer.
0.2 has been reasonable in other global models, but if @johnsonbk has any insight, we can set a different default value.

&perfect_model_obs_nml
read_input_state_from_file = .true.
single_file_in = .false.
input_state_files = 'wrfinput_d01'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in pmo: the input_state_files is set to 'wrfinput_d01'. We probably need to change that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will need to be changed, but pmo is not in the scope of this stage of development,
so we haven't looked at that. I'll replace wrfinput_d01 with "pmo not ready for use".

qceff_table_filename = ''
/

&quality_control_nml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add outlier_threshold = 3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added that, and 'input_qc_threshold = 3.0'. This input.nml hasn't been used directly for assimilation, so it may be missing other variables, or have suboptimal values.
@johnsonbk may also have an input.nml that's been updated to do an assimilation.

Comment on lines 83 to 93
inf_flavor = 0, 0,
inf_initial_from_restart = .false., .false.,
inf_sd_initial_from_restart = .false., .false.,
inf_deterministic = .true., .true.,
inf_initial = 1.0, 1.0,
inf_lower_bound = 0.0, 1.0,
inf_upper_bound = 1000000.0, 1000000.0,
inf_damping = 1.0, 1.0,
inf_sd_initial = 0.0, 0.0,
inf_sd_lower_bound = 0.0, 0.0
inf_sd_max_change = 1.05, 1.05,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set inf_flavor to 5. Also, change inf_sd_initial and inf_sd_lower_bound to 0.6

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Collaborator

@nancycollins nancycollins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just left a comment on the change to the dart version of the netcdf utilities cover functions. it's not important for getting aether running, but sometime in the future someone should go back and remove all the NF90 constants and make better named functions that either return those values or do the code that uses the constants.

@@ -63,7 +63,7 @@ module netcdf_utilities_mod
nc_begin_define_mode, &
nc_end_define_mode, &
nc_synchronize_file, &
NF90_MAX_NAME, NF90_MAX_VAR_DIMS
NF90_MAX_NAME, NF90_MAX_VAR_DIMS, NF90_FILL_REAL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure when the NF90 constants got added to this fie, but the intent was never to replicate the actual netcdf utilities mod. adding more constants to this is just going down the wrong path. these functions were intended to be more user-friendly ways to access netcdf without having to get into the weeds of the netcdf interface. if you need these constants in your code, 'use netcdf' or make a cover function that has the name of what you're doing and uses this constant inside the function. e.g. from the name i'm guessing it fills an array. make a
nc_fill_array_blank(array) that doesn't require netcdf constants in the calling code.

i know this isn't an important fix, but it's a long-term bad idea. maybe after the crush of getting this working is done, someone can go back and make this better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I didn't do much work to figure out the existing strategy for using netcdf parameters.
The template I was working from used some, so I assumed it was not terrible to do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is the only reason i brought it up. once the camel's nose is in the tent...
it's not important for this pull request and a later change to your code will be easy. i will open an issue on this because i'm not requesting a change to this pull request, so this is the wrong place to address this.

kdraeder and others added 2 commits March 12, 2024 09:07
Made it fail before filter_input.nc is created when the Aether restart
  set doesn't exist.
Updated input.nml with useful values for an assimilation.
Deleted matlab function used to import new Aether field names
  into old Aether restart files.  Aether_to_dart works with
  the new names, so this script will probably never be used again.
@mgharamti
Copy link
Contributor

@kdraeder I tested your changes. I am still getting a restart file even though the argument is larger than the ensemble size. I was expecting it to fail before creating a new netcdf file. And btw, we need the same logic applied to dart_to_aether.
This, however, is not a deal breaker and I'm OK to move forward with the PR in its current form.

@mgharamti
Copy link
Contributor

Great, thanks Kevin. This works in both directions now!

Copy link
Member

@hkershaw-brown hkershaw-brown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moha and Kevin in person approved!

@hkershaw-brown hkershaw-brown merged commit 2ce02fb into main Mar 12, 2024
4 checks passed
@hkershaw-brown hkershaw-brown deleted the aether branch March 14, 2024 17:40
hkershaw-brown added a commit that referenced this pull request Oct 31, 2024
and to supported models
Also added aether_lat-lon (missed from pull #635
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Develop Aether lon-lat interface
5 participants