[develop] Add smoke and dust verification #1174
base: develop
Conversation
…les, that are expected to be created once the task is finished actually get created. This is needed because it is possible that for some forecast hours for which there is overlap between cycles, the files are being retrieved and processed by the get_obs_... task for another cycle.
…nd EnsembleStat tasks such that GenEnsProd does not depend on the completion of get_obs_... tasks (because it doesn't need observations) but only forecast output while EnsembleStat does.
…d due to changes to dependencies of GenEnsProd tasks in previous commit(s).
…tending to time out for 48-hour forecasts.
…sure PcpCombine operates only on those hours unique to the cycle, i.e. for those times starting from the initial time of the cycle to just before the initial time of the next cycle. For the PcpCombine_obs task for the last cycle, allow it to operate on all hours of that cycle's forecast. This ensures that the PcpCombine tasks for the various cycles do not clobber each other's output. Accordingly, change the dependencies of downstream tasks that depend on PcpCombine_obs output to make sure they include all PcpCombine_obs tasks that cover the forecast period of that downstream task's cycle.
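A minimal sketch of the hour-selection rule described in this commit message. The function name and arguments are made up for illustration; this is not the SRW implementation, just the shape of the logic.

from datetime import timedelta

def pcpcombine_obs_valid_times(cycle_start, next_cycle_start, fcst_len_hrs, is_last_cycle):
    """Hourly valid times this cycle's PcpCombine_obs task should process (illustrative)."""
    if is_last_cycle:
        # Last cycle: process every hour of its own forecast, through the final hour.
        n_hours = fcst_len_hrs + 1
    else:
        # Other cycles: only the hours unique to this cycle, i.e. from its initial
        # time up to (but not including) the next cycle's initial time, so that
        # cycles never clobber each other's output.
        n_hours = int((next_cycle_start - cycle_start).total_seconds() // 3600)
    return [cycle_start + timedelta(hours=h) for h in range(n_hours)]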
…ossibly also get_obs_ndas by putting in sleep commands.
- There are lots of task-specific checks that always run regardless of task inclusion: add some checks there so that we don't have to include unnecessary variables like PREDEF_GRID_NAME in vx-only experiments - There were a few task-specific checks that DO check for task inclusion, but the checks were broken: fix those - Move dict_find from an inline function in setup.py to a proper external python function
task-dependent logic checks - Break all FV3 namelist logic out into a new function, setup_fv3_namelist - Only call this new function if the run_fcst task is active - Delay exporting of variables further down the page (need to completely eliminate this eventually) - Replace some *_vrfy commands with their proper versions - Eliminate some unnecessary variables and block comments
…, need to create observation directories if they don't exist
…not specified, include correct valid VX_FIELDS for new variables
…w; more tasks to come!
- New metplus conf file - New J-job and exscript for new task - New task entry in wflow/verify_pre.yaml - New variables for obs filenames and ASCII2NC output filenames - New entries in various scripts for new task - ush/get_metplus_tool_name.sh - ush/setup.py - ush/set_vx_fhr_list.sh - Updating some comments - Stage test observations on disk for faster testing
- Add PM10 as a valid ob type - Update PcpCombine.conf template to allow obs other than CCPA, USER_DEFINED command - Fix task name for ASCII2NC - Add PCPCombine tasks for PM - Fix check of airnow ob file name in exregional_get_verif_obs.sh - ASCII2NC doesn't need beta version of MET - Update some comments in config_defaults.yaml - Pythonize ush/set_vx_fhr_list.sh with help from ChatGPT; this results in an insane speedup (100 seconds to check forecast files --> ~ 1 second)
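For context on the Pythonization mentioned above, a rough sketch (not the actual ush/set_vx_fhr_list.py) of the kind of check being done: building the list of forecast hours whose files already exist on disk with cheap os.path calls instead of repeated shell subprocesses, which is where the speedup comes from. The function name and filename template are illustrative assumptions.

import os

def available_fcst_hours(data_dir, fn_template, fhrs):
    """Return the forecast hours whose expected files exist under data_dir (illustrative)."""
    # fn_template is assumed to contain a {fhr:03d}-style placeholder, e.g.
    # "prslev.f{fhr:03d}.grib2" -- purely illustrative, not the SRW template.
    return [fhr for fhr in fhrs
            if os.path.isfile(os.path.join(data_dir, fn_template.format(fhr=fhr)))]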
importing the necessary METplus functions directly. This will need some attention before merging to ensure it is platform-independent, only working on Hera for now. But the smoke stuff is Hera-specific for now regardless.
don't get any matched pairs. However, it seems as if the example case has the same issue, so I'll need to figure out what's going on there. - Update vx_config_det.yaml for correct obs names - Update verify_det.yaml to make the PointStat metatask loop over obtypes, so we can combine NDAS with smoke vx - Add PM10 to ASCII2nc_obs - Remove verbose flag from set_vx_fhr_list.py call in exregional_check_post_output.sh so we get correct FHR results - Update exregional_run_met_gridstat_or_pointstat_vx.sh: uses beta release, can handle smoke vx obtypes for PointStat - Remove deleted script from ush/source_util_funcs.sh
files are unique! Also, make the metatask rules for ASCII2nc simpler
- Produce hourly nc obs files for AOD - Probably doesn't make a difference, but explicitly reference AOD as "AERONET_AOD" in POINT_STAT_MESSAGE_TYPE
…from RRFS is 550 nm. This gets us matched pairs!
@MichaelLueken I could also pull this
Yeah, in improving the symlinking, EPIC does have the fix files in an S3 bucket - https://noaa-ufs-srw-pds.s3.amazonaws.com/index.html#develop-20240618/fix/. It would be interesting to see if pulling from the S3 bucket and updating the
I would recommend attempting to pull the fixed files from the S3 bucket and seeing if the tests will pass using the pulled files. If everything works as expected, then we can move forward. Should the tests continue to fail even with the retrieved fixed files, then the
…slashes in SED string
…smoke/dust WE2E test to testing suites (ufs-community#1195) * Update weather model hash to 8933749 from February 19, 2025 * Add smoke_dust_grid_RRFS_CONUS_3km_suite_HRRR_gf WE2E test to coverage.gaea-c6, comprehensive and comprehensive.orion (sym linked to comprehensive.gaea-c6 and comprehensive.hercules) * Update modulefiles/build_hera_gnu.lua to allow smoke and dust WE2E test to run on Hera GNU. * Address documentation failures now that https://www.fvcom.org site's security certificate has been renewed * Change gaea to gaeac5 and gaea-c6 to gaeac6 throughout to address node name change in Jenkins * Remove Jet support in Jenkins --------- Co-authored-by: EdwardSnyder-NOAA <Edward.Snyder@noaa.gov>
@MichaelLueken I wasn't able to retrieve the fix file data because there wasn't enough disk space on the standard GitHub runner. I was able to get the tests to pass by creating "dummy" directories on the runner and modifying the machine file to point to those. See the changes to the first two files here: https://github.com/ufs-community/ufs-srweather-app/pull/1174/files/d30a8ce5364c98ae9dbbbf0e8939d16c35f567a4..644dbda46e4407ee959d1e9105b01847013b1a0b
Excellent job on finding a fix for the generate_FV3LAM_wflow unit test!
I only have two minor issues:
- The changes in run_vx.local.lua should also be applied to modulefiles/tasks/gaeac5.
- The changes in the machine files should also be applied to ush/machine/gaeac6.yaml.
I was able to successfully run the new MET_verification_smoke_only_vx WE2E test on Hercules:
----------------------------------------------------------------------------------------------------
Experiment name | Status | Core hours used
----------------------------------------------------------------------------------------------------
MET_verification_smoke_only_vx_20250224104122 COMPLETE 0.52
----------------------------------------------------------------------------------------------------
Total COMPLETE 0.52
Will approve once modulefiles/tasks/gaeac5/run_vx.local.lua and ush/machine/gaeac6.yaml have been updated.
Thanks, @mkavulich!
I've left some change requests for minor changes that should be made and a few others that are optional. A few are questions that could lead to a new comment line. I don't need to review it again.
AIRNOW observations can be retrieved from AWS in addition to HPSS, but this requires changing some default settings.
See ``ush/config_defaults.yaml`` or :numref:`Section %s <GeneralVXParams>` for more details.
Suggested change:
AIRNOW observations can be retrieved from AWS or HPSS, but retrieving from AWS requires changing some default settings.
See ``ush/config_defaults.yaml`` or :numref:`Section %s <GeneralVXParams>` for more details.
@@ -633,7 +633,7 @@ Pre-Existing Directory Parameter
* **"delete":** The preexisting directory is deleted and a new directory (having the same name as the original preexisting directory) is created.
* **"rename":** The preexisting directory is renamed and a new directory (having the same name as the original pre-existing directory) is created. The new name of the preexisting directory consists of its original name and the suffix "_old###", where ``###`` is a 3-digit integer chosen to make the new name unique.
* **"rename":** The preexisting directory is renamed and a new directory (having the same name as the original pre-existing directory) is created. The new name of the preexisting directory consists of its original name and the suffix "_old_YYYYMMDD_HHmmss", where ``YYYYMMDD_HHmmss`` is the full date and time of the rename |
This line seems incomplete.
@@ -12,6 +12,12 @@ Glossary
advection
According to the American Meteorological Society (AMS) definition, `advection <https://glossary.ametsoc.org/wiki/Advection>`_ is "The process of transport of an atmospheric property solely by the mass motion (velocity field) of the atmosphere." In common parlance, advection is movement of atmospheric substances that are carried around by the wind.

AERONET
The "`AErosol RObotic NETwork <https://aeronet.gsfc.nasa.gov/>`_": A worldwide ground-based remote sensing aerosol networks established by NASA and PHOTONS. The SRW verification tasks can use "Level 1.5" (cloud-screened and quality-controlled) aerosol optical depth observations. |
The "`AErosol RObotic NETwork <https://aeronet.gsfc.nasa.gov/>`_": A worldwide ground-based remote sensing aerosol networks established by NASA and PHOTONS. The SRW verification tasks can use "Level 1.5" (cloud-screened and quality-controlled) aerosol optical depth observations. | |
The "`AErosol RObotic NETwork <https://aeronet.gsfc.nasa.gov/>`_": A worldwide ground-based remote sensing aerosol network established by NASA and PHOTONS. The SRW verification tasks can use "Level 1.5" (cloud-screened and quality-controlled) aerosol optical depth observations. |
The "`AErosol RObotic NETwork <https://aeronet.gsfc.nasa.gov/>`_": A worldwide ground-based remote sensing aerosol networks established by NASA and PHOTONS. The SRW verification tasks can use "Level 1.5" (cloud-screened and quality-controlled) aerosol optical depth observations. | ||
|
||
AIRNOW | ||
A North American ground-level air quality measurement network. The SRW verification tasks can use PM2.5 and PM10 observations. More information available at https://www.airnow.gov/ |
Should this be a hyperlink?
@@ -159,7 +165,8 @@ Glossary
The `Modern-Era Retrospective analysis for Research and Applications, Version 2 <https://gmao.gsfc.nasa.gov/reanalysis/MERRA-2/>`__ provides satellite observation data back to 1980. According to NASA, "It was introduced to replace the original MERRA dataset because of the advances made in the assimilation system that enable assimilation of modern hyperspectral radiance and microwave observations, along with GPS-Radio Occultation datasets. It also uses NASA's ozone profile observations that began in late 2004. Additional advances in both the GEOS model and the GSI assimilation system are included in MERRA-2. Spatial resolution remains about the same (about 50 km in the latitudinal direction) as in MERRA."

MET
The `Model Evaluation Tools <https://dtcenter.org/community-code/model-evaluation-tools-met>`__ is a highly-configurable, state-of-the-art suite of verification tools developed at the :term:`DTC`.
METplus
The `Model Evaluation Tools <https://dtcenter.org/community-code/model-evaluation-tools-met>`__ is a highly-configurable, state-of-the-art suite of verification tools developed at the :term:`DTC`. `METplus <https://dtcenter.org/community-code/metplus>`_ is a suite of python wrappers providing low-level automation of the MET tools. |
Suggested change:
The `Model Evaluation Tools <https://dtcenter.org/community-code/model-evaluation-tools-met>`__ is a highly configurable, state-of-the-art suite of verification tools developed at the :term:`DTC`. `METplus <https://dtcenter.org/community-code/metplus>`_ is a suite of Python wrappers providing low-level automation of the MET tools.
#
'metplus_tool_name': '${metplus_tool_name}'
'MetplusToolName': '${MetplusToolName}'
'METPLUS_TOOL_NAME': '${METPLUS_TOOL_NAME}' |
Why are all three of these needed? Seems like the same information repeated a lot.
# with 0 quiet and 5 loudest.
# Logging verbosity level used by METplus verification tools. 0 to 9,
# with 0 having the fewest log messages and 9 having the most. Levels 5
# and above can result in very large log files and slower tool execution.. |
Suggested change:
# and above can result in very large log files and slower tool execution.
# TODO: Reference all these variables in their respective
# dictionaries, instead.
import_vars(dictionary=flatten_dict(expt_config))
export_vars(source_dict=flatten_dict(expt_config)) |
YESSSS!
# TODO: Reference all these variables in their respective
# dictionaries, instead.
import_vars(dictionary=flatten_dict(expt_config))
export_vars(source_dict=flatten_dict(expt_config)) |
I made some similar clean up changes in my PR. I find that it's safer to limit the sections that you export here and you can avoid flattening for this section for the most part with:
export_vars(source_dict=expt_config["global"])
TEST_NOHRSC_OBS_DIR: '{{ platform.WE2E_TEST_DATA }}/obs_data/nohrsc/proc'
TEST_AERONET_OBS_DIR: '{{ platform.WE2E_TEST_DATA }}/obs_data/aeronet'
TEST_AIRNOW_OBS_DIR: '{{ platform.WE2E_TEST_DATA }}/obs_data/airnow'
DOMAIN_PREGEN_BASEDIR: '{{ platform.WE2E_TEST_DATA }}/FV3LAM_pregen' |
Nice.
DESCRIPTION OF CHANGES:
This PR adds verification of smoke and dust observations to the SRW verification workflow. These observations come from new data sources: AERONET (aerosol optical depth) and AIRNOW (particulate matter). This necessitated adding some new tasks to the verification section of the workflow, and modifying some existing tasks. A new test, MET_verification_smoke_only_vx, has been added to test out these new capabilities. Major updates include:

New observation types
Two new sets of observations (AERONET and AIRNOW) are now included for ingestion by verification tasks, and all the proper logic has been included for retrieving these obs from HPSS if necessary. In addition, a new capability allows for retrieval of AIRNOW obs from AWS over the internet without needing HPSS access; this can theoretically be extended to other ob types as well, but the proper logic will need to be included in parm/data_locations.yml. By default, the new observation types also report all matched pairs in the output stat files.
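As a rough illustration of the AWS path (this is not the code in this PR, and the URL layout and file name below are placeholders rather than the real AIRNOW bucket structure), retrieval without HPSS amounts to an anonymous HTTPS download into an observation directory that is created if missing:

import os
import urllib.request

def fetch_airnow_hourly(base_url, yyyymmddhh, dest_dir):
    """Download one hourly AIRNOW ASCII obs file if it is not already on disk (illustrative)."""
    fname = f"HourlyAQObs_{yyyymmddhh}.dat"       # placeholder file name
    url = f"{base_url}/{yyyymmddhh[:8]}/{fname}"  # placeholder URL layout
    os.makedirs(dest_dir, exist_ok=True)          # create the obs directory if it does not exist
    dest = os.path.join(dest_dir, fname)
    if not os.path.isfile(dest):
        urllib.request.urlretrieve(url, dest)
    return dest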
New MET tool: ASCII2NC
This is a new MET tool for SRW, used for converting the ASCII-based AERONET and AIRNOW obs to NetCDF that can be processed by later MET tasks. This is a new task with a new J-Job and exscript, as well as a new METplus conf file template.
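For reference, converting an ASCII obs file with MET's ascii2nc looks roughly like the call below; the input/output file names are placeholders, and "-format aeronetv3" assumes the AERONET Version 3 reader (the actual arguments are set in the new METplus conf template, not hard-coded like this).

import subprocess

# Placeholder paths; run the MET ascii2nc tool to convert ASCII obs to NetCDF.
subprocess.run(
    ["ascii2nc", "aeronet_obs.lev15", "aeronet_obs.nc", "-format", "aeronetv3"],
    check=True,
)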
Generalizing some tasks and metatasks
Some metatasks were previously hard-coded to certain observation types or other variables that needed to become more generic: metatask_PointStat_SFC_UPA_all_mems now has an outer loop of metatasks over the observation type, with an inner loop for each ensemble member. This was needed in order to accommodate the new observation types for the PointStat tool.
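Conceptually, the nesting expands the single member loop into an obtype-by-member product. A toy illustration of the task instances that result (the obtype list, member count, and task-name pattern are made up; this is not the Rocoto/Jinja machinery):

from itertools import product

obtypes = ["NDAS", "AERONET", "AIRNOW"]            # illustrative obtype list
members = [f"mem{m:03d}" for m in range(1, 3)]     # illustrative two-member ensemble

# One PointStat task instance per (observation type, ensemble member) pair.
tasks = [f"run_MET_PointStat_vx_{ob}_{mem}" for ob, mem in product(obtypes, members)]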
MET and METplus upgrade
These new verification capabilities necessitated an update to a newer MET version, and bugs in 11.1.1 required updating further to MET 12.0.1 and METplus 6.0.0. These have been installed in all the usual places, thanks @RatkoVasic-NOAA!
Additional updates
In addition, several minor updates are included:
- run_WE2E_tests.py: when parsing an XML with broken/unsatisfied dependencies, it will now properly print an informational message if jobs aren't being submitted properly instead of silently hanging
- NUM_MISSING_OBS_FILES_MAX changed from 2 to 0; we really shouldn't have missing files for any of our tests, and users can bump this up if they need to
- ln_vrfy replaced with create_symlink_to_file, and the latter updated with wildcard functionality
- ush/generate_FV3LAM_wflow.py: separated out the setting of namelist variables into its own function
- ush/retrieve_data.py now creates directories if they do not exist
- dict_find() moved from setup.py to ush/python_utils/misc.py for more general use (a minimal sketch follows this list)
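A minimal sketch of what a general-purpose dict_find can look like: recursively searching a nested dictionary for a key. This is illustrative only, not necessarily the exact function now living in ush/python_utils/misc.py.

def dict_find(user_dict, substring):
    """Return True if any key in the (possibly nested) dict contains substring (illustrative)."""
    if not isinstance(user_dict, dict):
        return False
    for key, value in user_dict.items():
        if substring in key:
            return True
        # Recurse into nested dictionaries.
        if isinstance(value, dict) and dict_find(value, substring):
            return True
    return False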
Type of change
- REMOVE_RAW_OBS_* variables for different observation types are consolidated into a single REMOVE_RAW_OBS_DIRS variable

TESTS CONDUCTED:
DEPENDENCIES:
None.
DOCUMENTATION:
Added documentation to the Users Guide for new options where appropriate.
ISSUE:
None
CHECKLIST
CONTRIBUTORS (optional):
Thanks to @RatkoVasic-NOAA, @ulmononian, @christinaholtNOAA, and @gsketefian for their help and contributions