-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert verification to solely use METplus configuration files (feature/ens_design_RRFS) #695
Convert verification to solely use METplus configuration files (feature/ens_design_RRFS) #695
Conversation
…ua files to hold dirs.
…from user-provided MET config files.
…everal GridStat conf files.
…at METplus conf files.
@@ -146,11 +146,11 @@ export LOG_SUFFIX | |||
# | |||
if [ ${VAR} == "APCP" ]; then | |||
acc="${ACCUM}h" # for stats output prefix in EnsembleStatConfig | |||
${METPLUS_PATH}/ush/master_metplus.py \ | |||
${METPLUS_PATH}/ush/run_metplus.py \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michelleharrold I know we discussed error checking in a meeting before, but I wanted to understand the details (which I've forgotten) of why we don't want to use error-checking here. By error-checking, I mean adding to this call a print_err_msg_exit
statement if it fails, e.g.
${METPLUS_PATH}/ush/run_metplus.py \
-c ${METPLUS_CONF}/common.conf \
-c ${METPLUS_CONF}/EnsembleStat_${VAR}${acc}.conf || print_err_msg_exit "\
Call to "run_metplus.py" failed with exit code $?.
"
I and other reviewers are now encouraging use of error-checking so the ex-script doesn't exit with a false success status. But for some of the verification ex-scripts, I remember you and/or @jwolff-ncar said we should not have the script fail even if the return code from METplus is nonzero. Can you explain again the reason for that? Also, are some of the other vx sripts well-behaved in the sense that we can do error-checking with them? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gsketefian - good question. I agree not having the error checking for the verification can cause some frustration when a job fails but is called successful by the workflow. @jwolff-ncar and I decided not to implement error checking for the vx jobs because we run all the forecast lead times through the verification in one job, and there are circumstances where some observations may not present (which is okay). This would cause the verification job to error out prematurely if a forecast/obs combo was not present. I am definitely open to brainstorming ways around this because having more robust error checking would certainly make testing and monitoring easier.
@michelleharrold Do you know if these changes will work on platforms besides Hera? I guess it depends on whether the version of METplus on Hera is a recent upgrade, in which case versions on other platforms will also need an upgrade. I think verification will be a supported capability of the SRW App for the next release, but @mark-a-potts may know better. If so, we might want to make the version of METplus on Hera part of the common SRW App environment (defined in |
@gsketefian - @briannen was going to also test this on Cheyenne (I will check on the status of that testing). This PR was tested with METplus v4.0.0 and MET v10.0.0; both are the current official supported releases and are available as modules on Hera and Cheyenne (as well as other platforms). |
@@ -1,42 +0,0 @@ | |||
///////////////////////////////////////////////////////////////////////////////// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michelleharrold, I noticed that Mike's grid diag PR doesn't have a GridDiagConfig, but instead GridDiag_2vars/REFC/RETOP instead. Do we still want to get rid of this file? Just want to confirm!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeffBeck-NOAA - ah yes, good question. I confirmed with @jwolff-ncar (and also commented on @mkavulich PR #630) that we can delete this file and that Mike will handle the necessary changes to Grid-Diag. We do not use Grid-Diag in the standard verification of the workflow, so nothing will break as a result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @michelleharrold, that sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this all looks good. The only remaining question is whether we should use a QC value of 3 or 9. For now, I think it should be merged with 9 (matches old behavior) and we can update after we finish our sensitivity testing in a new PR as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me and testing was thorough!
DESCRIPTION OF CHANGES:
Recently, changes have been made to the METplus framework to remove the need for user-provided MET configuration files (can instead leverage the MET configuration files within the METplus code base); now, all MET/METplus-related configuration options can be provided via METplus conf files. This PR removes the need for MET config files under: regional_workflow/ush/templates/parm/met by shifting to only using METplus conf files under: regional_workflow/ush/templates/parm/metplus
When transitioning to only using METplus conf files in the workflow verification, the METplus conf files underwent a major overhaul, using the example use cases in the METplus v4.0.0 release as templates.
In addition, several minor changes were made to the METplus conf files to address minor syntax errors and to address consistency issues with some values (e.g., ci_alpha and seed values) across different conf files.
Changes were also made to the scripts/exregional_run*vx.sh to transition calling master_metplus.py to run_metplus.py.
As a note, the regional_workflow/ush/templates/parm/met/GridDiagConfig file has been removed (this file is not used within the XML workflow execution). There is currently an open PR (#630) that @mkavulich is working on that can address any necessary changes to the METplus GridDiag conf file as a result of this PR.
TESTS CONDUCTED:
This PR was tested with METplus v4.0.0 and MET v10.0.0 on Hera. A baseline run was conducted with the FV3_GFS_v15p2 physics suite over the RRFS_CONUS_25km domain for a 48-h forecast, initialized on 20190615 at 00 UTC using the original MET/METplus configuration files. After changes were made to remove MET config files under: regional_workflow/ush/templates/parm/met, a second run was completed and compared against the baseline run to ensure no changes to the verification were made due to the reconfiguration of exclusively using METplus conf files. Note: For the final changes to the METplus conf files included in this PR, there will be some differences when compared against baseline runs due to changes to values like ci_alpha and due to fixing the value of the random seed for observation error calculations, which used to vary from run to run, producing slightly different results every time.
In addition, the WE2E tests were successfully completed for the MET_verification and MET_ensemble_verification tests.
DOCUMENTATION:
Documentation for verification is currently in progress and will be added in the short-term.
ISSUE (optional):
Addresses issue mentioned in #694.
CONTRIBUTORS (optional):
@jwolff-ncar, @willmayfield, and @evankalina contributed to and tested this PR.