-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Workflow for ZPPY- PCMDI for E3SM diagnostics #624
Conversation
The modification attemts to interpolate the 3D model level data to standard pressure level specified by vrt_remap_plev19.nc. This is needed by e3sm_to_cmip to convert these variables into cmip type
Develop a script pcmdi_diags.py along with pcmdi_diags.bash to run the pcmdi diagnostics using the E3SM model simulations
Thanks @zhangshixuan1987 for working on this! Let me know when it's ready for me to review. On my review, I can:
|
@forsyth2 : Hi Ryan, based on what you mentioned above, I would like to ask you to review this commit first. This pull request have already included the key part for the workflow:
I think that it would be better for you to provide some high-level comments on the overall structure and the coding style which should be first fulfilled to meet the requirment by Zppy. Thank you! With best wishes, Shixuan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangshixuan1987 Thanks again for working on this.
Here is my review and my summary of the pull request template:
Issue resolution
- Closes Add PMP as task #478
Select one: This pull request is...
- a bug fix: increment the patch version
- a small improvement: increment the minor version
- This adds a new feature: the PMCDI Metrics task
- an incompatible (non-backwards compatible) API change: increment the major version
1. Does this do what we want it to do?
Objectives:
- Add PCMDI metrics as a new task in zppy
Required:
- Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
- The objective I gave above is at the highest-level possible. It might be nice to give a little more detail on common use cases for example.
- Testing: I have added or modified at least one "min-case" configuration file to test this change. Every objective above is represented in at least one cfg.
- Example
min_case
cfg
files can be found in https://github.com/E3SM-Project/zppy/tree/main/tests/integration (template_min_case...
files). These arecfg
files meant to test a specific piece ofzppy
-- i.e., they form a minimal testing case. These are necessary because trying to test all ofzppy
quickly leads to a combinatorial explosion in the parameter space. For a large PR like this adding a whole new feature, we should ideally have multiplemin_case
cfg
files to demo a few of the common use cases.
- Example
- Testing: I have considered likely and/or severe edge cases and have included them in testing.
- It's also helpful to include these
min_case
cfg
files for notable edge cases of user input that could break the code.
- It's also helpful to include these
If applicable:
- Testing: this pull request introduces an important feature or bug fix that we must test often. I have updated the weekly-test configuration files, not just a "min-case" one.
- In addition to the
min_case
cfg
files, since we're adding an entirely new task here, we'll want to add that to the 3 tests that run weekly onmain
:bundles
,comprehensive_v2
,comprehensive_v3
.
- In addition to the
- Testing: this pull request adds at least one new possible parameter to the cfg. I have tested using this parameter with and without any other parameter that may interact with it.
- This PR adds quite a few parameters, hence the importance of adding a few
min_case
cfg
files.
- This PR adds quite a few parameters, hence the importance of adding a few
2. Are the implementation details accurate & efficient?
Required:
- Logic: I have visually inspected the entire pull request myself.
- An initial look, but to really dive into this, I want to match the code with some output (e.g., output of demo
cfg
files).
- An initial look, but to really dive into this, I want to match the code with some output (e.g., output of demo
- Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.
- I have attempted an initial look at the entire pull request.
If applicable:
- Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in
zppy/conda
, not just animport
statement.- It does look like the
yml
files will need to be updated.
- It does look like the
3. Is this well documented?
Required:
- Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.
- The great bulk of zppy's documentation is really directed to the parameter usage (https://docs.e3sm.org/zppy/_build/html/main/parameters.html), so we'll just want to make sure
default.ini
has accurate & helpful comments in it.
- The great bulk of zppy's documentation is really directed to the parameter usage (https://docs.e3sm.org/zppy/_build/html/main/parameters.html), so we'll just want to make sure
4. Is this code clean?
Required:
- Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
- This may be a big ask.
- Pre-commit checks: All the pre-commits checks have passed.
- It looks like the pre-commit hooks are currently failing on GitHub.
If applicable:
- Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.
- Does this produce plots? A search for
plot
doesn't produce any actual plotting lines as far as I can tell. I know we had some tradeoff discussions on introducing more plotting functionality to zppy directly (adding scope to zppy thus increasing testing difficulty) vs. making a plugin "package" (more challenging to implement in the first place).
- Does this produce plots? A search for
@@ -0,0 +1,34 @@ | |||
name: zppy_dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need a new file here. https://github.com/E3SM-Project/zppy/blob/main/conda/dev.yml (dev environment) and https://github.com/E3SM-Project/zppy/blob/main/conda/meta.yaml (production environment) can be modified
@@ -0,0 +1,158 @@ | |||
import os |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file looks more or less adjusted from another .py
file, so from visual inspection alone this seems ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is adapted from the e3sm_diag.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangshixuan1987 As part of #628, I did modularizing refactoring of zppy
that greatly improves code readability/understandability + testing quality. We'll probably want to update this file to more closely match the other post-#628 .py
files. I'm happy to help with that.
zppy/templates/default.ini
Outdated
|
||
[tc_analysis] | ||
# NOTE: always overrides value in [default] | ||
input_files = string(default="eam.h2") | ||
# The scratch directory | ||
scratch = string(default="") | ||
|
||
[pcmdi_diags] | ||
# See url<need to work on document later> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What document is this referring to? Is this part of the PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from the e3sm_diags section, so I left these comments there to remind me that I should put a link here after I worked out the documentation.
zppy/templates/default.ini
Outdated
reference_alias = string(default="pcmdi_diags/reference_alias.json") | ||
# File of cmip6 metadata | ||
cmip_metadata = string(default="e3sm_to_cmip/default_metadata.json") | ||
# File of fixed pressulre level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pressure
zppy/templates/default.ini
Outdated
# File of cmip6 metadata | ||
cmip_metadata = string(default="e3sm_to_cmip/default_metadata.json") | ||
# File of fixed pressulre level | ||
#cmip_plevdata = string(default="e3sm_to_cmip/vrt_remap_plev19.nc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this commented out because it also exists in ts
? Unless you specify it at the default
level, you'd need it here too (if you actually use this parameter for the PCMDI metrics task).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this becomes as redundent after I moved this part to "ts.bash". I will follow you comments on this page and refine the code
zppy/templates/default.ini
Outdated
# End year (i.e., the last year to use) for the reference data | ||
ref_end_yr = string(default="") | ||
# Final year (i.e., the last available year) for the reference data | ||
# Required for run_type="model_vs_model" "enso_diags"/"streamflow"/"tc_analysis" runs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks copied from the e3sm_diags
task. Can we cut down on the number of parameters needed here, or does PCMDI metrics use them all?
climatology_process_method = string(default="nco") | ||
# Regridding by pcmdi (default is to regrid data to 2.5x2.5 grid for diagnostic metrics) | ||
# Required for mean climate | ||
# OPTIONS: '2.5x2.5' or an actual cdms2 grid object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
I don't think you can pass an actual object in through the cfg. Everything gets passed in as a string
-
My understanding is we should have absolutely no more CDAT dependencies anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@forsyth2 : The command here passed a string rather than a command object to python. I think that we should have some online discussions for this.
@@ -0,0 +1,1612 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is pretty long and difficult to comprehend as someone who is unfamiliar with PCMDI. I think what would help me is to have some cfg
s that demonstrate these different pieces so I can more easily put the backend and the generated visual together. See my notes on making min_case
cfgs above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is something that I would like to discuss with you and Chenzhu online.
--realm \ | ||
lnd \ | ||
{% endif -%} | ||
{% if input_files.split(".")[0] == 'cam' or input_files.split(".")[0] == 'eam' -%} | ||
--var-list \ | ||
'pr, tas, rsds, rlds, rsus' \ | ||
'ua, va, ta, wa, zg, hur, pr, prc, prsn, ts, tas, prw, psl, sfcWind, tasmax, tasmin, tauu, tauv, rtmt, rsdt, rsds, rsdscs, rlds, rldscs, rsus, rsuscs, rsut, rsutcs, rlus, rlut, rlutcs, clivi, clwvi, clt, evspsbl, hfls, hfss, huss' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we had some concern about explicitly stating variables in the .bash
files rather than using the cfg
to pass along variables. I see here however that we already had some hard-coded variables...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is under-determined. I was trying not to break the original workflow for ts.bash, and it is better to discuss with you and Jill to come up a good solution or code changes to make the workflow more flexible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@forsyth2 We talked about to have e3sm_to_cmip
as a individual task, rather than embed it in ts
, which will allow easier error handling, and expose more options to users to control. At this point, the hard-codded variables are all variables that e3sm_to_cmip
support, if input for any variables are not available then they get skipped
echo | ||
|
||
# Create top-level directory | ||
web_dir=${www}/${case}/pcmdi_diags #/{{ sub }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had discussed, for production rather than prototyping, that we'd only put images on the www
path, and keep everything else on the output
path. This looks like it's copying everything over to the web server, is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@forsyth2 : Yes, I have currently copied all output from PCMDI to the www path as I need to talk with you and Chengzhu to figure out the right sets of data that should be put in the www path. Currently, I followed the E3SM_diags, as some of the diagnostic data from E3SM_diags will also be copied to www directory.
@zhangshixuan1987 and @forsyth2 I will begin reviewing this PR later next week. |
Hi @forsyth2 and @chengzhuzhang : I am not sure if it is properate, but I want to share the sample .cfg files and the run output examples I created for the PCMDI here:
In addition, I also tested the workflow with AMIP and piControl simulations, which can work reasonably as well. I need to further work on following things:
|
srun -N 1 e3sm_to_cmip \ | ||
--output-path \ | ||
${dest_cmip}/${tmp_dir} \ | ||
{% if input_files.split(".")[0] == 'clm2' or input_files.split(".")[0] == 'elm' -%} | ||
--var-list \ | ||
'mrsos, mrso, mrfso, mrros, mrro, prveg, evspsblveg, evspsblsoi, tran, tsl, lai, cLitter, cProduct, cSoilFast, cSoilMedium, cSoilSlow, fFire, fHarvest, cVeg, nbp, gpp, ra, rh' \ | ||
'snd, mrsos, mrso, mrfso, mrros, mrro, prveg, evspsblveg, evspsblsoi, tran, tsl, lai, cLitter, cProduct, cSoilFast, cSoilMedium, cSoilSlow, fFire, fHarvest, cVeg, nbp, gpp, ra, rh' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snd
doesn't seem like a variable e3sm_to_cmip
support, maybe a typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
snd is the snow depths defined in the CMIP6 model, but I am not sure if this variable is available at e3sm_to_cmip. By the way, this is a land model variable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhangshixuan1987 I just had chance to inspect the PR. It does seem like a call will work effectively to discussion design choice. I will schedule some time later next week. At the mean time, could you share a configuration file that I can try running by myself before the discussion.
@chengzhuzhang : Hi Jill, if you check my comments #624 (comment). You will find the e3sm diag cfg file that I shared for you and Ryan to test. please take a look and let me know if you have any questions. |
When reviewing the results, in the parallel coordinate figure: example. The When going over the results, I agree strongly that it would be much easier to navigate the results with a viewer page! @mingxuanwupnnl |
The I think it is due to the constraint in setup: Lines 20 to 21 in 408e9ad
By adding "py" to the extensions list, the installation works. |
Hi @chengzhuzhang : I did not notice this message as you may @ the wrong people. As for your questions here:
|
* Update mypy python version * Update conda python version
* Update pre-commit dependencies * Update pre-commit deps in `conda/dev.yml` * Fix `black` formatting errors * Fix `flake8` programming errors
I accidentally tagged @mingxuanwupnnl, Sorry! @zhangshixuan1987 sounds good, I will try test the new PR for now. |
Issue resolution
Select one: This pull request is...
1. Does this do what we want it to do?
Objectives:
Required:
If applicable:
2. Are the implementation details accurate & efficient?
Required:
If applicable:
zppy/conda
, not just animport
statement.3. Is this well documented?
Required:
4. Is this code clean?
Required:
If applicable: