-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EAMxx: add multi-instance and nbfb testing #6984
Conversation
|
To trigger: generate baselines:
compare to above-generated baselines:
Sample of test output viewable on the web (but not sure if it is meaningful; seems like fields are constants? 🤷 tbd more validation): https://portal.nersc.gov/project/e3sm/mahf708/MVKxx_Ly1.ne4pg2_ne4pg2.F2010-SCREAMv1.pm-cpu_intel.eamxx-mvkpert.C.20250210_142343_isu2io.evv/ |
Adding @jgfouca since he is Mr. CIME, so he may have thoughts on the SystemTests part. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool! I think this is our first E3SM-only SystemTest. We should think about moving the other tests that are very E3SM-specific here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Jim should review this, but I have one thought.
I see lots of overlap with MVK. So two sub-thoughts:
- Let's prune what's the same, and only keep the differences
- Wild idea: would it make sense to not add MVKxx, and just hack our buildnml scripts, so that we read what's in user_nl_eamxx, and perform the corresponding atmchanges? It would be taylored for this test type. We could put a check "if it's MVK test, then parse user_nl_eamxx and do atmchanges"... EDIT: I now see that another big diff is that MVKxx calls ksxx, while MVK relies on this evv external library. If this is the easier thing to do, then so be it. But maybe @jgfouca can tell whether we can cast our problem into something that existing MVK can handle?
EDIT 2: I don't want to burden this PR with more work, adding a feature is already a lot. But looking ahead, maybe we could make the infrastructure more general. E.g., I expect that maybe other new components (like omega) may want a MVK-like test, but may fail to fit in the CIME's MVK approach. Maybe we should create an e3sm-owned MVK approach that delegates to the component's buildnml script the job of modifying the different instances inputs, as well as uses some module in the component's path to collect the output files. That is, remove assumptions in the main scripts about how inputs/outputs files are called/structured. Let the components handle that bit of info.
I will explain the PR better before merging. The logic in this PR is copied in super large part of two earlier efforts (one is the vanilla default MVK that's designed for EAM and the other is a specialized case for MPASO, called MVKO). The goal is to hand this off to the infra team and move all of this for better long-term maintainability to this repo (https://github.com/LIVVkit/evv4esm) where the stuff I am overriding with these python files is there. So the goal is to specialize those files there to first-class support EAMxx rather than hack something here. I got access to that repo on Friday, so I (or @mkstratos) could do that once we are happy with the results for EAMxx. There's on minor issue I'd like to address before declaring this ready for review. I need to figure out how to sidestep the lack of conventional short-term archiving capability in EAMxx (a small part of this test relies on it). After that, I will need to run this test for 12+ months twice to see if it behaves as expected. If all goes well, I will clean up this PR into ~3 commits and rebase, and invite Jim, Luca, Rob, Mike, and Peter for a review. (Luca, I'd like you to help me think about the MCT coupling logic and it how it interacts with this multi-instance capability; the one-liner edit in MCT coupling layer does the job for this PR purposes, but maybe it is confusing and annoying to do it like that? If you want, we should start thinking about this in a more organizational aspect --- the multi-instance capability rely on touching the last file of record (in eam world, that is atm_in; in eamxx, it is scream_input.yaml). For eam, the last file of record gets handled automatically the multi-instance mechanics deep down, but in eamxx, it obviously doesn't and need some manual intervention) |
Yeah, I'm perfectly fine doing things in-house to get the capability in. But I would really like a nice conversation in the infra team to avoid the proliferation of tests/scripts that are very similar to each other, and promote instead the asbtraction of key concepts, so that components are left in charge only of what specific to them. As for the shortcoming of mention (the archiving), I am not familiar with that feature. I don't know what I have to do locally, to test what happens, as I have never used it (I am not a model user; not enough, at least). But if you want, we can check together and discuss what needs to change (in eamxx and/or CIME) in order to fix things. |
Yeah, I think MVK is also E3SM-only for all intents. We could move both and encapsulate some of the duplicated code in this directory. |
7882633
to
cfd3659
Compare
@rljacob @bartgol @jgfouca, this is now ready for integration. PTAL. Note that @PeterCaldwell and I are planning to take this for a spin in terms of validation. For now, this uses a small subset of variables (only four, see components/eamxx/cime_config/SystemTests/ksxx_vars.json) just to get things rolling and it hardcodes the number of ensemble members to 2 (2!) only, also to get things rolling. We intend to test and see what sort of settings are scientifically valid for climate reproducibility testing for EAMxx. More to come, but this is the first technical hurdle to just get things going. See #6984 (comment) for a sample run with output. And refer to comments above about planned activities in terms of moving code around. @mkstratos: This is ready if you'd like to take a rigorous look. Note that we will easily move stuff to evv once we are ready, most of the eamxx-specific mods are relatively (so lots of duplicated code for now). Because evv gets released as part of e3sm-unified/cime-latest conda envs, I'd like to keep all this duplication here until we figure out a better alternative. Anyway, I hope this will be my role in getting the software/technical end of this done, and I can go back to the validation aspects now :) |
Might consider changes to the shell_commands. Not sure the goal, but probably don't want 72 vertical levels. |
@rljacob: I'd like this merged soon so that we can evaluate it in the near future. I messaged @mkstratos to review it, and he said he would. There'll be another iteration on the settings (variable names, number of instances, etc.) once the test is scientifically validated. I'll be mostly out for the week (still ill), but hopefully I don't need to be around for it to be merged. Okay with you if @bartgol or @jgfouca integrate it or do you prefer we wait until I am back online fully? |
I'm ok with merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the duplicated EVV code, which we can (re)move later, this looks good to me.
return monthly_avgs | ||
|
||
|
||
def load_mpas_climatology_ensemble(files, field_name, mask_value=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method isn't used below, but this file will go away once we update EVV
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, will clean up once we reconcile the eam/eamxx files in evv
table_data = pd.DataFrame(details).T | ||
uc_rejections = (table_data["K-S test p-val"] < args.alpha).sum() | ||
_hdrs = [ | ||
"h", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe pedantic, but the "h0" here originally refers to null hypothesis acceptance/rejection rather than h0 from the hist files (only changes the text of output tables on the generated webpage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, I did a blind h0->h because of the archive stuff. Will also fix once we reconcile.
@bartgol and/or @jgfouca, could one of you integrate this now? I think it is ready; there will be clean-up activities for two aspects soon:
|
@mahf708 , yes, I can start integrating. |
Ok. Just quickly re-running failed v1 checks, to be sure. Should just be a matter of old baselines. |
Adding support for multi-instance capability in EAMxx (a one-liner in mct coupling layer) and adding a customized/specialized MVK tests for climate/nbfb reproducibility in EAMxx. More details to come. [BFB]
Merged to next. |
Adding support for multi-instance capability in EAMxx (a one-liner in mct coupling layer) and adding a customized/specialized MVK tests for climate/nbfb reproducibility in EAMxx. More details to come.
[BFB]