[develop] Updated modulefiles for Cheyenne, Hera, Jet, Gaea, Orion #419
Update the modulefile `build_cheyenne_gnu` and use module paths for EPIC-managed hpc-stack, updated miniconda3 with the regional_workflow (running+plotting), and rocoto
Update the modulefile `build_cheyenne_intel` and use module paths for EPIC-managed hpc-stack, updated miniconda3 with the regional_workflow (running+plotting), and rocoto
Update the modulefile `wflow_cheyenne` and use module paths for EPIC-managed miniconda3 with the regional_workflow (running+plotting) and rocoto
Update the modulefile `build_hera_intel` and use module paths for EPIC-managed hpc-stack, updated miniconda3 with the regional_workflow (running+plotting)
Update the modulefile `wflow_hera` and use module paths for EPIC-managed hpc-stack, updated miniconda3 with the regional_workflow (running+plotting)
Update the modulefile `build_jet_intel` and use module paths for EPIC-managed hpc-stack, updated miniconda3 with the regional_workflow (running+plotting)
Update the modulefile `wflow_jet` and use module path for the EPIC-managed and updated miniconda3 with the regional_workflow (running+plotting)
Update the modulefile `build_orion_intel` and use module paths for the EPIC-managed hpc-stack and updated miniconda3 with the regional_workflow (running+plotting)
Update the modulefile `wflow_orion` and use a module path for EPIC-managed miniconda3 with the regional_workflow (running+plotting)
Met and metplus installed as modules in the recent hpc-stack, update installation paths
Met and metplus installed as modules in the recent hpc-stack, update installation paths
Met and metplus installed as modules in the recent hpc-stack, update installation paths
@natalie-perlin - Sorry about that, I meant the paths to the hpc-stacks on Gaea, Orion, and Jet, not the updated hpc-stacks that you have created for this PR:
These three stack locations would need to be updated before we can try and move forward with using them for the SRW.
@MichaelLueken - these stack locations were not built using |
@natalie-perlin Yes, you are correct that only Jong can update the modules in the paths that I noted above and in issue #409. This was just pointing out that we need to sync the hpc-stack used for both SRW and the weather model. It sounds like you might need to add some updated versions for the weather model to use the epic.role stack locations, while Jong would need to add several modules to his personal stack locations before the SRW would be able to use his personal stack locations. Again, the primary goal is to use a single stack for both the SRW and weather model. You have completed the necessary work to use the epic.role version for the SRW, but these locations will also need to work for the weather model, with a PR opened in the weather model repository updating the location of the stack.
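The single-stack goal described above can be checked mechanically. A minimal sketch, assuming both modulefiles declare their stack location with a plain `module use <path>` line; `stack_paths` and `same_stack` are hypothetical helpers, and the paths in the demo are placeholders rather than real stack locations:

```shell
# Print the sorted, de-duplicated path arguments of `module use <path>` lines in file $1.
# Assumes the path is the third whitespace-separated field (no flags such as -a).
stack_paths() {
  awk '$1 == "module" && $2 == "use" { print $3 }' "$1" | sort -u
}

# Succeed only when both modulefiles point at the same set of stack paths.
same_stack() {
  [ "$(stack_paths "$1")" = "$(stack_paths "$2")" ]
}

# Demo with throwaway files; the paths are placeholders, not real stack locations.
srw="$(mktemp)"; wm="$(mktemp)"
printf '%s\n' 'module use /placeholder/epic/hpc-stack/modulefiles/stack' > "$srw"
printf '%s\n' 'module use /placeholder/personal/hpc-stack/modulefiles/stack' > "$wm"
if same_stack "$srw" "$wm"; then
  echo "same stack"
else
  echo "stack locations differ"
fi
rm -f "$srw" "$wm"
```

Comparing the declared paths rather than the loaded modules keeps the check independent of which machine it runs on.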
It was tested on Hera and works fine.
e2e tests not yet passing on Jet.
@natalie-perlin Would it be possible to roll the compiler and mpi versions back to what is currently being used on Cheyenne, Hera, and Jet? While it is nice to see that the SRW app builds and runs the workflow using these updated compilers and mpi versions, none of our WE2E tests check for reproducibility, among other aspects.
As is, the ufs-weather-model regression tests will need to be run to see if these updated compilers and mpi versions affect results. This lies outside the scope of the ufs-srweather-app and is one of the reasons why a similar PR needs to be opened for the ufs-weather-model repository (the other being to ensure that both the SRW app and weather model use the same EPIC-maintained stacks).
- module load gnu/11.2.0
+ module load gnu/12.1.0
Why was the compiler version updated to the latest on Cheyenne? Would it be possible to bring this back down to gnu/11.2.0?
The attempt was to use the latest available compiler. The same libraries for gnu/11.2.0 will be built when Cheyenne comes back from maintenance, if this is preferred.
Thanks @natalie-perlin. Yes, please build the same libraries for gnu/11.2.0 on Cheyenne once it is back up.
modulefiles/build_hera_intel
Outdated
- module load hpc-intel/2022.1.2
- module load hpc-impi/2022.1.2
+ module load hpc-intel/2022.2.0
+ module load hpc-impi/2022.2.0
Why were the compiler and mpi versions updated to the latest on Hera? Would it be possible to bring these back down to intel/2022.1.2 and impi/2022.1.2?
The attempt was to use the latest available compiler. The same libraries for intel/2022.1.2 are being built, and the PR will be updated as soon as possible if intel/2022.1.2 is preferred (tested with the WM).
Thanks @natalie-perlin! Yes, please build the same libraries using intel/2022.1.2 on Hera.
modulefiles/build_jet_intel
Outdated
- module load hpc-intel/2022.1.2
- module load hpc-impi/2022.1.2
+ module load hpc-intel/2022.2.0
+ module load hpc-impi/2022.2.0
Why were the compiler and mpi versions updated to the latest on Jet? Would it be possible to bring these back down to intel/2022.1.2 and impi/2022.1.2?
Sure, this is not a problem. I will prepare it for the earlier version, intel/2022.1.2. The idea was to update to the most recent compiler/mpi combination.
Thanks @natalie-perlin!
The hpc libraries and modules built with the intel/2022.1.2 compiler and impi/2022.1.2 have been prepared for Hera and Jet.
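A quick way to confirm that a build modulefile is pinned to the agreed versions is sketched below. It assumes the modulefile contains unindented `module load hpc-intel/<version>` and `module load hpc-impi/<version>` lines, as in the diff snippets in this thread; `check_pins` is a hypothetical helper, not part of the SRW repository:

```shell
# Succeed only if file $1 loads both hpc-intel and hpc-impi at exactly version $2.
# grep -F treats the version as a literal string; -x requires a whole-line match,
# so indented lines or lines with trailing text are not recognized.
check_pins() {
  grep -Fxq "module load hpc-intel/$2" "$1" &&
  grep -Fxq "module load hpc-impi/$2" "$1"
}

# Demo on a throwaway file that mimics the relevant lines of a build modulefile.
tmp="$(mktemp)"
printf '%s\n' 'module load hpc-intel/2022.1.2' 'module load hpc-impi/2022.1.2' > "$tmp"
if check_pins "$tmp" 2022.1.2; then
  echo "pinned to 2022.1.2"
else
  echo "version mismatch"
fi
rm -f "$tmp"
```

Against a real checkout, the first argument would be a file such as `modulefiles/build_hera_intel`.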
reversing back to use previous compiler and impi 2022.1.2 from 2022.2.0
reversing to a verified compiler/impi version 2022.1.2 from 2022.2.0
Fixed a typo
e2e tests verified on Jet. Thanks.
Modulefiles in SRW were updated to use EPIC-managed miniconda3/4.12.0 for the workflow and running tasks. The |
@natalie-perlin With the merge of PR #549 (at d036849), the ufs-srweather-app is now using your EPIC-maintained HPC-Stack locations on all machines. I am now closing this PR. If you feel that the PR should be reopened, please let me know.
DESCRIPTION OF CHANGES:
Updated modulefiles for the following HPC systems: Cheyenne, Gaea, Hera, Orion, and Jet, in particular for the SRW 2.1 release. The changes include new paths for the HPC-stack and miniconda3 installations in EPIC-managed common space. The Met and Metplus verification packages were installed as part of the hpc-stack. Rocoto has also been installed on Cheyenne in the common EPIC-managed space. Met and Metplus are not loaded explicitly as modules; instead, their installation paths are specified in the ./ush/machine/<platform>.yaml files. The packages needed for the plotting routines are included in the regional_workflow conda virtual environment.
The following types of modulefiles were updated:
./modulefiles/build_<platform>_<compiler>
./modulefiles/wflow_<platform>
./modulefiles/tasks/<platform>/miniconda_regional_workflow
./modulefiles/tasks/<platform>/make_grid.local, make_ics.local, make_lbcs.local, make_orog.local, get_extrn_lbcs.local, get_extrn_ics.local, run_fcst.local, run_vx.local
./ush/machine/<platform>.yaml
where <platform> is Cheyenne, Jet, Hera, Orion, or Gaea,
and <compiler> is either intel or gnu; Cheyenne has both compilers.
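To make the naming pattern concrete, the sketch below expands it for one platform/compiler pair. `list_modulefiles` is a hypothetical helper written for illustration; it assumes lowercase platform names and the `build_<platform>_<compiler>` / `wflow_<platform>` convention seen in the modulefile names in this PR (for example `build_cheyenne_gnu` and `wflow_hera`):

```shell
# Print the per-platform files touched by this PR for platform $1 and compiler $2.
# Only the fixed-name entries are listed; the per-task *.local files are omitted.
list_modulefiles() {
  printf '%s\n' \
    "./modulefiles/build_${1}_${2}" \
    "./modulefiles/wflow_${1}" \
    "./modulefiles/tasks/${1}/miniconda_regional_workflow" \
    "./ush/machine/${1}.yaml"
}

# Cheyenne has both compilers, so it is listed twice.
list_modulefiles cheyenne gnu
list_modulefiles cheyenne intel
```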
A list of the hpc-stack builds updated for the current PR (plus additional compiler-mpi combinations for Hera) can be found in the open UFS-WM issue #1465.
Type of change
TESTS CONDUCTED:
DEPENDENCIES:
Documentation needs to be updated, including the chapter on making plots.
ISSUE:
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@BruceKropp-Raytheon (Jenkins jobs building the SRW and running the tests)
@EdwardSnyder-NOAA (tests on Gaea and Hera, met/metplus verification)