config.resources specifies 40 threads for enkf.x on Hera #1084
Comments
@RussTreadon-NOAA I have eupd resource updates coming in via PR #1070. The PR is in final review. This PR is mainly for WCOSS2 resources but touches some of the R&D ones as well. Please see these resources:
These ^ resources result in the following xml settings on Hera:
C384C192L127:
The C384C192L127 value seems kinda high still, although it's down from the 100 nodes that were set previously. On Orion the C384C192L127 eupd job runs with fewer nodes (
What are your thoughts on the eupd values coming in via PR #1070?
@KateFriedman-NOAA, thanks for the update. I cannot comment on recommended resource settings on Hera without running test cases at various resolutions. As noted above, I am currently running eupd for
on Hera with
Since PR #1070, in part, addresses the concerns of this issue, I am closing this issue.
Noted, thanks @RussTreadon-NOAA ! There is definitely more refinement/optimization to happen with resources on all machines so I will keep these values for C96C48L127 in mind. We currently group C192, C96, and C48 together with the same resources but given your information we can break them and reduce the values for C96 and C48 in future PRs.
Expected behavior
eupd successfully runs enkf.x on Hera with fewer than 40 nodes.
Current behavior
config.resources includes ${machine} = "HERA" blocks which set the number of threads to 40, nth_eupd=40. As a result the eupd job runs enkf.x with many more nodes than is necessary on Hera.
Machines affected
Hera
To Reproduce
To see this behavior, take an experiment $PSLOT at CASE=C96 and its $PSLOT.xml. Open $PSLOT.xml in an editor and look at the gdaseupd section. You will see that eupd will be run on 40 nodes, 1 task per node, 40 threads per task.
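The node count reported above follows from simple arithmetic on tasks, threads, and cores per node. The sketch below is illustrative only and is not code from config.resources: the function name eupd_layout is hypothetical, the 40 cores per Hera node is an assumption, and the task count of 40 is inferred from "40 nodes, 1 task per node".

```python
import math


def eupd_layout(ntasks, nthreads, cores_per_node):
    """Return (tasks_per_node, nodes) for ntasks MPI tasks with nthreads threads each."""
    if nthreads < 1 or nthreads > cores_per_node:
        raise ValueError("threads per task must be between 1 and cores_per_node")
    tasks_per_node = cores_per_node // nthreads   # whole tasks that fit on one node
    nodes = math.ceil(ntasks / tasks_per_node)    # round up so every task has a slot
    return tasks_per_node, nodes


# Assumed values (not taken from config.resources): 40 cores per node,
# and 40 tasks inferred from "40 nodes, 1 task per node".
CORES_PER_NODE = 40
tasks_per_node, nodes = eupd_layout(ntasks=40, nthreads=40, cores_per_node=CORES_PER_NODE)
print(f"{nodes} nodes, {tasks_per_node} task(s) per node, 40 threads per task")
```

With 40 threads per task, each task fills an entire node under this assumption, so the node count equals the task count.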
Context
C96L127 eupd does not require 40 nodes to run enkf.x on Hera. enkf.x can be run on two nodes at this resolution.
Detailed Description
The eupd section of config.resources contains ${machine} = "HERA" blocks which specify that enkf.x be run with 40 threads, nth_eupd=40. It is not clear why nth_eupd=40 threads are specified for enkf.x on Hera. This results in eupd requesting many more nodes than are necessary to run enkf.x on Hera.
Possible Implementation
We should consider reducing the Hera thread count for eupd to be consistent with other machines. Of course, doing so requires testing at various $CASE on Hera to ensure no adverse impacts.
For my C96L127 parallel on Hera, config.resources has
This results in $PSLOT.xml requesting 2 nodes to run eupd
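To illustrate why a lower thread count reduces the node request, here is a rough sketch that sweeps threads per task for a fixed task count. The values are assumptions for illustration (40 tasks, 40 cores per node) and the helper nodes_for is hypothetical. The thread counts in the loop are not the settings used in the C96L127 parallel described above, and the sketch says nothing about whether enkf.x has enough memory at a given layout, which still needs testing on Hera.

```python
import math

CORES_PER_NODE = 40   # assumption: cores available per Hera node
NTASKS = 40           # assumption: MPI task count, held fixed for the sweep


def nodes_for(nthreads, ntasks=NTASKS, cores_per_node=CORES_PER_NODE):
    """Nodes needed when each task uses nthreads cores (hypothetical helper)."""
    tasks_per_node = cores_per_node // nthreads
    return math.ceil(ntasks / tasks_per_node)


# Illustrative thread counts only; none of these is a recommended setting.
for nthreads in (1, 2, 4, 8, 20, 40):
    print(f"threads per task = {nthreads:2d} -> {nodes_for(nthreads):2d} node(s)")
```

Under these assumptions the node request grows in direct proportion to the thread count, which is why a 40-thread setting dominates the job size.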