Document MPI/OpenMP/nodes distribution done in cice.batch.csh #650

Closed
phil-blain opened this issue Oct 28, 2021 · 5 comments · Fixed by #750

Comments

@phil-blain
Member

I often have to re-read the computations at the top of cice.batch.csh and do a numeric example to understand what exactly is being done here:

set ntasks = ${ICE_NTASKS}
set nthrds = ${ICE_NTHRDS}
set maxtpn = ${ICE_MACHINE_TPNODE}
set acct = ${ICE_ACCOUNT}
@ ncores = ${ntasks} * ${nthrds}
@ taskpernode = ${maxtpn} / $nthrds
if (${taskpernode} == 0) set taskpernode = 1
@ nnodes = ${ntasks} / ${taskpernode}
if (${nnodes} * ${taskpernode} < ${ntasks}) @ nnodes = $nnodes + 1
set taskpernodelimit = ${taskpernode}
if (${taskpernodelimit} > ${ntasks}) set taskpernodelimit = ${ntasks}
@ corespernode = ${taskpernodelimit} * ${nthrds}

It would be nice if those computations could be summarized (and maybe written in a more mathematical form) somewhere in our documentation.
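For illustration, the snippet above appears to reduce to three relations: taskpernode = max(1, floor(maxtpn / nthrds)), nnodes = ceil(ntasks / taskpernode), and corespernode = min(taskpernode, ntasks) * nthrds. The Python sketch below mirrors that integer arithmetic so it can be checked with a numeric example. It is not part of CICE: the function name `distribute` and the sample values are hypothetical, and it assumes all inputs are positive integers.

```python
def distribute(ntasks, nthrds, maxtpn):
    """Mirror the integer arithmetic quoted from cice.batch.csh.

    ntasks: number of MPI tasks (ICE_NTASKS)
    nthrds: OpenMP threads per task (ICE_NTHRDS)
    maxtpn: cores available per node (ICE_MACHINE_TPNODE)
    All inputs are assumed to be positive integers.
    """
    # Total cores requested across all tasks and threads.
    ncores = ntasks * nthrds
    # MPI tasks that fit on one node; at least 1 even if a single
    # task's threads exceed the cores of a node.
    taskpernode = max(maxtpn // nthrds, 1)
    # Ceiling division: round up when the last node is partly filled.
    nnodes = -(-ntasks // taskpernode)
    # Never place more tasks on a node than exist in total.
    taskpernodelimit = min(taskpernode, ntasks)
    # Cores actually used on each node.
    corespernode = taskpernodelimit * nthrds
    return {
        "ncores": ncores,
        "taskpernode": taskpernode,
        "nnodes": nnodes,
        "taskpernodelimit": taskpernodelimit,
        "corespernode": corespernode,
    }


if __name__ == "__main__":
    # Hypothetical case: 40 tasks, 4 threads each, 36 cores per node.
    print(distribute(ntasks=40, nthrds=4, maxtpn=36))
```

With these hypothetical inputs, 9 tasks fit on a node (36 / 4), so 40 tasks need 5 nodes, with the last node only partly filled, and each full node uses 36 cores.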

Tangentially, these computations are copy-pasted in cice.launch.csh, so we should create a third script (cice.distrib.csh ?) and source it from cice.batch.csh and cice.launch.csh to reduce code duplication.

@apcraig
Contributor

apcraig commented Nov 3, 2021

You're right about this. My only concern in creating a shared script is that these computations tend to be pretty machine dependent, so while there are some basic things that are shared, there are other machine dependent parts that are not easily shared. If we unify too much, we risk changes made to get one machine working breaking other machines. But we should unify where we can. I'll take a look.

@apcraig apcraig self-assigned this Nov 3, 2021
@phil-blain
Member Author

Well, the computations are exactly the same in cice.launch.csh and cice.batch.csh, that's why I was suggesting unifying these in a new script :)

apcraig added a commit to apcraig/CICE that referenced this issue Aug 10, 2022
Update cice.batch.csh and cice.launch.csh to use setup_machparams.csh
See CICE-Consortium#650
apcraig added a commit that referenced this issue Aug 15, 2022
* Update/improve debug_blocks output, see #718.

* Add ICE_MEMUSE cice.settings flag for batch memory use
Add set_env.memsmall, memmed, memlarge options
To use, will require changes to the env machine files.  Most machines will probably not use it.
See #674.

* Add setup_machparams.csh to compute batch/launch machine parameters
Update cice.batch.csh and cice.launch.csh to use setup_machparams.csh
See #650

* Update subroutine diagnostic_abort which calls print_state
Update ice_transport_remap and ice_transport_driver to call diagnostic_abort
  during some errors.
See also #622

* Update miniconda install information
See #547

* Code cleanup based on compile with -Wall
Code cleanup based on -std f2003 and f2008 checks
Add -stand f08 to cheyenne_intel debug flags
Add -std f2008 to cheyenne_gnu debug flags
Code consistent with Fortran 2003 except for use of contiguous in
  1d evp code.

* Remove all trailing blank space with script

* Update the cheyenne env so qc testing works
Add configuration/scripts/tests/qctest.yml file
Update documentation

* Update Icepack

* Clean up some output

* fix comments

* update print_state output
@phil-blain
Member Author

Hi Tony, I'm not sure we should have closed this one as one of my points was to document in the user guide what computations were being done by that script.

@phil-blain phil-blain reopened this Aug 16, 2022
@apcraig
Contributor

apcraig commented Aug 16, 2022

I have mixed feelings about doing a bunch of documentation. In part, these calculations are dependent on the syntax needed for each machine. There are a lot of formats and a number of different "parameters" are used. As new machines are added, new parameters may be needed. The syntax for batch and launch are often quite different, which is largely why I thought the calculations should be local to each script. But there is also a lot of overlap. I also understand it's a little bit of a pain to decipher the calculations every time a port is added/updated, but I also think that's just part of the process of porting. If we want to increase the documentation, I think it should be done as comments where the calculations are done, and I tried to do a bit of that as part of this update.

@phil-blain
Member Author

OK, I understand. Good idea to add more comments. I'll reclose this one then. Thanks :)

dabail10 pushed a commit to ESCOMP/CICE that referenced this issue Oct 4, 2022