Module and threading updates for Theta #1817

Merged
merged 6 commits into from
Oct 6, 2017
Changes from 5 commits
35 changes: 35 additions & 0 deletions cime/config/acme/allactive/config_pesall.xml
@@ -36,6 +36,41 @@
</rootpe>
</pes>
</mach>
<mach name="theta">
<pes compset="any" pesize="any">
<comment>none</comment>
<ntasks>
<ntasks_atm>64</ntasks_atm>
<ntasks_lnd>64</ntasks_lnd>
<ntasks_rof>64</ntasks_rof>
<ntasks_ice>64</ntasks_ice>
<ntasks_ocn>64</ntasks_ocn>
<ntasks_glc>64</ntasks_glc>
<ntasks_wav>64</ntasks_wav>
<ntasks_cpl>64</ntasks_cpl>
</ntasks>
<nthrds>
<nthrds_atm>1</nthrds_atm>
<nthrds_lnd>1</nthrds_lnd>
<nthrds_rof>1</nthrds_rof>
<nthrds_ice>1</nthrds_ice>
<nthrds_ocn>1</nthrds_ocn>
<nthrds_glc>1</nthrds_glc>
<nthrds_wav>1</nthrds_wav>
<nthrds_cpl>1</nthrds_cpl>
</nthrds>
<rootpe>
<rootpe_atm>0</rootpe_atm>
<rootpe_lnd>0</rootpe_lnd>
<rootpe_rof>0</rootpe_rof>
<rootpe_ice>0</rootpe_ice>
<rootpe_ocn>0</rootpe_ocn>
<rootpe_glc>0</rootpe_glc>
<rootpe_wav>0</rootpe_wav>
<rootpe_cpl>0</rootpe_cpl>
</rootpe>
</pes>
</mach>
</grid>
<grid name="any">
<mach name="lawrencium-lr2|lawrencium-lr3">
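
To make the default Theta layout above concrete, the sketch below estimates how many MPI tasks and nodes it implies. It is a simplified model, not CIME's actual algorithm: it assumes components sharing a root PE overlap on the same tasks, and it takes the PES_PER_NODE value of 64 from config_machines.xml. The function name `layout_footprint` is hypothetical.

```python
import math


def layout_footprint(ntasks, nthrds, rootpe, pes_per_node):
    """Return (total_tasks, max_threads, nodes) for a component PE layout.

    Assumption: the highest task index used by any component sets the total
    task count, and every task reserves max_threads slots on its node.
    """
    total_tasks = max(rootpe[c] + ntasks[c] for c in ntasks)
    max_threads = max(nthrds.values())
    tasks_per_node = max(1, pes_per_node // max_threads)
    nodes = math.ceil(total_tasks / tasks_per_node)
    return total_tasks, max_threads, nodes


components = ["atm", "lnd", "rof", "ice", "ocn", "glc", "wav", "cpl"]
ntasks = {c: 64 for c in components}   # <ntasks_*>64</ntasks_*>
nthrds = {c: 1 for c in components}    # <nthrds_*>1</nthrds_*>
rootpe = {c: 0 for c in components}    # <rootpe_*>0</rootpe_*>

print(layout_footprint(ntasks, nthrds, rootpe, pes_per_node=64))
```

Under these assumptions, the layout fits on a single node: all eight components run on the same 64 single-threaded tasks.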
5 changes: 0 additions & 5 deletions cime/config/acme/machines/Depends.corip1

This file was deleted.

2 changes: 1 addition & 1 deletion cime/config/acme/machines/Depends.intel
@@ -36,7 +36,7 @@ ifeq ($(DEBUG),FALSE)
$(REDUCED_PRECISION_OBJS): %.o: %.F90
$(FC) -c $(INCLDIR) $(INCS) $(FFLAGS) $(FREEFLAGS) -fimf-precision=low -fp-model fast $<
$(SHR_RANDNUM_FORT_OBJS): %.o: %.F90
$(FC) -c $(INCLDIR) $(INCS) $(FFLAGS) $(FREEFLAGS) -O3 -fp-model fast -no-prec-div -no-prec-sqrt -override-limits $<
$(FC) -c $(INCLDIR) $(INCS) $(FFLAGS) $(FREEFLAGS) -O3 -fp-model fast -no-prec-div -no-prec-sqrt -qoverride-limits $<
$(SHR_RANDNUM_C_OBJS): %.o: %.c
$(CC) -c $(INCLDIR) $(INCS) $(CFLAGS) -O3 -fp-model fast $<
endif
28 changes: 0 additions & 28 deletions cime/config/acme/machines/Depends.intel14

This file was deleted.

6 changes: 0 additions & 6 deletions cime/config/acme/machines/Depends.intelmic

This file was deleted.

6 changes: 0 additions & 6 deletions cime/config/acme/machines/Depends.intelmic14

This file was deleted.

2 changes: 1 addition & 1 deletion cime/config/acme/machines/config_batch.xml
@@ -261,7 +261,7 @@

<batch_system MACH="theta" type="cobalt_theta">
<queues>
<queue walltimemax="01:00:00" jobmin="512" jobmax="231936" default="true">default</queue>
<queue walltimemin="00:30:00" walltimemax="02:00:00" jobmin="512" jobmax="231936" default="true">default</queue>
<queue walltimemax="01:00:00" jobmin="1" jobmax="1024" strict="true">debug-cache-quad</queue>
</queues>
</batch_system>
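
The effect of the new `walltimemin` attribute can be sketched as a filter over the two Theta queues. This is an illustration only, not CIME's queue-selection code: the function names `to_seconds` and `eligible_queues` are hypothetical, and `job_size` is assumed to be in the same unit as `jobmin`/`jobmax`.

```python
import xml.etree.ElementTree as ET

# The two Theta queues as they appear after this change.
QUEUES_XML = """
<queues>
  <queue walltimemin="00:30:00" walltimemax="02:00:00" jobmin="512" jobmax="231936" default="true">default</queue>
  <queue walltimemax="01:00:00" jobmin="1" jobmax="1024" strict="true">debug-cache-quad</queue>
</queues>
"""


def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def eligible_queues(xml_text, job_size, walltime):
    """List queue names whose size and walltime limits admit the request."""
    requested = to_seconds(walltime)
    names = []
    for queue in ET.fromstring(xml_text).findall("queue"):
        if not int(queue.get("jobmin")) <= job_size <= int(queue.get("jobmax")):
            continue
        wmin = queue.get("walltimemin")
        wmax = queue.get("walltimemax")
        if wmin is not None and requested < to_seconds(wmin):
            continue  # request is shorter than the queue's minimum walltime
        if wmax is not None and requested > to_seconds(wmax):
            continue  # request is longer than the queue's maximum walltime
        names.append(queue.text)
    return names


print(eligible_queues(QUEUES_XML, job_size=64, walltime="00:20:00"))
print(eligible_queues(QUEUES_XML, job_size=4096, walltime="01:30:00"))
```

In this sketch, a small 20-minute job only matches `debug-cache-quad`, while a larger 90-minute job only matches `default`.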
5 changes: 5 additions & 0 deletions cime/config/acme/machines/config_compilers.xml
@@ -1120,6 +1120,11 @@ for mct, etc.
<ADD_CPPDEFS> -DARCH_MIC_KNL </ADD_CPPDEFS>
</compiler>

<compiler COMPILER="gnu" MACH="theta">
<ADD_SLIBS>$(shell nf-config --flibs)</ADD_SLIBS>
<CONFIG_ARGS> --host=Linux </CONFIG_ARGS>
</compiler>

<compiler COMPILER="pgi" MACH="blues">
<PNETCDF_PATH>$(PNETCDFROOT)</PNETCDF_PATH>
<NETCDF_PATH>$(NETCDFROOT)</NETCDF_PATH>
32 changes: 15 additions & 17 deletions cime/config/acme/machines/config_machines.xml
@@ -1504,7 +1504,7 @@
<BATCH_SYSTEM>cobalt_theta</BATCH_SYSTEM>
<SUPPORTED_BY>acme</SUPPORTED_BY>
<GMAKE_J>8</GMAKE_J>
<MAX_TASKS_PER_NODE>64</MAX_TASKS_PER_NODE>
<MAX_TASKS_PER_NODE>128</MAX_TASKS_PER_NODE>
<PES_PER_NODE>64</PES_PER_NODE>
<PROJECT>OceanClimate</PROJECT>
<PROJECT_REQUIRED>TRUE</PROJECT_REQUIRED>
@@ -1515,11 +1515,16 @@
<arguments>
<arg name="num_tasks" >-n $TOTALPES</arg>
<arg name="tasks_per_node" >-N $PES_PER_NODE</arg>
<arg name="thread_count">--cc depth -d $OMP_NUM_THREADS</arg>
<arg name="env_omp_stacksize">-e OMP_STACKSIZE=128M</arg>
<arg name="env_thread_count">-e OMP_NUM_THREADS=$OMP_NUM_THREADS</arg>
<arg name="smp_vars">$ENV{SMP_VARS}</arg>
<!--arg name="mpi_env" DEBUG="TRUE">-e MPICH_VERSION_DISPLAY=1 -e MPICH_ENV_DISPLAY=1 -e MPICH_CPUMASK_DISPLAY=1</arg-->
</arguments>
</mpirun>
<environment_variables>
<env name="SMP_VARS"></env>
</environment_variables>
<environment_variables SMP_PRESENT="TRUE">
<env name="SMP_VARS">--cc depth -d $ENV{OMP_NUM_THREADS} -j $ENV{OMP_NUM_THREADS} -e OMP_NUM_THREADS=$ENV{OMP_NUM_THREADS} -e OMP_STACKSIZE=128M -e OMP_PROC_BIND=spread -e OMP_PLACES=threads</env>
</environment_variables>
<module_system type="module">
<init_path lang="perl">/opt/modules/default/init/perl.pm</init_path>
<init_path lang="python">/opt/modules/default/init/python.py</init_path>
@@ -1546,26 +1551,26 @@
<command name="rm">craype</command>
</modules>
<modules>
<command name="load">craype/2.5.11</command>
<command name="load">craype/2.5.12</command>
</modules>
<modules compiler="intel">
<command name="load">intel/18.0.0.128</command>
<command name="load">PrgEnv-intel/6.0.4</command>
<command name="load">intel/17.0.4.196</command>
</modules>
<modules compiler="cray">
<command name="load">cce/8.6.2</command>
<command name="load">PrgEnv-cray/6.0.4</command>
<command name="load">cce/8.6.0</command>
</modules>
<modules compiler="gnu">
<command name="load">PrgEnv-gnu/6.0.4</command>
<command name="load">gcc/6.3.0</command>
<command name="load">PrgEnv-gnu/6.0.4</command>
</modules>
<modules compiler="!intel">
<command name="switch">cray-libsci/17.06.1</command>
<command name="switch">cray-libsci/17.09.1</command>
</modules>
<modules>
<command name="load">craype-mic-knl</command>
<command name="load">cray-mpich/7.6.0</command>
<command name="load">cray-mpich/7.6.2</command>
</modules>
<modules mpilib="mpt">
<command name="load">cray-netcdf-hdf5parallel/4.4.1.1.3</command>
@@ -1577,13 +1582,6 @@
<command name="load">cray-netcdf/4.4.1.1.3</command>
</modules>
</module_system>
<environment_variables>
<env name="MPICH_ENV_DISPLAY">1</env>
<env name="MPICH_VERSION_DISPLAY">1</env>
<env name="OMP_STACKSIZE">128M</env>
<env name="OMP_PROC_BIND">spread</env>
<env name="OMP_PLACES">threads</env>
</environment_variables>
</machine>

<machine MACH="sooty">
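
The Theta mpirun change above moves the per-thread arguments into a single `SMP_VARS` string that is empty unless `SMP_PRESENT` is true. A minimal sketch of the resulting command line follows, assuming the launcher is `aprun` and that the arguments are simply joined with spaces. The helper names `expand_smp_vars` and `aprun_command` and the example sizes are hypothetical.

```python
def expand_smp_vars(omp_num_threads):
    """Fill in the SMP_VARS template from config_machines.xml."""
    template = (
        "--cc depth -d {n} -j {n} "
        "-e OMP_NUM_THREADS={n} -e OMP_STACKSIZE=128M "
        "-e OMP_PROC_BIND=spread -e OMP_PLACES=threads"
    )
    return template.format(n=omp_num_threads)


def aprun_command(total_pes, pes_per_node, omp_num_threads, smp_present):
    """Assemble the launcher arguments; SMP_VARS is empty for non-threaded runs."""
    parts = ["aprun", "-n", str(total_pes), "-N", str(pes_per_node)]
    if smp_present:
        parts.append(expand_smp_vars(omp_num_threads))
    return " ".join(parts)


print(aprun_command(64, 64, 1, smp_present=False))
print(aprun_command(128, 32, 2, smp_present=True))
```

Keeping the OpenMP settings in one conditional string means a pure-MPI run gets no thread-placement or `OMP_*` arguments at all.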
1 change: 1 addition & 0 deletions cime/scripts/lib/CIME/SystemTests/system_tests_common.py
@@ -186,6 +186,7 @@ def _set_active_case(self, case):
Use for tests that have multiple cases
"""
self._case = case
self._case.load_env(reset=True)
self._caseroot = case.get_value("CASEROOT")

def run_indv(self, suffix="base", st_archive=False):
7 changes: 4 additions & 3 deletions cime/scripts/lib/CIME/case.py
@@ -1138,7 +1138,7 @@ def get_mpirun_cmd(self, job="case.run"):
executable, mpi_arg_list = env_mach_specific.get_mpirun(self, mpi_attribs, job=job)

# special case for aprun
if executable is not None and "aprun" in executable:
if executable is not None and "aprun" in executable and "titan" in self.get_value("MACH"):
aprun_args, num_nodes = get_aprun_cmd_for_case(self, run_exe)
expect( (num_nodes + self.spare_nodes) == self.num_nodes, "Not using optimized num nodes")
return executable + aprun_args + " " + run_misc_suffix
@@ -1171,8 +1171,9 @@ def set_model_version(self, model):
else:
logger.warn("WARNING: No %s Model version found."%(model))

def load_env(self):
if not self._is_env_loaded:
def load_env(self, reset=False):
if not self._is_env_loaded or reset:
os.environ["OMP_NUM_THREADS"] = str(self.thread_count)
@amametjanov (Member Author), Oct 3, 2017:

@jgfouca OMP_NUM_THREADS is not being properly reset when switching between runs of tests with multiple case-dirs. For example, with the PET test, OMP_NUM_THREADS remains at 1 at the first run's case.submit.

Also, this moves the 4-line block's assignment at https://github.com/ACME-Climate/ACME/blob/master/cime/scripts/lib/CIME/case_run.py#L86 to happen earlier (so that it can be captured in logs/run_environment.txt).

Member:

Good catch.

env_module = self.get_env("mach_specific")
env_module.load_env(self)
self._is_env_loaded = True
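
The stale-environment problem described in the review comment, and the effect of the new `reset` flag, can be reproduced in isolation. The class below is a hypothetical stand-in for CIME's `Case`, reduced to the caching logic. How a case ends up marked as already loaded is an assumption made for the demonstration.

```python
import os


class CaseSketch:
    """Hypothetical stand-in for CIME's Case, reduced to the caching logic."""

    def __init__(self, thread_count):
        self.thread_count = thread_count
        self._is_env_loaded = False

    def load_env(self, reset=False):
        # Reload when nothing is loaded yet, or when the caller forces it.
        if not self._is_env_loaded or reset:
            os.environ["OMP_NUM_THREADS"] = str(self.thread_count)
            self._is_env_loaded = True


first = CaseSketch(thread_count=1)
first.load_env()

second = CaseSketch(thread_count=2)
second._is_env_loaded = True   # assumption: this case was loaded earlier
second.load_env()              # skipped, so the first case's value survives
stale = os.environ["OMP_NUM_THREADS"]

second.load_env(reset=True)    # forced reload, as in _set_active_case
fresh = os.environ["OMP_NUM_THREADS"]

print(stale, fresh)
```

Without `reset=True`, the second case inherits the first case's thread count. With it, switching the active case also refreshes `OMP_NUM_THREADS`.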
3 changes: 3 additions & 0 deletions components/mpas-cice/cime_config/buildlib
@@ -49,6 +49,9 @@ if ( $MACH eq "edison" ) {
} elsif ( $MACH eq "titan" ) {
$bldcmd .= ' TOOL_TARGET_ARCH="-target-cpu=istanbul"';
}
if (defined $ENV{TOOL_DIR}) {
$bldcmd .= " TOOL_DIR=$ENV{TOOL_DIR}";
}

system($bldcmd) == 0 or die "ERROR: $component.buildlib $bldcmd failed: $?\n";

3 changes: 3 additions & 0 deletions components/mpas-o/cime_config/buildlib
@@ -49,6 +49,9 @@ if ( $MACH eq "edison" ) {
} elsif ( $MACH eq "titan" ) {
$bldcmd .= ' TOOL_TARGET_ARCH="-target-cpu=istanbul"';
}
if (defined $ENV{TOOL_DIR}) {
$bldcmd .= " TOOL_DIR=$ENV{TOOL_DIR}";
}

system($bldcmd) == 0 or die "ERROR: $component.buildlib $bldcmd failed: $?\n";

3 changes: 3 additions & 0 deletions components/mpasli/cime_config/buildlib
@@ -50,6 +50,9 @@ if ( $MACH eq "edison" ) {
} elsif ( $MACH eq "titan" ) {
$bldcmd .= ' TOOL_TARGET_ARCH="-target-cpu=istanbul"';
}
if (defined $ENV{TOOL_DIR}) {
$bldcmd .= " TOOL_DIR=$ENV{TOOL_DIR}";
}

# Check for Albany build
if ( $MPASLI_USE_ALBANY eq "TRUE" ) {