Skip to content

Commit

Permalink
Port to S4 (#1023)
Browse files Browse the repository at this point in the history
Ports the global workflow to the S4 cluster.  Note that this does not include support for S2S experiments.  Additionally, S4 does not support C768 experiments.

A couple of special notes:

- S4 does not have access to rstprod data.  Among other things, this means that the nsstbufr and prepbufr.acft_profiles files must be created on the fly.  The way I accomplished this was by moving the `MAKE_NSSTBUFR` to and creating `MAKE_ACFTBUFR` in config.base.emc.dyn and setting them via setup_workflow.xml.  This seems like a bit of a kludge and I am open to suggestions on how else to address this.  Both options need to be set for the prep and analysis jobs.
- S4 _can_ run S2S+ experiments, but this requires significant, and convoluted, modifications to the configuration files.  Support for these are thus not enabled by default.  Instead, I have placed a set of configuration files in S4:/data/users/dhuber/save/s2s_configs. Users interested in performing these experiments should contact me to set it up.

Closes #138
  • Loading branch information
DavidHuber-NOAA authored Oct 8, 2022
1 parent b26a8ac commit e09989b
Show file tree
Hide file tree
Showing 28 changed files with 756 additions and 206 deletions.
2 changes: 1 addition & 1 deletion Externals.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ protocol = git
required = True

[gfs-utils]
hash = 3a609ea
hash = 7bf599f
local_path = sorc/gfs_utils.fd
repo_url = https://github.com/NOAA-EMC/gfs-utils
protocol = git
Expand Down
7 changes: 4 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,13 +7,14 @@ The global-workflow depends on the following prerequisities to be available on t
* workflow manager - ROCOTO (https://github.com/christopherwharrop/rocoto)
* modules - NCEPLIBS (various), esmf v8.0.0bs48, hdf5, intel/ips v18, impi v18, wgrib2, netcdf v4.7.0, hpss, gempak (see module files under /modulefiles for additional details)

The global-workflow current supports the following machines:
The global-workflow current supports the following tier-1 machines:

* WCOSS-Dell
* WCOSS-Cray
* Hera
* Orion

Additionally, the following tier-2 machine is supported:
* S4 (Note that S2S+ experiments are not fully supported)

Quick-start instructions are below. Full instructions are available in the [wiki](https://github.com/NOAA-EMC/global-workflow/wiki/Run-Global-Workflow)

## Build global-workflow:
Expand Down
295 changes: 295 additions & 0 deletions env/S4.env
Original file line number Diff line number Diff line change
@@ -0,0 +1,295 @@
#!/bin/bash -x

if [ $# -ne 1 ]; then

echo "Must specify an input argument to set runtime environment variables!"
echo "argument can be any one of the following:"
echo "atmanalrun atmensanalrun"
echo "anal sfcanl fcst post vrfy metp"
echo "eobs eupd ecen efcs epos"
echo "postsnd awips gempak"
exit 1

fi

step=$1
PARTITION_BATCH=${PARTITION_BATCH:-"s4"}

if [[ ${PARTITION_BATCH} = "s4" ]]; then
export npe_node_max=32
elif [[ ${PARTITION_BATCH} = "ivy" ]]; then
export npe_node_max=20
fi
export launcher="srun -l --export=ALL"

# Configure MPI environment
export OMP_STACKSIZE=2048000
export NTHSTACK=1024000000

ulimit -s unlimited
ulimit -a

export job=${PBS_JOBNAME:-${step}}
export jobid=${job}.${PBS_JOBID:-$$}


if [[ ${step} = "prep" || ${step} = "prepbufr" ]]; then

npe_node_prep=${npe_node_prep:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_prep))

export POE="NO"
export BACK="NO"
export sys_tp="S4"
export launcher_PREP="srun"

elif [[ ${step} = "waveinit" || ${step} = "waveprep" || ${step} = "wavepostsbs" || ${step} = "wavepostbndpnt" || ${step} = "wavepostbndpntbll" || ${step} = "wavepostpnt" ]]; then

export mpmd="--multi-prog"
export CFP_MP="YES"
if [[ ${step} = "waveprep" ]]; then export MP_PULSE=0 ; fi
export wavempexec=${launcher}
export wave_mpmd=${mpmd}

elif [[ ${step} = "atmanalrun" ]]; then

export CFP_MP=${CFP_MP:-"YES"}
export USE_CFP=${USE_CFP:-"YES"}
export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_atmanalrun=${npe_node_atmanalrun:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_atmanalrun))

export NTHREADS_ATMANAL=${nth_atmanalrun:-${nth_max}}
[[ ${NTHREADS_ATMANAL} -gt ${nth_max} ]] && export NTHREADS_ATMANAL=${nth_max}
export APRUN_ATMANAL="${launcher} -n ${npe_atmanalrun:-0}"

elif [[ ${step} = "atmensanalrun" ]]; then

export CFP_MP=${CFP_MP:-"YES"}
export USE_CFP=${USE_CFP:-"YES"}
export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_atmensanalrun=${npe_node_atmensanalrun:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_atmensanalrun))

export NTHREADS_ATMENSANAL=${nth_atmensanalrun:-${nth_max}}
[[ ${NTHREADS_ATMENSANAL} -gt ${nth_max} ]] && export NTHREADS_ATMENSANAL=${nth_max}
export APRUN_ATMENSANAL="${launcher} -n ${npe_atmensanalrun:-0}"

elif [[ ${step} = "aeroanlrun" ]]; then

export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_aeroanlrun=${npe_node_aeroanlrun:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_aeroanlrun))

export NTHREADS_AEROANL=${nth_aeroanlrun:-${nth_max}}
[[ ${NTHREADS_AEROANL} -gt ${nth_max} ]] && export NTHREADS_AEROANL=${nth_max}
export APRUN_AEROANL="${launcher} -n ${npe_aeroanlrun:-0}"

elif [[ ${step} = "anal" || ${step} = "analcalc" ]]; then

export MKL_NUM_THREADS=4
export MKL_CBWR=AUTO

export CFP_MP=${CFP_MP:-"YES"}
export USE_CFP=${USE_CFP:-"YES"}
export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_anal=${npe_node_anal:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_anal))

export NTHREADS_GSI=${nth_anal:-${nth_max}}
[[ ${NTHREADS_GSI} -gt ${nth_max} ]] && export NTHREADS_GSI=${nth_max}
export APRUN_GSI=${launcher}

export NTHREADS_CALCINC=${nth_calcinc:-1}
[[ ${NTHREADS_CALCINC} -gt ${nth_max} ]] && export NTHREADS_CALCINC=${nth_max}
export APRUN_CALCINC=${launcher}

export NTHREADS_CYCLE=${nth_cycle:-12}
[[ ${NTHREADS_CYCLE} -gt ${npe_node_max} ]] && export NTHREADS_CYCLE=${npe_node_max}
npe_cycle=${ntiles:-6}
export APRUN_CYCLE="${launcher} -n ${npe_cycle:-0}"


export NTHREADS_GAUSFCANL=1
npe_gausfcanl=${npe_gausfcanl:-1}
export APRUN_GAUSFCANL="${launcher} -n ${npe_gausfcanl:-0}"

elif [[ ${step} = "sfcanl" ]]; then
npe_node_sfcanl=${npe_node_sfcanl:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_sfcanl))

export NTHREADS_CYCLE=${nth_sfcanl:-14}
[[ ${NTHREADS_CYCLE} -gt ${npe_node_max} ]] && export NTHREADS_CYCLE=${npe_node_max}
npe_sfcanl=${ntiles:-6}
export APRUN_CYCLE="${launcher} -n ${npe_sfcanl:-0}"

elif [[ ${step} = "gldas" ]]; then

npe_node_gldas=${npe_node_gldas:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_gldas))

export NTHREADS_GLDAS=${nth_gldas:-${nth_max}}
[[ ${NTHREADS_GLDAS} -gt ${nth_max} ]] && export NTHREADS_GLDAS=${nth_max}
export APRUN_GLDAS="${launcher} -n ${npe_gldas:-0}"

export NTHREADS_GAUSSIAN=${nth_gaussian:-1}
[[ ${NTHREADS_GAUSSIAN} -gt ${nth_max} ]] && export NTHREADS_GAUSSIAN=${nth_max}
export APRUN_GAUSSIAN="${launcher} -n ${npe_gaussian:-0}"

# Must run data processing with exactly the number of tasks as time
# periods being processed.

gldas_spinup_hours=${gldas_spinup_hours:-0}
npe_gldas_data_proc=$((gldas_spinup_hours + 12))
export APRUN_GLDAS_DATA_PROC="${launcher} -n ${npe_gldas_data_proc} --multi-prog"

elif [[ ${step} = "eobs" ]]; then

export MKL_NUM_THREADS=4
export MKL_CBWR=AUTO

export CFP_MP=${CFP_MP:-"YES"}
export USE_CFP=${USE_CFP:-"YES"}
export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_eobs=${npe_node_eobs:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_eobs))

export NTHREADS_GSI=${nth_eobs:-${nth_max}}
[[ ${NTHREADS_GSI} -gt ${nth_max} ]] && export NTHREADS_GSI=${nth_max}
export APRUN_GSI=${launcher}

elif [[ ${step} = "eupd" ]]; then

export CFP_MP=${CFP_MP:-"YES"}
export USE_CFP=${USE_CFP:-"YES"}
export APRUNCFP="${launcher} -n \$ncmd --multi-prog"

npe_node_eupd=${npe_node_eupd:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_eupd))

export NTHREADS_ENKF=${nth_eupd:-${nth_max}}
[[ ${NTHREADS_ENKF} -gt ${nth_max} ]] && export NTHREADS_ENKF=${nth_max}
export APRUN_ENKF=${launcher}

elif [[ ${step} = "fcst" ]]; then

#PEs and PEs/node can differ for GFS and GDAS forecasts if threading differs
if [[ ${CDUMP:-gdas} == "gfs" ]]; then
npe_fcst=${npe_fcst_gfs:-0}
npe_node_fcst=${npe_node_fcst_gfs:-${npe_node_max}}
nth_fv3=${nth_fv3_gfs:-1}
fi

npe_node_fcst=${npe_node_fcst:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_fcst))

export NTHREADS_FV3=${nth_fv3:-${nth_max}}
[[ ${NTHREADS_FV3} -gt ${nth_max} ]] && export NTHREADS_FV3=${nth_max}
export cores_per_node=${npe_node_max}
export APRUN_FV3="${launcher} -n ${npe_fcst:-0}"

export NTHREADS_REGRID_NEMSIO=${nth_regrid_nemsio:-1}
[[ ${NTHREADS_REGRID_NEMSIO} -gt ${nth_max} ]] && export NTHREADS_REGRID_NEMSIO=${nth_max}
export APRUN_REGRID_NEMSIO=${launcher}

export NTHREADS_REMAP=${nth_remap:-2}
[[ ${NTHREADS_REMAP} -gt ${nth_max} ]] && export NTHREADS_REMAP=${nth_max}
export APRUN_REMAP=${launcher}
export I_MPI_DAPL_UD="enable"

elif [[ ${step} = "efcs" ]]; then

npe_node_efcs=${npe_node_efcs:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_efcs))

export NTHREADS_FV3=${nth_efcs:-${nth_max}}
[[ ${NTHREADS_FV3} -gt ${nth_max} ]] && export NTHREADS_FV3=${nth_max}
export cores_per_node=${npe_node_max}
export APRUN_FV3="${launcher} -n ${npe_efcs:-0}"

export NTHREADS_REGRID_NEMSIO=${nth_regrid_nemsio:-1}
[[ ${NTHREADS_REGRID_NEMSIO} -gt ${nth_max} ]] && export NTHREADS_REGRID_NEMSIO=${nth_max}
export APRUN_REGRID_NEMSIO="${launcher} ${LEVS:-128}"

elif [[ ${step} = "post" ]]; then

npe_node_post=${npe_node_post:-npe_node_max}
nth_max=$((npe_node_max / npe_node_post))

export NTHREADS_NP=${nth_np:-1}
[[ ${NTHREADS_NP} -gt ${nth_max} ]] && export NTHREADS_NP=${nth_max}
export APRUN_NP=${launcher}

export NTHREADS_DWN=${nth_dwn:-1}
[[ ${NTHREADS_DWN} -gt ${nth_max} ]] && export NTHREADS_DWN=${nth_max}
export APRUN_DWN=${launcher}

elif [[ ${step} = "ecen" ]]; then

npe_node_ecen=${npe_node_ecen:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_ecen))

export NTHREADS_ECEN=${nth_ecen:-${nth_max}}
[[ ${NTHREADS_ECEN} -gt ${nth_max} ]] && export NTHREADS_ECEN=${nth_max}
export APRUN_ECEN=${launcher}

export NTHREADS_CHGRES=${nth_chgres:-12}
[[ ${NTHREADS_CHGRES} -gt ${npe_node_max} ]] && export NTHREADS_CHGRES=${npe_node_max}
export APRUN_CHGRES="time"

export NTHREADS_CALCINC=${nth_calcinc:-1}
[[ ${NTHREADS_CALCINC} -gt ${nth_max} ]] && export NTHREADS_CALCINC=${nth_max}
export APRUN_CALCINC=${launcher}

elif [[ ${step} = "esfc" ]]; then

npe_node_esfc=${npe_node_esfc:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_esfc))

export NTHREADS_ESFC=${nth_esfc:-${nth_max}}
[[ ${NTHREADS_ESFC} -gt ${nth_max} ]] && export NTHREADS_ESFC=${nth_max}
export APRUN_ESFC="${launcher} -n ${npe_esfc:-0}"

export NTHREADS_CYCLE=${nth_cycle:-14}
[[ ${NTHREADS_CYCLE} -gt ${npe_node_max} ]] && export NTHREADS_CYCLE=${npe_node_max}
export APRUN_CYCLE="${launcher} -n ${npe_esfc:-0}"

elif [[ ${step} = "epos" ]]; then

npe_node_epos=${npe_node_epos:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_epos))

export NTHREADS_EPOS=${nth_epos:-${nth_max}}
[[ ${NTHREADS_EPOS} -gt ${nth_max} ]] && export NTHREADS_EPOS=${nth_max}
export APRUN_EPOS=${launcher}

elif [[ ${step} = "init" ]]; then

export APRUN=${launcher}

elif [[ ${step} = "postsnd" ]]; then

npe_node_postsnd=${npe_node_postsnd:-${npe_node_max}}
nth_max=$((npe_node_max / npe_node_postsnd))

export NTHREADS_POSTSND=${nth_postsnd:-1}
[[ ${NTHREADS_POSTSND} -gt ${nth_max} ]] && export NTHREADS_POSTSND=${nth_max}
export APRUN_POSTSND=${launcher}

export NTHREADS_POSTSNDCFP=${nth_postsndcfp:-1}
[[ ${NTHREADS_POSTSNDCFP} -gt ${nth_max} ]] && export NTHREADS_POSTSNDCFP=${nth_max}
export APRUN_POSTSNDCFP=${launcher}

elif [[ ${step} = "awips" ]]; then

echo "WARNING: ${step} is not enabled on S4!"

elif [[ ${step} = "gempak" ]]; then

echo "WARNING: ${step} is not enabled on S4!"
fi
17 changes: 15 additions & 2 deletions jobs/JGDAS_ENKF_DIAG
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ export pgmerr=errfile
export CDATE=${CDATE:-${PDY}${cyc}}
export CDUMP=${CDUMP:-${RUN:-"gdas"}}
export COMPONENT=${COMPONENT:-atmos}
export MAKE_NSSTBUFR=${MAKE_NSSTBUFR:-"NO"}
export MAKE_ACFTBUFR=${MAKE_ACFTBUFR:-"NO"}


##############################################
Expand Down Expand Up @@ -104,9 +106,20 @@ export PREPQC="$COMIN_OBS/${OPREFIX}prepbufr"
if [ ! -f $PREPQC ]; then
echo "WARNING: Global PREPBUFR FILE $PREPQC MISSING"
fi
export PREPQCPF="$COMIN_OBS/${OPREFIX}prepbufr.acft_profiles"
export TCVITL="$COMIN_ANL/${OPREFIX}syndata.tcvitals.tm00"
[[ $DONST = "YES" ]] && export NSSTBF="$COMIN_OBS/${OPREFIX}nsstbufr"
if [[ $DONST = "YES" ]]; then
if [[ $MAKE_NSSTBUFR == "YES" ]]; then
export NSSTBF="${COMOUT}/${OPREFIX}nsstbufr"
else
export NSSTBF="${COMIN_OBS}/${OPREFIX}nsstbufr"
fi
fi

if [[ $MAKE_ACFTBUFR == "YES" ]]; then
export PREPQCPF="${COMOUT}/${OPREFIX}prepbufr.acft_profiles"
else
export PREPQCPF="${COMIN_OBS}/${OPREFIX}prepbufr.acft_profiles"
fi

# Guess Bias correction coefficients related to control
export GBIAS=${COMIN_GES_CTL}/${GPREFIX}abias
Expand Down
17 changes: 15 additions & 2 deletions jobs/JGDAS_ENKF_SELECT_OBS
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,8 @@ export pgmerr=errfile
export CDATE=${CDATE:-${PDY}${cyc}}
export CDUMP=${CDUMP:-${RUN:-"gdas"}}
export COMPONENT=${COMPONENT:-atmos}
export MAKE_NSSTBUFR=${MAKE_NSSTBUFR:-"NO"}
export MAKE_ACFTBUFR=${MAKE_ACFTBUFR:-"NO"}


##############################################
Expand Down Expand Up @@ -107,9 +109,20 @@ export PREPQC="$COMIN_OBS/${OPREFIX}prepbufr"
if [ ! -f $PREPQC ]; then
echo "WARNING: Global PREPBUFR FILE $PREPQC MISSING"
fi
export PREPQCPF="$COMIN_OBS/${OPREFIX}prepbufr.acft_profiles"
export TCVITL="$COMIN_ANL/${OPREFIX}syndata.tcvitals.tm00"
[[ $DONST = "YES" ]] && export NSSTBF="$COMIN_OBS/${OPREFIX}nsstbufr"
if [[ $DONST = "YES" ]]; then
if [[ $MAKE_NSSTBUFR == "YES" ]]; then
export NSSTBF="${COMOUT}/${OPREFIX}nsstbufr"
else
export NSSTBF="${COMIN_OBS}/${OPREFIX}nsstbufr"
fi
fi

if [[ $MAKE_ACFTBUFR == "YES" ]]; then
export PREPQCPF="${COMOUT}/${OPREFIX}prepbufr.acft_profiles"
else
export PREPQCPF="${COMIN_OBS}/${OPREFIX}prepbufr.acft_profiles"
fi

# Guess Bias correction coefficients related to control
export GBIAS=${COMIN_GES_CTL}/${GPREFIX}abias
Expand Down
Loading

0 comments on commit e09989b

Please sign in to comment.