Add wexac cluster #3613

Closed
90 changes: 90 additions & 0 deletions etc/picongpu/wexac-weizmann/gpu.tpl
@@ -0,0 +1,90 @@
#!/usr/bin/env bash
# Copyright 2013-2021 Axel Huebl, Anton Helm, Richard Pausch, Rene Widera,
# Marco Garten
#
# This file is part of PIConGPU.
#
# PIConGPU is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# PIConGPU is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with PIConGPU.
# If not, see <http://www.gnu.org/licenses/>.
#


# PIConGPU batch script for wexac's LSF (bsub) batch system
#BSUB -csm y
#BSUB -q !TBG_queue
#BSUB -m !TBG_gpuType
#BSUB -W !TBG_wallTimeNoSeconds

# sets batch job's name
#BSUB -J !TBG_jobName
#BSUB -n !TBG_tasks
#BSUB -R rusage[mem=48000]
#BSUB !TBG_mailSettings -u !TBG_mailAddress
#BSUB -cwd !TBG_dstPath

# sets output
#BSUB -o stdout.%J
#BSUB -e stderr.%J


## calculations will be performed by tbg
# extract queue and gpu selection
.TBG_queue=${TBG_queue:-"gpu-short"}
.TBG_gpuType=${TBG_gpuType:-"dgx_hosts"}

# remove seconds from walltime
.TBG_wallTimeNoSeconds=${TBG_wallTime::-3}

# settings that can be controlled by environment variables before submit
.TBG_mailSettings=${MY_MAILNOTIFY:-""}
.TBG_mailAddress=${MY_MAIL:-"someone@example.com"}
.TBG_author=${MY_NAME:+--author \"${MY_NAME}\"}
.TBG_profile=${PIC_PROFILE:-"~/picongpu.profile"}

# number of available/hosted GPUs per node in the system
.TBG_numHostedGPUPerNode=8

# required GPUs per node for the current job
.TBG_gpusPerNode=`if [ $TBG_tasks -gt $TBG_numHostedGPUPerNode ] ; then echo $TBG_numHostedGPUPerNode; else echo $TBG_tasks; fi`

# number of cores to block per GPU: each GPU comes with 7 CPU cores,
# and 7 CPU cores per GPU are accounted for anyway
.TBG_coresPerGPU=7

# use ceil to calculate nodes
.TBG_nodes="$((( TBG_tasks + TBG_gpusPerNode - 1 ) / TBG_gpusPerNode))"

## end calculations ##

echo 'Running program...'

cd !TBG_dstPath

export MODULES_NO_OUTPUT=1
source !TBG_profile
if [ $? -ne 0 ] ; then
    echo "Error: PIConGPU environment profile under \"!TBG_profile\" not found!"
    exit 1
fi
unset MODULES_NO_OUTPUT

#set user rights to u=rwx;g=r-x;o=---
umask 0027

mkdir simOutput 2> /dev/null
cd simOutput

export OMP_NUM_THREADS=!TBG_coresPerGPU
mpiexec -n !TBG_tasks !TBG_dstPath/input/bin/picongpu !TBG_author !TBG_programParams | tee output
# note: instead of the PIConGPU binary, one can also debug starting "js_task_info | sort"
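
The arithmetic in the template's calculation block can be tried outside of tbg. The following is a minimal sketch, not part of the PR: the function names are hypothetical, and `%:*` is used as a portable stand-in for the template's `${TBG_wallTime::-3}`, which gives the same result for an `HH:MM:SS` walltime.

```shell
#!/usr/bin/env bash
# Illustrative helpers mirroring the template's calculations.
# All function names are hypothetical and do not exist in tbg or PIConGPU.

# Drop the trailing ":SS" field from an HH:MM:SS walltime.
strip_seconds() {
    printf '%s\n' "${1%:*}"
}

# GPUs used per node: the task count, capped at the GPUs hosted per node.
# $1 = number of tasks, $2 = hosted GPUs per node
gpus_per_node() {
    if [ "$1" -gt "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Ceiling division: nodes needed for the given task count.
# $1 = number of tasks, $2 = GPUs used per node
nodes_needed() {
    printf '%s\n' "$(( ($1 + $2 - 1) / $2 ))"
}

strip_seconds "06:00:00"   # prints 06:00
gpus_per_node 12 8         # prints 8
nodes_needed 12 8          # prints 2
```

With 12 tasks and 8 hosted GPUs per node, the cap yields 8 GPUs per node and the ceiling division yields 2 nodes, the second of which is only partly filled.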
73 changes: 73 additions & 0 deletions etc/picongpu/wexac-weizmann/gpu_picongpu.profile.example
@@ -0,0 +1,73 @@
# Name and Path of this Script ############################### (DO NOT change!)
export PIC_PROFILE=$(cd $(dirname $BASH_SOURCE) && pwd)"/"$(basename $BASH_SOURCE)

# User Information ################################# (edit the following lines)
# - automatically add your name and contact to output file meta data
# - send mails on job (-B)egin, Fi(-N)ish
export MY_MAILNOTIFY=""
export MY_MAIL="someone@example.com"
export MY_NAME="$(whoami) <$MY_MAIL>"

# Text Editor for Tools ###################################### (edit this line)
# - examples: "nano", "vim", "emacs -nw", "vi" or without terminal: "gedit"
#export EDITOR="nano"

# General modules #############################################################
#
module load gcc/6.3.0
module load cmake/3.18.4
module load openmpi/2.0.1
module load cuda/9.2
module load boost/1.69.0

export CXX=$(which g++)
export CC=$(which gcc)

# Other Software ##############################################################
#
# not yet available as a module

# Environment #################################################################
#
export PICSRC=$HOME/src/picongpu
export PIC_EXAMPLES=$PICSRC/share/picongpu/examples
export PIC_BACKEND="cuda:70"

export PATH=$PATH:$PICSRC
export PATH=$PATH:$PICSRC/bin
export PATH=$PATH:$PICSRC/src/tools/bin

export PYTHONPATH=$PICSRC/lib/python:$PYTHONPATH

# "tbg" default options #######################################################
# currently the submit script, generated by tbg, needs to be streamed to bsub
export TBG_SUBMIT="echo 'manually execute: bsub < '"
Member

Does having export TBG_SUBMIT="bsub <" not work?

Member

I am not very familiar with the internals of tbg itself, so I do not know.

Member Author

@PrometheusPi PrometheusPi May 6, 2021

Good point 👍 - I have no access to test this. @danlevy100 could you test this? Or @psychocoderHPC could you comment whether this could work?

Member

IMO this line should be

export TBG_SUBMIT="bsub"

Maybe the line above is generating a valid example but I do not understand why you would do it instead of executing bsub directly.

Member Author

@psychocoderHPC We have to do that because, as far as we understand it, the admins of the wexac cluster prevent using the input file option and only allow "streaming" the script to bsub.
@danlevy100 Did that configuration change in the meantime?

@danlevy100 danlevy100 May 23, 2021

@PrometheusPi @psychocoderHPC No, the configuration did not change.
The only way I could get bsub to submit using a script file is by "bsub < submit.start".
There is some information here: https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=bsub-write-job-scripts

I had to change ~/src/picongpu/tbg in order to be able to submit with tbg, as mentioned in one of my comments in the #3496 thread.
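
The streaming constraint described in this comment can be sketched as a small wrapper. This is not the change made to tbg in #3496. It is a hypothetical helper, assuming the submit step passes the path of the generated job script as its first argument; `submit_streamed` and `BSUB_CMD` are names invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: feed a job script to bsub on stdin ("bsub < file")
# instead of passing the file name as an argument.
# BSUB_CMD is an assumed override for dry runs; it defaults to bsub.
submit_streamed() {
    job_script="$1"
    if [ ! -r "$job_script" ]; then
        echo "Error: job script '$job_script' is not readable" >&2
        return 1
    fi
    # Stream the script through stdin so that LSF parses the #BSUB lines.
    ${BSUB_CMD:-bsub} < "$job_script"
}
```

A dry run such as `BSUB_CMD=cat submit_streamed submit.start` prints the script instead of submitting it, which allows checking the substituted `#BSUB` lines first.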

Member Author

@PrometheusPi PrometheusPi May 25, 2021

@sbastrakov According to @danlevy100's post here, "bsub <" should not work. @danlevy100 How did you change tbg? (Could you provide a diff?)

export TBG_TPLFILE="etc/picongpu/wexac-weizmann/gpu.tpl"

# select the GPU queue PIConGPU should use:
# options are: (default will be gpu-short)
# | queue      | limits               |
# |------------|----------------------|
# | gpu-short  | 32 GPUs/6 hours max  |
# | gpu-medium | 24 GPUs/12 hours max |
# | gpu-long   | 16 GPUs/10 days max  |
export TBG_queue="gpu-short"

# select type of gpu PIConGPU should use:
# options are: (default will be V100)
# | GPU type | set         |
# |----------|-------------|
# | RTX-2000 | asus_hosts  |
# | RTX-6000 | hpe6k_hosts |
# | RTX-8000 | hpe8k_hosts |
# | V100     | dgx_hosts   |
export TBG_gpuType="dgx_hosts"


# Load autocompletion for PIConGPU commands
BASH_COMP_FILE=$PICSRC/bin/picongpu-completion.bash
if [ -f $BASH_COMP_FILE ] ; then
    source $BASH_COMP_FILE
else
    echo "bash completion file '$BASH_COMP_FILE' not found." >&2
fi
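
The variables exported in this profile reach the batch template through shell parameter expansion: `:-` supplies a fallback when a variable is unset or empty, and `:+` emits text only when a variable is set and non-empty. A minimal sketch of both forms follows; the function names are hypothetical, while the variable names and default values are taken from the files above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers showing the two expansion forms used in gpu.tpl.

# ":-" falls back to the default when TBG_gpuType is unset or empty.
pick_gpu_type() {
    printf '%s\n' "${TBG_gpuType:-dgx_hosts}"
}

# ":+" expands to the author option only when MY_NAME is set and non-empty,
# so no dangling --author flag is produced otherwise.
author_flag() {
    printf '%s\n' "${MY_NAME:+--author \"${MY_NAME}\"}"
}

TBG_gpuType="hpe8k_hosts"
pick_gpu_type    # prints hpe8k_hosts
unset TBG_gpuType
pick_gpu_type    # prints dgx_hosts
```

Sourcing the profile before calling tbg is therefore what lets a user's choice override the template defaults.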