Add wexac cluster #3613
@danlevy100 I think I used the wrong email address for you in the co-authorship. Could you please send me, via e-mail, the address you use for GitHub?
# "tbg" default options #######################################################
# currently the submit script, generated by tbg, needs to be streamed to bsub
export TBG_SUBMIT="echo 'manually execute: bsub < '"
Does having export TBG_SUBMIT="bsub <" not work?
I am not familiar with the internals of tbg itself, so I do not know.
Good point 👍 - I have no access to test this. @danlevy100 could you test this? Or @psychocoderHPC could you comment whether this could work?
IMO this line should be
export TBG_SUBMIT="bsub"
Maybe the line above produces a valid example, but I do not understand why you would do that instead of executing bsub directly.
@psychocoderHPC We have to do that because, as far as we understood, the admins of the wexac cluster prevent using the input file option and only allow "streaming" the script to bsub.
@danlevy100 Did that configuration change in the meantime?
@PrometheusPi @psychocoderHPC No, the configuration did not change.
The only way I could get bsub to submit using a script file is by "bsub < submit.start".
There is some information here: https://www.ibm.com/docs/en/spectrum-lsf/10.1.0?topic=bsub-write-job-scripts
I had to change ~/src/picongpu/tbg in order to be able to submit with tbg, as mentioned in one of my comments in the #3496 thread.
@sbastrakov According to @danlevy100's post here, "bsub <" should not work. @danlevy100 How did you change tbg? (Could you provide a diff?)
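One shell-level reason why putting "bsub <" into a variable may not behave like typing `bsub < submit.start` can be sketched as follows. This is a minimal illustration, not a description of how tbg invokes `TBG_SUBMIT` (that is not shown in this thread). `fake_bsub` is a hypothetical stand-in for bsub, and the sketch assumes the submit command is expanded from a variable without `eval`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for bsub: prints the arguments it received and
# the number of lines that arrived on stdin.
fake_bsub() {
    echo "args=[$*] stdin_lines=$(wc -l | tr -d ' ')"
}

# Temporary two-line job script used only for this demonstration.
job_file="$(mktemp)"
printf '#BSUB -J demo\necho hello\n' > "$job_file"

# 1) Redirection written directly on the command line: the shell opens
#    the file and the job script is streamed to the command on stdin.
streamed="$(fake_bsub < "$job_file")"

# 2) "<" stored inside a variable: after expansion it is an ordinary
#    word, so it is passed as an argument and nothing is redirected.
SUBMIT="fake_bsub <"
literal="$($SUBMIT "$job_file" < /dev/null)"

echo "$streamed"
echo "$literal"
rm -f "$job_file"
```

In the first call the stand-in sees no arguments and two lines on stdin. In the second it sees "<" and the file path as plain arguments and nothing on stdin. Under the stated assumption, this would explain why a plain `TBG_SUBMIT="bsub <"` cannot stream the submit file.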
645466d to 074aebd
Thanks @danlevy100 for sending your email to me. The co-authored commit is fixed.
etc/picongpu/wexac-weizmann/gpu.tpl (outdated)
.TBG_gpusPerNode=`if [ $TBG_tasks -gt $TBG_numHostedGPUPerNode ] ; then echo $TBG_numHostedGPUPerNode; else echo $TBG_tasks; fi`

# number of cores to block per GPU - we got 2 cpus per gpu
# and we will be accounted 2 CPUs per GPU anyway
The comment is wrong; it should be 7 CPUs per GPU, based on the variable below.
You are right, thanks for catching this.
fixed
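For readers following this thread, the clamp in the template excerpt and the per-GPU core count discussed here can be sketched in plain bash. This is only an illustration: `gpus_per_node`, `hosted_gpus` and `cores_per_gpu` are hypothetical names, the value 8 for hosted GPUs is an assumption, 7 is the figure named in the review, and multiplying the two to get cores per node is an assumption about how the template uses them.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the clamp in the template excerpt:
# never request more GPUs per node than the node hosts.
gpus_per_node() {
    local tasks="$1" hosted="$2"
    if [ "$tasks" -gt "$hosted" ]; then
        echo "$hosted"
    else
        echo "$tasks"
    fi
}

cores_per_gpu=7   # figure named in the review comment above
hosted_gpus=8     # assumed node size, for illustration only

for tasks in 2 8 16; do
    gpus="$(gpus_per_node "$tasks" "$hosted_gpus")"
    # Assumption: cores reserved per node scale with the GPUs used on it.
    echo "tasks=$tasks gpus_per_node=$gpus cores_per_node=$((gpus * cores_per_gpu))"
done
```

With these assumed values, 2 tasks use 2 GPUs and 14 cores on a node, while 8 or 16 tasks are both clamped to 8 GPUs and 56 cores per node.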
Co-authored-by: danlevy100 <danlevy100@gmail.com>
074aebd to 4bf059e
I switched this PR to
@danlevy100 or Sheroy Tata, are you interested in testing this?
This pull request concludes the discussion in issue #3496 on how to set up PIConGPU on the wexac cluster at the Weizmann Institute in Israel. Many thanks to @danlevy100 for the very valuable input. I copied various lines from your example tpl file and therefore added you as co-author.
@danlevy100 could you test both gpu.tpl and gpu_picongpu.example.profile?
Please note:
- Since tbg does not (yet) support the bsub streaming approach for submit files, tbg will print information on what to do next. (Or is there an option to do that, @psychocoderHPC?)
- Since there are various combinations of queues and GPU hardware, I avoided generating all MxN configurations via MxN *.tpl files. Instead, both options can be selected in the profile and are communicated to tbg.
run time test on wexac
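The idea of selecting the queue and GPU type once in the profile, instead of maintaining one template per combination, can be sketched as below. This is a hedged illustration only: `EXAMPLE_QUEUE`, `EXAMPLE_GPU_TYPE`, their default values and `render_header` are placeholders and are not the variable names or values used in this PR. The only real scheduler syntax used is the `-q` queue option of bsub.

```shell
#!/usr/bin/env bash
# Profile side (placeholder names and values, not those of this PR):
# the user picks a queue and a GPU type once.
export EXAMPLE_QUEUE="${EXAMPLE_QUEUE:-example-queue}"
export EXAMPLE_GPU_TYPE="${EXAMPLE_GPU_TYPE:-example-gpu}"

# Template side: a single generic job header is built from the exported
# selection, so no per-combination template file is needed.
render_header() {
    echo "#BSUB -q ${EXAMPLE_QUEUE}"
    echo "# requested GPU type: ${EXAMPLE_GPU_TYPE}"
}

render_header
```

Changing either exported value in the profile changes the rendered header without touching the template, which is the design choice described in the notes above.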